* [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support
@ 2015-02-03 11:45 Bin Meng
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 1/9] x86: Allow overriding TSC_FREQ_IN_MHZ Bin Meng
                   ` (8 more replies)
  0 siblings, 9 replies; 29+ messages in thread
From: Bin Meng @ 2015-02-03 11:45 UTC (permalink / raw)
  To: u-boot

This series is the second step of bare support for the Intel Quark
SoC, which can be validated on the Intel Galileo board. It adds the
Intel Quark Memory Reference Code (MRC), ported from the UEFI BIOS
that Intel released for Quark. As its name suggests, MRC performs the
memory initialization for Quark-based boards. With this patch series,
U-Boot boots to the command shell on the Intel Galileo gen2 board.

The MRC port mainly consisted of replacing all UEFI BIOS APIs with
the corresponding U-Boot ones (register access, timer, debug output,
etc.). However, most of the time was spent converting Intel's coding
conventions to U-Boot's: // comment blocks, space indentation, camel
case, struct typedefs, etc. It took me almost 4 days to finish the
work, and I have not managed to fix all checkpatch warnings given the
complexity of the code and the black magic in it. See the commit notes
in patches #3, #4 and #5 for details of the checkpatch warnings.

Currently MRC takes only the cold boot path, and it takes about
10 seconds from power-up to the U-Boot shell. Future work will add
a fast boot path (MRC cache), which is expected to boot much faster.
Together with the previous Quark patch series, this gives us basic
support for the Intel Quark SoC in good shape, on which additional
feature work can be based in the following weeks.


Bin Meng (9):
  x86: Allow overriding TSC_FREQ_IN_MHZ
  x86: quark: Bypass TSC calibration
  x86: quark: Add Memory Reference Code (MRC) main routines
  x86: quark: Add utility codes needed for MRC
  x86: quark: Add System Memory Controller support
  x86: quark: Enable the Memory Reference Code build
  fdtdec: Add compatible id and string for Intel Quark MRC
  dt-bindings: Add Intel Quark MRC bindings
  x86: quark: Call MRC in dram_init()

 arch/x86/Kconfig                      |   40 +-
 arch/x86/cpu/quark/Kconfig            |    5 +
 arch/x86/cpu/quark/Makefile           |    1 +
 arch/x86/cpu/quark/dram.c             |   97 +-
 arch/x86/cpu/quark/hte.c              |  398 +++++
 arch/x86/cpu/quark/hte.h              |   44 +
 arch/x86/cpu/quark/mrc.c              |  206 +++
 arch/x86/cpu/quark/mrc_util.c         | 1499 ++++++++++++++++++
 arch/x86/cpu/quark/mrc_util.h         |  153 ++
 arch/x86/cpu/quark/smc.c              | 2764 +++++++++++++++++++++++++++++++++
 arch/x86/cpu/quark/smc.h              |  446 ++++++
 arch/x86/dts/galileo.dts              |   25 +
 arch/x86/include/asm/arch-quark/mrc.h |  189 +++
 include/dt-bindings/mrc/quark.h       |   83 +
 include/fdtdec.h                      |    1 +
 lib/fdtdec.c                          |    1 +
 16 files changed, 5930 insertions(+), 22 deletions(-)
 create mode 100644 arch/x86/cpu/quark/hte.c
 create mode 100644 arch/x86/cpu/quark/hte.h
 create mode 100644 arch/x86/cpu/quark/mrc.c
 create mode 100644 arch/x86/cpu/quark/mrc_util.c
 create mode 100644 arch/x86/cpu/quark/mrc_util.h
 create mode 100644 arch/x86/cpu/quark/smc.c
 create mode 100644 arch/x86/cpu/quark/smc.h
 create mode 100644 arch/x86/include/asm/arch-quark/mrc.h
 create mode 100644 include/dt-bindings/mrc/quark.h

-- 
1.8.2.1


* [U-Boot] [RFC PATCH 1/9] x86: Allow overriding TSC_FREQ_IN_MHZ
  2015-02-03 11:45 [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support Bin Meng
@ 2015-02-03 11:45 ` Bin Meng
  2015-02-04 16:24   ` Simon Glass
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 2/9] x86: quark: Bypass TSC calibration Bin Meng
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-03 11:45 UTC (permalink / raw)
  To: u-boot

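Allow the value of TSC_FREQ_IN_MHZ to be overridden by the one in
arch/x86/cpu/<xxx>/Kconfig. To do so, move TSC_CALIBRATION_BYPASS and
TSC_FREQ_IN_MHZ below the lines that source the CPU Kconfig files, so
that a default defined there is seen first.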

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
---

 arch/x86/Kconfig | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 85dda2e..2370c32 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -348,26 +348,6 @@ config FRAMEBUFFER_VESA_MODE
 
 endmenu
 
-config TSC_CALIBRATION_BYPASS
-	bool "Bypass Time-Stamp Counter (TSC) calibration"
-	default n
-	help
-	  By default U-Boot automatically calibrates Time-Stamp Counter (TSC)
-	  running frequency via Model-Specific Register (MSR) and Programmable
-	  Interval Timer (PIT). If the calibration does not work on your board,
-	  select this option and provide a hardcoded TSC running frequency with
-	  CONFIG_TSC_FREQ_IN_MHZ below.
-
-	  Normally this option should be turned on in a simulation environment
-	  like qemu.
-
-config TSC_FREQ_IN_MHZ
-	int "Time-Stamp Counter (TSC) running frequency in MHz"
-	depends on TSC_CALIBRATION_BYPASS
-	default 1000
-	help
-	  The running frequency in MHz of Time-Stamp Counter (TSC).
-
 config HAVE_FSP
 	bool "Add an Firmware Support Package binary"
 	help
@@ -416,6 +396,26 @@ source "arch/x86/cpu/quark/Kconfig"
 
 source "arch/x86/cpu/queensbay/Kconfig"
 
+config TSC_CALIBRATION_BYPASS
+	bool "Bypass Time-Stamp Counter (TSC) calibration"
+	default n
+	help
+	  By default U-Boot automatically calibrates Time-Stamp Counter (TSC)
+	  running frequency via Model-Specific Register (MSR) and Programmable
+	  Interval Timer (PIT). If the calibration does not work on your board,
+	  select this option and provide a hardcoded TSC running frequency with
+	  CONFIG_TSC_FREQ_IN_MHZ below.
+
+	  Normally this option should be turned on in a simulation environment
+	  like qemu.
+
+config TSC_FREQ_IN_MHZ
+	int "Time-Stamp Counter (TSC) running frequency in MHz"
+	depends on TSC_CALIBRATION_BYPASS
+	default 1000
+	help
+	  The running frequency in MHz of Time-Stamp Counter (TSC).
+
 source "board/coreboot/coreboot/Kconfig"
 
 source "board/google/chromebook_link/Kconfig"
-- 
1.8.2.1


* [U-Boot] [RFC PATCH 2/9] x86: quark: Bypass TSC calibration
  2015-02-03 11:45 [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support Bin Meng
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 1/9] x86: Allow overriding TSC_FREQ_IN_MHZ Bin Meng
@ 2015-02-03 11:45 ` Bin Meng
  2015-02-04 16:24   ` Simon Glass
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 3/9] x86: quark: Add Memory Reference Code (MRC) main routines Bin Meng
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-03 11:45 UTC (permalink / raw)
  To: u-boot

For some unknown reason, the TSC calibration via PIT does not work on
Quark. Enable bypassing TSC calibration and override TSC_FREQ_IN_MHZ
to 400 per Quark datasheet in the Kconfig.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
---
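With calibration bypassed, time conversion reduces to arithmetic on a
fixed tick rate. The sketch below is only an illustration of that
arithmetic, not code from this patch: the helper names are made up,
and the macro simply mirrors the Kconfig default of 400 set here.

```c
#include <stdint.h>

/* Assumed value: mirrors the Kconfig default this patch sets for Quark */
#define TSC_FREQ_IN_MHZ	400

/* Hypothetical helper: convert a TSC tick delta to microseconds */
static inline uint64_t tsc_ticks_to_us(uint64_t ticks)
{
	/* At N MHz the counter advances N ticks per microsecond */
	return ticks / TSC_FREQ_IN_MHZ;
}

/* Hypothetical helper: number of TSC ticks that span 'us' microseconds */
static inline uint64_t us_to_tsc_ticks(uint64_t us)
{
	return us * TSC_FREQ_IN_MHZ;
}
```

For example, a 10 us delay corresponds to 4000 ticks at 400 MHz.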

 arch/x86/cpu/quark/Kconfig | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/cpu/quark/Kconfig b/arch/x86/cpu/quark/Kconfig
index 163caac..bc961ef 100644
--- a/arch/x86/cpu/quark/Kconfig
+++ b/arch/x86/cpu/quark/Kconfig
@@ -7,6 +7,7 @@
 config INTEL_QUARK
 	bool
 	select HAVE_RMU
+	select TSC_CALIBRATION_BYPASS
 
 if INTEL_QUARK
 
@@ -118,4 +119,8 @@ config SYS_CAR_SIZE
 	  Space in bytes in eSRAM used as Cache-As-ARM (CAR).
 	  Note this size must not exceed eSRAM's total size.
 
+config TSC_FREQ_IN_MHZ
+	int
+	default 400
+
 endif
-- 
1.8.2.1


* [U-Boot] [RFC PATCH 3/9] x86: quark: Add Memory Reference Code (MRC) main routines
  2015-02-03 11:45 [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support Bin Meng
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 1/9] x86: Allow overriding TSC_FREQ_IN_MHZ Bin Meng
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 2/9] x86: quark: Bypass TSC calibration Bin Meng
@ 2015-02-03 11:45 ` Bin Meng
  2015-02-04 16:24   ` Simon Glass
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 4/9] x86: quark: Add utility codes needed for MRC Bin Meng
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-03 11:45 UTC (permalink / raw)
  To: u-boot

Add the main routines for Quark Memory Reference Code (MRC).

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>

---
There are 24 checkpatch warnings in this patch, which are:

warning: arch/x86/cpu/quark/mrc.c,43: line over 80 characters
...

I have intentionally left them as is for now, as fixing these warnings
would make the MRC initialization table a little harder to read.
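
For readers new to the table-driven flow in mrc_init(), here is a
minimal standalone sketch of the same idea: each table entry carries
a mask of boot modes, and a step runs only when the current boot mode
is in that mask. The step functions, the struct name and run_steps()
are hypothetical stand-ins; only the BM_* values match mrc.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Boot modes as single-bit flags, same values as in mrc.h */
enum { BM_COLD = 1, BM_FAST = 2, BM_S3 = 4, BM_WARM = 8 };

struct step {
	uint16_t post_code;
	uint16_t boot_path;	/* OR of the boot modes that run this step */
	void (*fn)(int *ctx);
};

/* Hypothetical stand-ins for the real init functions */
static void step_a(int *ctx) { *ctx += 1; }
static void step_b(int *ctx) { *ctx += 10; }
static void step_c(int *ctx) { *ctx += 100; }

static const struct step steps[] = {
	{ 0x0101, BM_COLD | BM_FAST | BM_WARM | BM_S3, step_a },
	{ 0x0500, BM_COLD,                             step_b },
	{ 0x0106, BM_FAST | BM_WARM | BM_S3,           step_c },
};

/* Run every step whose mask contains boot_mode; return the context */
static int run_steps(uint32_t boot_mode)
{
	int ctx = 0;
	size_t i;

	for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++) {
		if (boot_mode & steps[i].boot_path)
			steps[i].fn(&ctx);
	}

	return ctx;
}
```

With this table a cold boot runs step_a and step_b, while the fast,
warm and S3 paths run step_a and step_c, which is how the real table
skips the training steps on non-cold boots.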

 arch/x86/cpu/quark/mrc.c              | 206 ++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/arch-quark/mrc.h | 189 +++++++++++++++++++++++++++++++
 2 files changed, 395 insertions(+)
 create mode 100644 arch/x86/cpu/quark/mrc.c
 create mode 100644 arch/x86/include/asm/arch-quark/mrc.h

diff --git a/arch/x86/cpu/quark/mrc.c b/arch/x86/cpu/quark/mrc.c
new file mode 100644
index 0000000..6a82519
--- /dev/null
+++ b/arch/x86/cpu/quark/mrc.c
@@ -0,0 +1,206 @@
+/*
+ * Copyright (C) 2013, Intel Corporation
+ * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
+ *
+ * Ported from Intel released Quark UEFI BIOS
+ * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
+ *
+ * SPDX-License-Identifier:	Intel
+ */
+
+/*
+ * This is the main Quark Memory Reference Code (MRC)
+ *
+ * These functions are generic and should work for any Quark based board.
+ *
+ * MRC requires two data structures to be passed in which are initialized by
+ * mrc_adjust_params().
+ *
+ * The basic flow is as follows:
+ * 01) Check for supported DDR speed configuration
+ * 02) Set up Memory Manager buffer as pass-through (POR)
+ * 03) Set Channel Interleaving Mode and Channel Stride to the most aggressive
+ *     setting possible
+ * 04) Set up the Memory Controller logic
+ * 05) Set up the DDR_PHY logic
+ * 06) Initialise the DRAMs (JEDEC)
+ * 07) Perform the Receive Enable Calibration algorithm
+ * 08) Perform the Write Leveling algorithm
+ * 09) Perform the Read Training algorithm (includes internal Vref)
+ * 10) Perform the Write Training algorithm
+ * 11) Set Channel Interleaving Mode and Channel Stride to the desired settings
+ *
+ * Dunit configuration based on Valleyview MRC.
+ */
+
+#include <common.h>
+#include <asm/arch/mrc.h>
+#include <asm/arch/msg_port.h>
+#include "mrc_util.h"
+#include "smc.h"
+
+static const struct mem_init init[] = {
+	{ 0x0101, BM_COLD | BM_FAST | BM_WARM | BM_S3, clear_self_refresh       },
+	{ 0x0200, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_ddr_timing_control  },
+	{ 0x0103, BM_COLD | BM_FAST                  , prog_decode_before_jedec },
+	{ 0x0104, BM_COLD | BM_FAST                  , perform_ddr_reset        },
+	{ 0x0300, BM_COLD | BM_FAST           | BM_S3, ddrphy_init              },
+	{ 0x0400, BM_COLD | BM_FAST                  , perform_jedec_init       },
+	{ 0x0105, BM_COLD | BM_FAST                  , set_ddr_init_complete    },
+	{ 0x0106,           BM_FAST | BM_WARM | BM_S3, restore_timings          },
+	{ 0x0106, BM_COLD                            , default_timings          },
+	{ 0x0500, BM_COLD                            , rcvn_cal                 },
+	{ 0x0600, BM_COLD                            , wr_level                 },
+	{ 0x0120, BM_COLD                            , prog_page_ctrl           },
+	{ 0x0700, BM_COLD                            , rd_train                 },
+	{ 0x0800, BM_COLD                            , wr_train                 },
+	{ 0x010B, BM_COLD                            , store_timings            },
+	{ 0x010C, BM_COLD | BM_FAST | BM_WARM | BM_S3, enable_scrambling        },
+	{ 0x010D, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_ddr_control         },
+	{ 0x010E, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_dra_drb             },
+	{ 0x010F,                     BM_WARM | BM_S3, perform_wake             },
+	{ 0x0110, BM_COLD | BM_FAST | BM_WARM | BM_S3, change_refresh_period    },
+	{ 0x0111, BM_COLD | BM_FAST | BM_WARM | BM_S3, set_auto_refresh         },
+	{ 0x0112, BM_COLD | BM_FAST | BM_WARM | BM_S3, ecc_enable               },
+	{ 0x0113, BM_COLD | BM_FAST                  , memory_test              },
+	{ 0x0114, BM_COLD | BM_FAST | BM_WARM | BM_S3, lock_registers           }
+};
+
+/* Adjust configuration parameters before initialization sequence */
+static void mrc_adjust_params(struct mrc_params *mrc_params)
+{
+	const struct dram_params *dram_params;
+	uint8_t dram_width;
+	uint32_t rank_enables;
+	uint32_t channel_width;
+
+	ENTERFN();
+
+	/* initially expect success */
+	mrc_params->status = MRC_SUCCESS;
+
+	dram_width = mrc_params->dram_width;
+	rank_enables = mrc_params->rank_enables;
+	channel_width = mrc_params->channel_width;
+
+	/*
+	 * Set up board layout (must be reviewed, as this selects static timings)
+	 * 0 == R0 (DDR3 x16), 1 == R1 (DDR3 x16),
+	 * 2 == DV (DDR3 x8), 3 == SV (DDR3 x8).
+	 */
+	if (dram_width == X8)
+		mrc_params->board_id = 2;	/* select x8 layout */
+	else
+		mrc_params->board_id = 0;	/* select x16 layout */
+
+	/* initially no memory */
+	mrc_params->mem_size = 0;
+
+	/* begin of channel settings */
+	dram_params = &mrc_params->params;
+
+	/*
+	 * Determine Column Bits:
+	 *
+	 * Column: 11 for 8Gbx8, else 10
+	 */
+	mrc_params->column_bits[0] =
+		((dram_params[0].density == 4) &&
+		(dram_width == X8)) ? (11) : (10);
+
+	/*
+	 * Determine Row Bits:
+	 *
+	 * 512Mbx16=12 512Mbx8=13
+	 * 1Gbx16=13   1Gbx8=14
+	 * 2Gbx16=14   2Gbx8=15
+	 * 4Gbx16=15   4Gbx8=16
+	 * 8Gbx16=16   8Gbx8=16
+	 */
+	mrc_params->row_bits[0] = 12 + (dram_params[0].density) +
+		(((dram_params[0].density < 4) &&
+		(dram_width == X8)) ? (1) : (0));
+
+	/*
+	 * Determine Per Channel Memory Size:
+	 *
+	 * (For 2 RANKs, multiply by 2)
+	 * (For 16 bit data bus, divide by 2)
+	 *
+	 * DENSITY WIDTH MEM_AVAILABLE
+	 * 512Mb   x16   0x008000000 ( 128MB)
+	 * 512Mb   x8    0x010000000 ( 256MB)
+	 * 1Gb     x16   0x010000000 ( 256MB)
+	 * 1Gb     x8    0x020000000 ( 512MB)
+	 * 2Gb     x16   0x020000000 ( 512MB)
+	 * 2Gb     x8    0x040000000 (1024MB)
+	 * 4Gb     x16   0x040000000 (1024MB)
+	 * 4Gb     x8    0x080000000 (2048MB)
+	 */
+	mrc_params->channel_size[0] = (1 << dram_params[0].density);
+	mrc_params->channel_size[0] *= (dram_width == X8) ? (2) : (1);
+	mrc_params->channel_size[0] *= (rank_enables == 0x3) ? (2) : (1);
+	mrc_params->channel_size[0] *= (channel_width == X16) ? (1) : (2);
+
+	/* Determine memory size (convert number of 64MB/512Mb units) */
+	mrc_params->mem_size += mrc_params->channel_size[0] << 26;
+
+	LEAVEFN();
+}
+
+static void mrc_init(struct mrc_params *mrc_params)
+{
+	int i;
+
+	ENTERFN();
+
+	DPF(D_INFO, "mrc_init build %s %s\n", __DATE__, __TIME__);
+
+	/* MRC started */
+	mrc_post_code(0x01, 0x00);
+
+	if (mrc_params->boot_mode != BM_COLD) {
+		if (mrc_params->ddr_speed != mrc_params->timings.ddr_speed) {
+			/* full training required as frequency changed */
+			mrc_params->boot_mode = BM_COLD;
+		}
+	}
+
+	for (i = 0; i < ARRAY_SIZE(init); i++) {
+		uint64_t my_tsc;
+
+		if (mrc_params->boot_mode & init[i].boot_path) {
+			uint8_t major = init[i].post_code >> 8 & 0xFF;
+			uint8_t minor = init[i].post_code >> 0 & 0xFF;
+			mrc_post_code(major, minor);
+
+			my_tsc = rdtsc();
+			init[i].init_fn(mrc_params);
+			DPF(D_TIME, "Execution time %llx", rdtsc() - my_tsc);
+		}
+	}
+
+	/* display the timings */
+	print_timings(mrc_params);
+
+	/* MRC complete */
+	mrc_post_code(0x01, 0xFF);
+
+	LEAVEFN();
+}
+
+void mrc(struct mrc_params *mrc_params)
+{
+	ENTERFN();
+
+	DPF(D_INFO, "MRC Version %04x %s %s\n",
+	    MRC_VERSION, __DATE__, __TIME__);
+
+	/* Set up the data structures used by mrc_init() */
+	mrc_adjust_params(mrc_params);
+
+	/* Initialize system memory */
+	mrc_init(mrc_params);
+
+	LEAVEFN();
+}
diff --git a/arch/x86/include/asm/arch-quark/mrc.h b/arch/x86/include/asm/arch-quark/mrc.h
new file mode 100644
index 0000000..690a800
--- /dev/null
+++ b/arch/x86/include/asm/arch-quark/mrc.h
@@ -0,0 +1,189 @@
+/*
+ * Copyright (C) 2013, Intel Corporation
+ * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
+ *
+ * Ported from Intel released Quark UEFI BIOS
+ * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
+ *
+ * SPDX-License-Identifier:	Intel
+ */
+
+#ifndef _MRC_H_
+#define _MRC_H_
+
+/* MRC Version */
+#define MRC_VERSION	0x0111
+
+/* architectural definitions */
+#define NUM_CHANNELS	1	/* number of channels */
+#define NUM_RANKS	2	/* number of ranks per channel */
+#define NUM_BYTE_LANES	4	/* number of byte lanes per channel */
+
+/* software limitations */
+#define MAX_CHANNELS	1
+#define MAX_RANKS	2
+#define MAX_BYTE_LANES	4
+
+/* only to mock MrcWrapper */
+#define MAX_SOCKETS	1
+#define MAX_SIDES	1
+#define MAX_ROWS	(MAX_SIDES * MAX_SOCKETS)
+
+/* Specify DRAM or memory channel width */
+enum {
+	X8,	/* DRAM width */
+	X16,	/* DRAM width & Channel Width */
+	X32	/* Channel Width */
+};
+
+/* Specify DRAM speed */
+enum {
+	DDRFREQ_800,
+	DDRFREQ_1066
+};
+
+/* Specify DRAM type */
+enum {
+	DDR3,
+	DDR3L
+};
+
+/*
+ * density: 0=512Mb, 1=1Gb, 2=2Gb, 3=4Gb
+ * cl is DRAM CAS Latency in clocks
+ * All other timings are in picoseconds
+ *
+ * Refer to JEDEC spec (or DRAM datasheet) when changing these values.
+ */
+struct dram_params {
+	uint8_t density;
+	/* CAS latency in clocks */
+	uint8_t cl;
+	/* ACT to PRE command period */
+	uint32_t ras;
+	/*
+	 * Delay from start of internal write transaction to
+	 * internal read command
+	 */
+	uint32_t wtr;
+	/* ACT to ACT command period (JESD79 specific to page size 1K/2K) */
+	uint32_t rrd;
+	/* Four activate window (JESD79 specific to page size 1K/2K) */
+	uint32_t faw;
+};
+
+/*
+ * Delay configuration for individual signals
+ * Vref setting
+ * Scrambler seed
+ */
+struct mrc_timings {
+	uint32_t rcvn[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
+	uint32_t rdqs[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
+	uint32_t wdqs[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
+	uint32_t wdq[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
+	uint32_t vref[NUM_CHANNELS][NUM_BYTE_LANES];
+	uint32_t wctl[NUM_CHANNELS][NUM_RANKS];
+	uint32_t wcmd[NUM_CHANNELS];
+	uint32_t scrambler_seed;
+	/* need to save for the case of frequency change */
+	uint8_t ddr_speed;
+};
+
+/* Boot mode defined as bit mask (1<<n) */
+enum {
+	BM_UNKNOWN,
+	BM_COLD = 1,	/* full training */
+	BM_FAST = 2,	/* restore timing parameters */
+	BM_S3   = 4,	/* resume from S3 */
+	BM_WARM = 8
+};
+
+/* MRC execution status */
+#define MRC_SUCCESS	0	/* initialization ok */
+#define MRC_E_MEMTEST	1	/* memtest failed */
+
+/* Input/output/context parameters for Memory Reference Code */
+struct mrc_params {
+	/* Global Settings */
+
+	/* BM_COLD, BM_FAST, BM_WARM, BM_S3 */
+	uint32_t boot_mode;
+	uint8_t first_run;
+
+	/* DRAM Parameters */
+
+	uint8_t dram_width;		/* x8, x16 */
+	uint8_t ddr_speed;		/* DDRFREQ_800, DDRFREQ_1066 */
+	uint8_t ddr_type;		/* DDR3, DDR3L */
+	uint8_t ecc_enables;		/* 0, 1 (memory size reduced to 7/8) */
+	uint8_t scrambling_enables;	/* 0, 1 */
+	/* 1, 3 (1st rank has to be populated if 2nd rank present) */
+	uint32_t rank_enables;
+	uint32_t channel_enables;	/* 1 only */
+	uint32_t channel_width;		/* x16 only */
+	/* 0, 1, 2 (mode 2 forced if ecc enabled) */
+	uint32_t address_mode;
+	/* REFRESH_RATE: 1=1.95us, 2=3.9us, 3=7.8us, others=RESERVED */
+	uint8_t refresh_rate;
+	/* SR_TEMP_RANGE: 0=normal, 1=extended, others=RESERVED */
+	uint8_t sr_temp_range;
+	/*
+	 * RON_VALUE: 0=34ohm, 1=40ohm, others=RESERVED
+	 * (select MRS1.DIC driver impedance control)
+	 */
+	uint8_t ron_value;
+	/* RTT_NOM_VALUE: 0=40ohm, 1=60ohm, 2=120ohm, others=RESERVED */
+	uint8_t rtt_nom_value;
+	/* RD_ODT_VALUE: 0=off, 1=60ohm, 2=120ohm, 3=180ohm, others=RESERVED */
+	uint8_t rd_odt_value;
+	struct dram_params params;
+
+	/* Internally Used */
+
+	/* internally used for board layout (use x8 or x16 memory) */
+	uint32_t board_id;
+	/* when set, HTE reconfiguration is requested */
+	uint32_t hte_setup:1;
+	uint32_t menu_after_mrc:1;
+	uint32_t power_down_disable:1;
+	uint32_t tune_rcvn:1;
+	uint32_t channel_size[NUM_CHANNELS];
+	uint32_t column_bits[NUM_CHANNELS];
+	uint32_t row_bits[NUM_CHANNELS];
+	/* register content saved during training */
+	uint32_t mrs1;
+
+	/* Output */
+
+	/* initialization result (non zero specifies error code) */
+	uint32_t status;
+	/* total memory size in bytes (excludes ECC banks) */
+	uint32_t mem_size;
+	/* training results (also used on input) */
+	struct mrc_timings timings;
+};
+
+struct mem_init {
+	uint16_t post_code;
+	uint16_t boot_path;
+	void (*init_fn)(struct mrc_params *mrc_params);
+};
+
+/* MRC platform data flags */
+#define MRC_FLAG_ECC_EN		0x00000001
+#define MRC_FLAG_SCRAMBLE_EN	0x00000002
+#define MRC_FLAG_MEMTEST_EN	0x00000004
+/* 0b DDR "fly-by" topology else 1b DDR "tree" topology */
+#define MRC_FLAG_TOP_TREE_EN	0x00000008
+/* If set ODT signal is asserted to DRAM devices on writes */
+#define MRC_FLAG_WR_ODT_EN	0x00000010
+
+/**
+ * mrc - Memory Reference Code entry routine
+ *
+ * @mrc_params: parameters for MRC
+ */
+void mrc(struct mrc_params *mrc_params);
+
+#endif /* _MRC_H_ */
-- 
1.8.2.1
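
As a reading aid, the geometry arithmetic in mrc_adjust_params() can
be restated as three small pure functions. This is a sketch under
assumptions, not part of the patch: the quark_* names are
hypothetical, density value 4 (8Gb) is inferred from the column-bit
comment in mrc.c, and the size is returned as 64 bits so the largest
combinations do not wrap.

```c
#include <stdint.h>

enum { X8, X16, X32 };	/* same ordering as in mrc.h */

/* density: 0=512Mb, 1=1Gb, 2=2Gb, 3=4Gb, 4=8Gb (assumed) */
static uint32_t quark_row_bits(uint8_t density, uint8_t dram_width)
{
	/* x8 parts need one more row bit than x16, except at 8Gb */
	return 12 + density + ((density < 4 && dram_width == X8) ? 1 : 0);
}

static uint32_t quark_column_bits(uint8_t density, uint8_t dram_width)
{
	/* 11 column bits only for 8Gb x8, otherwise 10 */
	return (density == 4 && dram_width == X8) ? 11 : 10;
}

/* Channel size in bytes, counted in 64MB (512Mb) units as in mrc.c */
static uint64_t quark_channel_bytes(uint8_t density, uint8_t dram_width,
				    uint32_t rank_enables,
				    uint32_t channel_width)
{
	uint64_t units = 1ULL << density;

	units *= (dram_width == X8) ? 2 : 1;	  /* x8: twice the devices */
	units *= (rank_enables == 0x3) ? 2 : 1;	  /* two ranks populated */
	units *= (channel_width == X16) ? 1 : 2;  /* wider than 16 bits */

	return units << 26;			  /* 64MB per unit */
}
```

For instance, 1Gb x8 devices on a single rank of an x16 channel give
4 units, that is 0x10000000 bytes (256MB).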


* [U-Boot] [RFC PATCH 4/9] x86: quark: Add utility codes needed for MRC
  2015-02-03 11:45 [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support Bin Meng
                   ` (2 preceding siblings ...)
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 3/9] x86: quark: Add Memory Reference Code (MRC) main routines Bin Meng
@ 2015-02-03 11:45 ` Bin Meng
  2015-02-04 16:24   ` Simon Glass
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 5/9] x86: quark: Add System Memory Controller support Bin Meng
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-03 11:45 UTC (permalink / raw)
  To: u-boot

Add various utility code needed for Quark MRC.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>

---
There are 12 checkpatch warnings in this patch, which are:

warning: arch/x86/cpu/quark/mrc_util.c,1446: Too many leading tabs - consider code refactoring
warning: arch/x86/cpu/quark/mrc_util.c,1450: line over 80 characters
...

Fixing 'Too many leading tabs ...' would be very dangerous, as I don't have
all the details on how Intel's MRC code interacts with the hardware.
Trying to refactor it may lead to non-working MRC code.
For the 'line over 80 characters' issue, we have to leave the lines as is
for now because of the 'Too many leading tabs ...' issue, sigh.
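
To make the four-pass sequence in hte_mem_init() easier to follow,
here is a software analogue of it on an ordinary buffer: write a
pattern, read it back, write the inverted pattern, read it back, and
stop at the first read pass that sees a mismatch. This is only an
illustration. The real test is run by the HTE hardware through
message port writes, and mem_test_passes() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* The two patterns named in the hte_mem_init() comment */
#define PATTERN		0x5AA55AA5u
#define PATTERN_INV	0xA55AA55Au

/*
 * Passes 0 and 2 write, passes 1 and 3 read back and compare.
 * Returns the mismatch count of the first failing read pass,
 * or 0 if all four passes complete cleanly.
 */
static size_t mem_test_passes(volatile uint32_t *buf, size_t words)
{
	size_t errors = 0;
	size_t i;
	int pass;

	for (pass = 0; pass < 4; pass++) {
		uint32_t pat = (pass < 2) ? PATTERN : PATTERN_INV;

		if ((pass % 2) == 0) {
			/* write pass */
			for (i = 0; i < words; i++)
				buf[i] = pat;
		} else {
			/* read pass: check for errors at the end */
			for (i = 0; i < words; i++) {
				if (buf[i] != pat)
					errors++;
			}
			if (errors)
				break;
		}
	}

	return errors;
}
```

On working memory the function returns 0 and leaves the buffer filled
with the inverted pattern.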

 arch/x86/cpu/quark/hte.c      |  398 +++++++++++
 arch/x86/cpu/quark/hte.h      |   44 ++
 arch/x86/cpu/quark/mrc_util.c | 1499 +++++++++++++++++++++++++++++++++++++++++
 arch/x86/cpu/quark/mrc_util.h |  153 +++++
 4 files changed, 2094 insertions(+)
 create mode 100644 arch/x86/cpu/quark/hte.c
 create mode 100644 arch/x86/cpu/quark/hte.h
 create mode 100644 arch/x86/cpu/quark/mrc_util.c
 create mode 100644 arch/x86/cpu/quark/mrc_util.h

diff --git a/arch/x86/cpu/quark/hte.c b/arch/x86/cpu/quark/hte.c
new file mode 100644
index 0000000..d813c9c
--- /dev/null
+++ b/arch/x86/cpu/quark/hte.c
@@ -0,0 +1,398 @@
+/*
+ * Copyright (C) 2013, Intel Corporation
+ * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
+ *
+ * Ported from Intel released Quark UEFI BIOS
+ * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
+ *
+ * SPDX-License-Identifier:	Intel
+ */
+
+#include <common.h>
+#include <asm/arch/mrc.h>
+#include <asm/arch/msg_port.h>
+#include "mrc_util.h"
+#include "hte.h"
+
+/**
+ * This function enables HTE to detect all possible errors for
+ * the given training parameters (per-bit or full byte lane).
+ */
+static void hte_enable_all_errors(void)
+{
+	msg_port_write(HTE, 0x000200A2, 0xFFFFFFFF);
+	msg_port_write(HTE, 0x000200A3, 0x000000FF);
+	msg_port_write(HTE, 0x000200A4, 0x00000000);
+}
+
+/**
+ * This function goes and reads the HTE register in order to find any error
+ *
+ * @return: The errors detected in the HTE status register
+ */
+static u32 hte_check_errors(void)
+{
+	return msg_port_read(HTE, 0x000200A7);
+}
+
+/**
+ * This function waits until HTE finishes
+ */
+static void hte_wait_for_complete(void)
+{
+	u32 tmp;
+
+	ENTERFN();
+
+	do {} while ((msg_port_read(HTE, 0x00020012) & BIT30) != 0);
+
+	tmp = msg_port_read(HTE, 0x00020011);
+	tmp |= BIT9;
+	tmp &= ~(BIT12 | BIT13);
+	msg_port_write(HTE, 0x00020011, tmp);
+
+	LEAVEFN();
+}
+
+/**
+ * This function clears registers related with errors in the HTE
+ */
+static void hte_clear_error_regs(void)
+{
+	u32 tmp;
+
+	/*
+	 * Clear all HTE errors and enable error checking
+	 * for burst and chunk.
+	 */
+	tmp = msg_port_read(HTE, 0x000200A1);
+	tmp |= BIT8;
+	msg_port_write(HTE, 0x000200A1, tmp);
+}
+
+/**
+ * This function executes basic single cache line memory write/read/verify
+ * test using simple constant pattern, different for READ_TRAIN and
+ * WRITE_TRAIN modes.
+ *
+ * See hte_basic_write_read() which is the externally visible wrapper.
+ *
+ * @mrc_params: host structure for all MRC global data
+ * @addr: memory address being tested (must hit specific channel/rank)
+ * @first_run: if set then hte registers are configured, otherwise it is
+ *             assumed configuration is done and just re-run the test
+ * @mode: READ_TRAIN or WRITE_TRAIN (the difference is in the pattern)
+ *
+ * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
+ */
+static u16 hte_basic_data_cmp(struct mrc_params *mrc_params, u32 addr,
+			      u8 first_run, u8 mode)
+{
+	u32 pattern;
+	u32 offset;
+
+	if (first_run) {
+		msg_port_write(HTE, 0x00020020, 0x01B10021);
+		msg_port_write(HTE, 0x00020021, 0x06000000);
+		msg_port_write(HTE, 0x00020022, addr >> 6);
+		msg_port_write(HTE, 0x00020062, 0x00800015);
+		msg_port_write(HTE, 0x00020063, 0xAAAAAAAA);
+		msg_port_write(HTE, 0x00020064, 0xCCCCCCCC);
+		msg_port_write(HTE, 0x00020065, 0xF0F0F0F0);
+		msg_port_write(HTE, 0x00020061, 0x00030008);
+
+		if (mode == WRITE_TRAIN)
+			pattern = 0xC33C0000;
+		else /* READ_TRAIN */
+			pattern = 0xAA5555AA;
+
+		for (offset = 0x80; offset <= 0x8F; offset++)
+			msg_port_write(HTE, offset, pattern);
+	}
+
+	msg_port_write(HTE, 0x000200A1, 0xFFFF1000);
+	msg_port_write(HTE, 0x00020011, 0x00011000);
+	msg_port_write(HTE, 0x00020011, 0x00011100);
+
+	hte_wait_for_complete();
+
+	/*
+	 * Return bits 15:8 of HTE_CH0_ERR_XSTAT to check for
+	 * any bytelane errors.
+	 */
+	return (hte_check_errors() >> 8) & 0xFF;
+}
+
+/**
+ * This function examines single cache line memory with write/read/verify
+ * test using multiple data patterns (victim-aggressor algorithm).
+ *
+ * See hte_write_stress_bit_lanes() which is the externally visible wrapper.
+ *
+ * @mrc_params: host structure for all MRC global data
+ * @addr: memory address being tested (must hit specific channel/rank)
+ * @loop_cnt: number of test iterations
+ * @seed_victim: victim data pattern seed
+ * @seed_aggressor: aggressor data pattern seed
+ * @victim_bit: should be 0 as auto rotate feature is in use
+ * @first_run: if set then hte registers are configured, otherwise it is
+ *             assumed configuration is done and just re-run the test
+ *
+ * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
+ */
+static u16 hte_rw_data_cmp(struct mrc_params *mrc_params, u32 addr,
+			   u8 loop_cnt, u32 seed_victim, u32 seed_aggressor,
+			   u8 victim_bit, u8 first_run)
+{
+	u32 offset;
+	u32 tmp;
+
+	if (first_run) {
+		msg_port_write(HTE, 0x00020020, 0x00910024);
+		msg_port_write(HTE, 0x00020023, 0x00810024);
+		msg_port_write(HTE, 0x00020021, 0x06070000);
+		msg_port_write(HTE, 0x00020024, 0x06070000);
+		msg_port_write(HTE, 0x00020022, addr >> 6);
+		msg_port_write(HTE, 0x00020025, addr >> 6);
+		msg_port_write(HTE, 0x00020062, 0x0000002A);
+		msg_port_write(HTE, 0x00020063, seed_victim);
+		msg_port_write(HTE, 0x00020064, seed_aggressor);
+		msg_port_write(HTE, 0x00020065, seed_victim);
+
+		/*
+		 * Write the pattern buffers to select the victim bit
+		 *
+		 * Start with bit0
+		 */
+		for (offset = 0x80; offset <= 0x8F; offset++) {
+			if ((offset % 8) == victim_bit)
+				msg_port_write(HTE, offset, 0x55555555);
+			else
+				msg_port_write(HTE, offset, 0xCCCCCCCC);
+		}
+
+		msg_port_write(HTE, 0x00020061, 0x00000000);
+		msg_port_write(HTE, 0x00020066, 0x03440000);
+		msg_port_write(HTE, 0x000200A1, 0xFFFF1000);
+	}
+
+	tmp = 0x10001000 | (loop_cnt << 16);
+	msg_port_write(HTE, 0x00020011, tmp);
+	msg_port_write(HTE, 0x00020011, tmp | BIT8);
+
+	hte_wait_for_complete();
+
+	/*
+	 * Return bits 15:8 of HTE_CH0_ERR_XSTAT to check for
+	 * any bytelane errors.
+	 */
+	return (hte_check_errors() >> 8) & 0xFF;
+}
+
+/**
+ * This function uses HW HTE engine to initialize or test all memory attached
+ * to a given DUNIT. If flag is MRC_MEM_INIT, this routine writes 0s to all
+ * memory locations to initialize ECC. If flag is MRC_MEM_TEST, this routine
+ * will send a 5AA55AA5 pattern to all memory locations on the RankMask and
+ * then read it back. Then it sends an A55AA55A pattern to all memory locations
+ * on the RankMask and reads it back.
+ *
+ * @mrc_params: host structure for all MRC global data
+ * @flag: MRC_MEM_INIT or MRC_MEM_TEST
+ *
+ * @return: errors register showing HTE failures. Also prints out which rank
+ *          failed the HTE test if failure occurs. For rank detection to work,
+ *          the address map must be left in its default state. If MRC changes
+ *          the address map, this function must be modified to change it back
+ *          to default at the beginning, then restore it at the end.
+ */
+u32 hte_mem_init(struct mrc_params *mrc_params, u8 flag)
+{
+	u32 offset;
+	int test_num;
+	int i;
+
+	/*
+	 * Clear out the error registers at the start of each memory
+	 * init or memory test run.
+	 */
+	hte_clear_error_regs();
+
+	msg_port_write(HTE, 0x00020062, 0x00000015);
+
+	for (offset = 0x80; offset <= 0x8F; offset++)
+		msg_port_write(HTE, offset, ((offset & 1) ? 0xA55A : 0x5AA5));
+
+	msg_port_write(HTE, 0x00020021, 0x00000000);
+	msg_port_write(HTE, 0x00020022, (mrc_params->mem_size >> 6) - 1);
+	msg_port_write(HTE, 0x00020063, 0xAAAAAAAA);
+	msg_port_write(HTE, 0x00020064, 0xCCCCCCCC);
+	msg_port_write(HTE, 0x00020065, 0xF0F0F0F0);
+	msg_port_write(HTE, 0x00020066, 0x03000000);
+
+	switch (flag) {
+	case MRC_MEM_INIT:
+		/*
+		 * Only 1 write pass through memory is needed
+		 * to initialize ECC
+		 */
+		test_num = 1;
+		break;
+	case MRC_MEM_TEST:
+		/* Write/read then write/read with inverted pattern */
+		test_num = 4;
+		break;
+	default:
+		DPF(D_INFO, "Unknown parameter for flag: %d\n", flag);
+		return 0xFFFFFFFF;
+	}
+
+	DPF(D_INFO, "hte_mem_init");
+
+	for (i = 0; i < test_num; i++) {
+		DPF(D_INFO, ".");
+
+		if (i == 0) {
+			msg_port_write(HTE, 0x00020061, 0x00000000);
+			msg_port_write(HTE, 0x00020020, 0x00110010);
+		} else if (i == 1) {
+			msg_port_write(HTE, 0x00020061, 0x00000000);
+			msg_port_write(HTE, 0x00020020, 0x00010010);
+		} else if (i == 2) {
+			msg_port_write(HTE, 0x00020061, 0x00010100);
+			msg_port_write(HTE, 0x00020020, 0x00110010);
+		} else {
+			msg_port_write(HTE, 0x00020061, 0x00010100);
+			msg_port_write(HTE, 0x00020020, 0x00010010);
+		}
+
+		msg_port_write(HTE, 0x00020011, 0x00111000);
+		msg_port_write(HTE, 0x00020011, 0x00111100);
+
+		hte_wait_for_complete();
+
+		/* If this is a READ pass, check for errors at the end */
+		if ((i % 2) == 1) {
+			/* Return immediately if error */
+			if (hte_check_errors())
+				break;
+		}
+	}
+
+	DPF(D_INFO, "done\n");
+
+	return hte_check_errors();
+}
+
+/**
+ * This function executes basic single cache line memory write/read/verify
+ * test using simple constant pattern, different for READ_TRAIN and
+ * WRITE_TRAIN modes.
+ *
+ * @mrc_params: host structure for all MRC global data
+ * @addr: memory address being tested (must hit specific channel/rank)
+ * @first_run: if set then hte registers are configured, otherwise it is
+ *             assumed configuration is done and just re-run the test
+ * @mode: READ_TRAIN or WRITE_TRAIN (the difference is in the pattern)
+ *
+ * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
+ */
+u16 hte_basic_write_read(struct mrc_params *mrc_params, u32 addr,
+			 u8 first_run, u8 mode)
+{
+	u16 errors;
+
+	ENTERFN();
+
+	/* Enable all error reporting in preparation for HTE test */
+	hte_enable_all_errors();
+	hte_clear_error_regs();
+
+	errors = hte_basic_data_cmp(mrc_params, addr, first_run, mode);
+
+	LEAVEFN();
+
+	return errors;
+}
+
+/**
+ * This function examines single cache line memory with write/read/verify
+ * test using multiple data patterns (victim-aggressor algorithm).
+ *
+ * @mrc_params: host structure for all MRC global data
+ * @addr: memory address being tested (must hit specific channel/rank)
+ * @first_run: if set then hte registers are configured, otherwise it is
+ *             assumed configuration is done and just re-run the test
+ *
+ * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
+ */
+u16 hte_write_stress_bit_lanes(struct mrc_params *mrc_params,
+			       u32 addr, u8 first_run)
+{
+	u16 errors;
+	u8 victim_bit = 0;
+
+	ENTERFN();
+
+	/* Enable all error reporting in preparation for HTE test */
+	hte_enable_all_errors();
+	hte_clear_error_regs();
+
+	/*
+	 * Loop through each bit in the byte lane.
+	 *
+	 * Each pass creates a victim bit while keeping all other bits the same
+	 * as aggressors. AVN HTE adds an auto-rotate feature which allows us
+	 * to program the entire victim/aggressor sequence in 1 step.
+	 *
+	 * The victim bit rotates on each pass so no need to have software
+	 * implement a victim bit loop like on VLV.
+	 */
+	errors = hte_rw_data_cmp(mrc_params, addr, HTE_LOOP_CNT,
+				 HTE_LFSR_VICTIM_SEED, HTE_LFSR_AGRESSOR_SEED,
+				 victim_bit, first_run);
+
+	LEAVEFN();
+
+	return errors;
+}
+
+/**
+ * This function executes a basic single cache line memory write or read.
+ * It is used only for receive enable / fine write levelling.
+ *
+ * @addr: memory address being tested (must hit specific channel/rank)
+ * @first_run: if set then hte registers are configured, otherwise it is
+ *             assumed configuration is done and just re-run the test
+ * @is_write: if non-zero a memory write is executed, otherwise a read
+ */
+void hte_mem_op(u32 addr, u8 first_run, u8 is_write)
+{
+	u32 offset;
+	u32 tmp;
+
+	hte_enable_all_errors();
+	hte_clear_error_regs();
+
+	if (first_run) {
+		tmp = is_write ? 0x01110021 : 0x01010021;
+		msg_port_write(HTE, 0x00020020, tmp);
+
+		msg_port_write(HTE, 0x00020021, 0x06000000);
+		msg_port_write(HTE, 0x00020022, addr >> 6);
+		msg_port_write(HTE, 0x00020062, 0x00800015);
+		msg_port_write(HTE, 0x00020063, 0xAAAAAAAA);
+		msg_port_write(HTE, 0x00020064, 0xCCCCCCCC);
+		msg_port_write(HTE, 0x00020065, 0xF0F0F0F0);
+		msg_port_write(HTE, 0x00020061, 0x00030008);
+
+		for (offset = 0x80; offset <= 0x8F; offset++)
+			msg_port_write(HTE, offset, 0xC33C0000);
+	}
+
+	msg_port_write(HTE, 0x000200A1, 0xFFFF1000);
+	msg_port_write(HTE, 0x00020011, 0x00011000);
+	msg_port_write(HTE, 0x00020011, 0x00011100);
+
+	hte_wait_for_complete();
+}
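
Note on the hte_mem_init() pass loop above: the four-way if/else chain
follows a regular schedule, which the standalone sketch below models. It
is an illustration only and not part of the patch. The names
hte_pass_count(), hte_pass_cfg() and struct hte_pass are hypothetical,
and reading 0x00110010 as the write command and 0x00010010 as the read
command is inferred from the "READ pass" comment and from hte_mem_op(),
not taken from a register specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same order as the enum in hte.h, repeated so the sketch stands alone */
enum { MRC_MEM_INIT, MRC_MEM_TEST };

/* Hypothetical helper type: one HTE pass as programmed by hte_mem_init() */
struct hte_pass {
	uint32_t reg_61;	/* value written to offset 0x00020061 */
	uint32_t reg_20;	/* value written to offset 0x00020020 */
	bool is_read;		/* odd passes are the ones checked for errors */
};

/* Number of passes for a flag, or 0 for an unknown flag */
static unsigned int hte_pass_count(int flag)
{
	switch (flag) {
	case MRC_MEM_INIT:
		return 1;	/* a single write pass initializes ECC */
	case MRC_MEM_TEST:
		return 4;	/* write/read, then write/read inverted */
	default:
		return 0;
	}
}

/* Register values for pass i (0..3), matching the if/else chain */
static struct hte_pass hte_pass_cfg(unsigned int i)
{
	struct hte_pass p;

	p.is_read = (i % 2) == 1;
	/* passes 2 and 3 use the second (inverted) pattern setting */
	p.reg_61 = (i < 2) ? 0x00000000 : 0x00010100;
	/* assumption: 0x00110010 selects write, 0x00010010 selects read */
	p.reg_20 = p.is_read ? 0x00010010 : 0x00110010;

	return p;
}
```

Calling hte_pass_cfg(i) for i in 0..hte_pass_count(flag) - 1 yields the
same register values, in the same order, as the loop in the patch.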
diff --git a/arch/x86/cpu/quark/hte.h b/arch/x86/cpu/quark/hte.h
new file mode 100644
index 0000000..3a173ea
--- /dev/null
+++ b/arch/x86/cpu/quark/hte.h
@@ -0,0 +1,44 @@
+/*
+ * Copyright (C) 2013, Intel Corporation
+ * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
+ *
+ * Ported from Intel released Quark UEFI BIOS
+ * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
+ *
+ * SPDX-License-Identifier:	Intel
+ */
+
+#ifndef _HTE_H_
+#define _HTE_H_
+
+enum {
+	MRC_MEM_INIT,
+	MRC_MEM_TEST
+};
+
+enum {
+	READ_TRAIN,
+	WRITE_TRAIN
+};
+
+/*
+ * EXP_LOOP_CNT field of HTE_CMD_CTL
+ *
+ * This CANNOT be less than 4!
+ */
+#define HTE_LOOP_CNT		5
+
+/* random seed for victim */
+#define HTE_LFSR_VICTIM_SEED	0xF294BA21
+
+/* random seed for aggressor */
+#define HTE_LFSR_AGRESSOR_SEED	0xEBA7492D
+
+u32 hte_mem_init(struct mrc_params *mrc_params, u8 flag);
+u16 hte_basic_write_read(struct mrc_params *mrc_params, u32 addr,
+			 u8 first_run, u8 mode);
+u16 hte_write_stress_bit_lanes(struct mrc_params *mrc_params,
+			       u32 addr, u8 first_run);
+void hte_mem_op(u32 addr, u8 first_run, u8 is_write);
+
+#endif /* _HTE_H_ */
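
Note on HTE_LOOP_CNT in hte.h above: the "CANNOT be less than 4"
constraint is only stated in a comment. A minimal sketch of enforcing it
at build time follows, assuming a C11 compiler is available. It is a
suggestion and not part of the patch, and the macro value is simply
copied from the header so the fragment stands alone.

```c
/* Value copied from hte.h */
#define HTE_LOOP_CNT		5

/* C11 compile-time check: the build fails if the count drops below 4 */
_Static_assert(HTE_LOOP_CNT >= 4, "HTE_LOOP_CNT must not be less than 4");
```

With a check of this kind, an out-of-range value is caught when the file
is compiled rather than during memory training.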
diff --git a/arch/x86/cpu/quark/mrc_util.c b/arch/x86/cpu/quark/mrc_util.c
new file mode 100644
index 0000000..1ae42d6
--- /dev/null
+++ b/arch/x86/cpu/quark/mrc_util.c
@@ -0,0 +1,1499 @@
+/*
+ * Copyright (C) 2013, Intel Corporation
+ * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
+ *
+ * Ported from Intel released Quark UEFI BIOS
+ * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
+ *
+ * SPDX-License-Identifier:	Intel
+ */
+
+#include <common.h>
+#include <asm/arch/device.h>
+#include <asm/arch/mrc.h>
+#include <asm/arch/msg_port.h>
+#include "mrc_util.h"
+#include "hte.h"
+#include "smc.h"
+
+static const uint8_t vref_codes[64] = {
+	/* lowest to highest */
+	0x3F, 0x3E, 0x3D, 0x3C, 0x3B, 0x3A, 0x39, 0x38,
+	0x37, 0x36, 0x35, 0x34, 0x33, 0x32, 0x31, 0x30,
+	0x2F, 0x2E, 0x2D, 0x2C, 0x2B, 0x2A, 0x29, 0x28,
+	0x27, 0x26, 0x25, 0x24, 0x23, 0x22, 0x21, 0x20,
+	0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+	0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
+	0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
+	0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F
+};
+
+void mrc_write_mask(u32 unit, u32 addr, u32 data, u32 mask)
+{
+	msg_port_write(unit, addr,
+		       (msg_port_read(unit, addr) & ~(mask)) |
+		       ((data) & (mask)));
+}
+
+void mrc_alt_write_mask(u32 unit, u32 addr, u32 data, u32 mask)
+{
+	msg_port_alt_write(unit, addr,
+			   (msg_port_alt_read(unit, addr) & ~(mask)) |
+			   ((data) & (mask)));
+}
+
+void mrc_post_code(uint8_t major, uint8_t minor)
+{
+	/* send message to UART */
+	DPF(D_INFO, "POST: 0x%01x%02x\n", major, minor);
+
+	/* error check */
+	if (major == 0xEE)
+		hang();
+}
+
+/* Delay number of nanoseconds */
+void delay_n(uint32_t ns)
+{
+	/* Convert ns to TSC ticks: ticks = (TSC MHz * ns) / 1000 */
+	uint64_t final_tsc = rdtsc();
+	final_tsc += ((get_tbclk_mhz() * ns) / 1000);
+
+	while (rdtsc() < final_tsc)
+		;
+}
+
+/* Delay number of microseconds */
+void delay_u(uint32_t us)
+{
+	/* 64-bit math is not an option, just use loops */
+	while (us--)
+		delay_n(1000);
+}
+
+/* Select Memory Manager as the source for PRI interface */
+void select_mem_mgr(void)
+{
+	u32 dco;
+
+	ENTERFN();
+
+	dco = msg_port_read(MEM_CTLR, DCO);
+	dco &= ~BIT28;
+	msg_port_write(MEM_CTLR, DCO, dco);
+
+	LEAVEFN();
+}
+
+/* Select HTE as the source for PRI interface */
+void select_hte(void)
+{
+	u32 dco;
+
+	ENTERFN();
+
+	dco = msg_port_read(MEM_CTLR, DCO);
+	dco |= BIT28;
+	msg_port_write(MEM_CTLR, DCO, dco);
+
+	LEAVEFN();
+}
+
+/*
+ * Send DRAM command
+ * data should be formatted using DCMD_Xxxx macro or emrsXCommand structure
+ */
+void dram_init_command(uint32_t data)
+{
+	pci_write_config_dword(QUARK_HOST_BRIDGE, MSG_DATA_REG, data);
+	pci_write_config_dword(QUARK_HOST_BRIDGE, MSG_CTRL_EXT_REG, 0);
+	msg_port_setup(MSG_OP_DRAM_INIT, MEM_CTLR, 0);
+
+	DPF(D_REGWR, "WR32 %03X %08X %08X\n", MEM_CTLR, 0, data);
+}
+
+/* Send DRAM wake command using special MCU side-band WAKE opcode */
+void dram_wake_command(void)
+{
+	ENTERFN();
+
+	msg_port_setup(MSG_OP_DRAM_WAKE, MEM_CTLR, 0);
+
+	LEAVEFN();
+}
+
+void training_message(uint8_t channel, uint8_t rank, uint8_t byte_lane)
+{
+	/* send message to UART */
+	DPF(D_INFO, "CH%01X RK%01X BL%01X\n", channel, rank, byte_lane);
+}
+
+/*
+ * This function will program the RCVEN delays
+ *
+ * (currently doesn't comprehend rank)
+ */
+void set_rcvn(uint8_t channel, uint8_t rank,
+	      uint8_t byte_lane, uint32_t pi_count)
+{
+	uint32_t reg;
+	uint32_t msk;
+	uint32_t temp;
+
+	ENTERFN();
+
+	DPF(D_TRN, "Rcvn ch%d rnk%d ln%d : pi=%03X\n",
+	    channel, rank, byte_lane, pi_count);
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * BL0 -> B01PTRCTL0[11:08] (0x0-0xF)
+	 * BL1 -> B01PTRCTL0[23:20] (0x0-0xF)
+	 */
+	reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET);
+	msk = (byte_lane & BIT0) ? (BIT23 | BIT22 | BIT21 | BIT20) :
+		(BIT11 | BIT10 | BIT9 | BIT8);
+	temp = (byte_lane & BIT0) ? ((pi_count / HALF_CLK) << 20) :
+		((pi_count / HALF_CLK) << 8);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* Adjust PI_COUNT */
+	pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * BL0 -> B0DLLPICODER0[29:24] (0x00-0x3F)
+	 * BL1 -> B1DLLPICODER0[29:24] (0x00-0x3F)
+	 */
+	reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
+	reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET));
+	msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24);
+	temp = pi_count << 24;
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/*
+	 * DEADBAND
+	 * BL0/1 -> B01DBCTL1[08/11] (+1 select)
+	 * BL0/1 -> B01DBCTL1[02/05] (enable)
+	 */
+	reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET);
+	msk = 0x00;
+	temp = 0x00;
+
+	/* enable */
+	msk |= (byte_lane & BIT0) ? (BIT5) : (BIT2);
+	if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
+		temp |= msk;
+
+	/* select */
+	msk |= (byte_lane & BIT0) ? (BIT11) : (BIT8);
+	if (pi_count < EARLY_DB)
+		temp |= msk;
+
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* error check */
+	if (pi_count > 0x3F) {
+		training_message(channel, rank, byte_lane);
+		mrc_post_code(0xEE, 0xE0);
+	}
+
+	LEAVEFN();
+}
+
+/*
+ * This function will return the current RCVEN delay on the given
+ * channel, rank, byte_lane as an absolute PI count.
+ *
+ * (currently doesn't comprehend rank)
+ */
+uint32_t get_rcvn(uint8_t channel, uint8_t rank, uint8_t byte_lane)
+{
+	uint32_t reg;
+	uint32_t temp;
+	uint32_t pi_count;
+
+	ENTERFN();
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * BL0 -> B01PTRCTL0[11:08] (0x0-0xF)
+	 * BL1 -> B01PTRCTL0[23:20] (0x0-0xF)
+	 */
+	reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET);
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= (byte_lane & BIT0) ? (20) : (8);
+	temp &= 0xF;
+
+	/* Adjust PI_COUNT */
+	pi_count = temp * HALF_CLK;
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * BL0 -> B0DLLPICODER0[29:24] (0x00-0x3F)
+	 * BL1 -> B1DLLPICODER0[29:24] (0x00-0x3F)
+	 */
+	reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
+	reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET));
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= 24;
+	temp &= 0x3F;
+
+	/* Adjust PI_COUNT */
+	pi_count += temp;
+
+	LEAVEFN();
+
+	return pi_count;
+}
+
+/*
+ * This function will program the RDQS delays based on an absolute
+ * amount of PIs.
+ *
+ * (currently doesn't comprehend rank)
+ */
+void set_rdqs(uint8_t channel, uint8_t rank,
+	      uint8_t byte_lane, uint32_t pi_count)
+{
+	uint32_t reg;
+	uint32_t msk;
+	uint32_t temp;
+
+	ENTERFN();
+	DPF(D_TRN, "Rdqs ch%d rnk%d ln%d : pi=%03X\n",
+	    channel, rank, byte_lane, pi_count);
+
+	/*
+	 * PI (1/128 MCLK)
+	 * BL0 -> B0RXDQSPICODE[06:00] (0x00-0x47)
+	 * BL1 -> B1RXDQSPICODE[06:00] (0x00-0x47)
+	 */
+	reg = (byte_lane & BIT0) ? (B1RXDQSPICODE) : (B0RXDQSPICODE);
+	reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET));
+	msk = (BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0);
+	temp = pi_count << 0;
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* error check (shouldn't go above 0x47) */
+	if (pi_count > 0x47) {
+		training_message(channel, rank, byte_lane);
+		mrc_post_code(0xEE, 0xE1);
+	}
+
+	LEAVEFN();
+}
+
+/*
+ * This function will return the current RDQS delay on the given
+ * channel, rank, byte_lane as an absolute PI count.
+ *
+ * (currently doesn't comprehend rank)
+ */
+uint32_t get_rdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane)
+{
+	uint32_t reg;
+	uint32_t temp;
+	uint32_t pi_count;
+
+	ENTERFN();
+
+	/*
+	 * PI (1/128 MCLK)
+	 * BL0 -> B0RXDQSPICODE[06:00] (0x00-0x47)
+	 * BL1 -> B1RXDQSPICODE[06:00] (0x00-0x47)
+	 */
+	reg = (byte_lane & BIT0) ? (B1RXDQSPICODE) : (B0RXDQSPICODE);
+	reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET));
+	temp = msg_port_alt_read(DDRPHY, reg);
+
+	/* Adjust PI_COUNT */
+	pi_count = temp & 0x7F;
+
+	LEAVEFN();
+
+	return pi_count;
+}
+
+/*
+ * This function will program the WDQS delays based on an absolute
+ * amount of PIs.
+ *
+ * (currently doesn't comprehend rank)
+ */
+void set_wdqs(uint8_t channel, uint8_t rank,
+	      uint8_t byte_lane, uint32_t pi_count)
+{
+	uint32_t reg;
+	uint32_t msk;
+	uint32_t temp;
+
+	ENTERFN();
+
+	DPF(D_TRN, "Wdqs ch%d rnk%d ln%d : pi=%03X\n",
+	    channel, rank, byte_lane, pi_count);
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * BL0 -> B01PTRCTL0[07:04] (0x0-0xF)
+	 * BL1 -> B01PTRCTL0[19:16] (0x0-0xF)
+	 */
+	reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET);
+	msk = (byte_lane & BIT0) ? (BIT19 | BIT18 | BIT17 | BIT16) :
+		(BIT7 | BIT6 | BIT5 | BIT4);
+	temp = pi_count / HALF_CLK;
+	temp <<= (byte_lane & BIT0) ? (16) : (4);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* Adjust PI_COUNT */
+	pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * BL0 -> B0DLLPICODER0[21:16] (0x00-0x3F)
+	 * BL1 -> B1DLLPICODER0[21:16] (0x00-0x3F)
+	 */
+	reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
+	reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET));
+	msk = (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16);
+	temp = pi_count << 16;
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/*
+	 * DEADBAND
+	 * BL0/1 -> B01DBCTL1[07/10] (+1 select)
+	 * BL0/1 -> B01DBCTL1[01/04] (enable)
+	 */
+	reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET);
+	msk = 0x00;
+	temp = 0x00;
+
+	/* enable */
+	msk |= (byte_lane & BIT0) ? (BIT4) : (BIT1);
+	if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
+		temp |= msk;
+
+	/* select */
+	msk |= (byte_lane & BIT0) ? (BIT10) : (BIT7);
+	if (pi_count < EARLY_DB)
+		temp |= msk;
+
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* error check */
+	if (pi_count > 0x3F) {
+		training_message(channel, rank, byte_lane);
+		mrc_post_code(0xEE, 0xE2);
+	}
+
+	LEAVEFN();
+}
+
+/*
+ * This function will return the amount of WDQS delay on the given
+ * channel, rank, byte_lane as an absolute PI count.
+ *
+ * (currently doesn't comprehend rank)
+ */
+uint32_t get_wdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane)
+{
+	uint32_t reg;
+	uint32_t temp;
+	uint32_t pi_count;
+
+	ENTERFN();
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * BL0 -> B01PTRCTL0[07:04] (0x0-0xF)
+	 * BL1 -> B01PTRCTL0[19:16] (0x0-0xF)
+	 */
+	reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET);
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= (byte_lane & BIT0) ? (16) : (4);
+	temp &= 0xF;
+
+	/* Adjust PI_COUNT */
+	pi_count = (temp * HALF_CLK);
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * BL0 -> B0DLLPICODER0[21:16] (0x00-0x3F)
+	 * BL1 -> B1DLLPICODER0[21:16] (0x00-0x3F)
+	 */
+	reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
+	reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET));
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= 16;
+	temp &= 0x3F;
+
+	/* Adjust PI_COUNT */
+	pi_count += temp;
+
+	LEAVEFN();
+
+	return pi_count;
+}
+
+/*
+ * This function will program the WDQ delays based on an absolute
+ * number of PIs.
+ *
+ * (currently doesn't comprehend rank)
+ */
+void set_wdq(uint8_t channel, uint8_t rank,
+	     uint8_t byte_lane, uint32_t pi_count)
+{
+	uint32_t reg;
+	uint32_t msk;
+	uint32_t temp;
+
+	ENTERFN();
+
+	DPF(D_TRN, "Wdq ch%d rnk%d ln%d : pi=%03X\n",
+	    channel, rank, byte_lane, pi_count);
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * BL0 -> B01PTRCTL0[03:00] (0x0-0xF)
+	 * BL1 -> B01PTRCTL0[15:12] (0x0-0xF)
+	 */
+	reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET);
+	msk = (byte_lane & BIT0) ? (BIT15 | BIT14 | BIT13 | BIT12) :
+		(BIT3 | BIT2 | BIT1 | BIT0);
+	temp = pi_count / HALF_CLK;
+	temp <<= (byte_lane & BIT0) ? (12) : (0);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* Adjust PI_COUNT */
+	pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * BL0 -> B0DLLPICODER0[13:08] (0x00-0x3F)
+	 * BL1 -> B1DLLPICODER0[13:08] (0x00-0x3F)
+	 */
+	reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
+	reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET));
+	msk = (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8);
+	temp = pi_count << 8;
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/*
+	 * DEADBAND
+	 * BL0/1 -> B01DBCTL1[06/09] (+1 select)
+	 * BL0/1 -> B01DBCTL1[00/03] (enable)
+	 */
+	reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET);
+	msk = 0x00;
+	temp = 0x00;
+
+	/* enable */
+	msk |= (byte_lane & BIT0) ? (BIT3) : (BIT0);
+	if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
+		temp |= msk;
+
+	/* select */
+	msk |= (byte_lane & BIT0) ? (BIT9) : (BIT6);
+	if (pi_count < EARLY_DB)
+		temp |= msk;
+
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* error check */
+	if (pi_count > 0x3F) {
+		training_message(channel, rank, byte_lane);
+		mrc_post_code(0xEE, 0xE3);
+	}
+
+	LEAVEFN();
+}
+
+/*
+ * This function will return the amount of WDQ delay on the given
+ * channel, rank, byte_lane as an absolute PI count.
+ *
+ * (currently doesn't comprehend rank)
+ */
+uint32_t get_wdq(uint8_t channel, uint8_t rank, uint8_t byte_lane)
+{
+	uint32_t reg;
+	uint32_t temp;
+	uint32_t pi_count;
+
+	ENTERFN();
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * BL0 -> B01PTRCTL0[03:00] (0x0-0xF)
+	 * BL1 -> B01PTRCTL0[15:12] (0x0-0xF)
+	 */
+	reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET);
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= (byte_lane & BIT0) ? (12) : (0);
+	temp &= 0xF;
+
+	/* Adjust PI_COUNT */
+	pi_count = (temp * HALF_CLK);
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * BL0 -> B0DLLPICODER0[13:08] (0x00-0x3F)
+	 * BL1 -> B1DLLPICODER0[13:08] (0x00-0x3F)
+	 */
+	reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
+	reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
+		(channel * DDRIODQ_CH_OFFSET));
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= 8;
+	temp &= 0x3F;
+
+	/* Adjust PI_COUNT */
+	pi_count += temp;
+
+	LEAVEFN();
+
+	return pi_count;
+}
+
+/*
+ * This function will program the WCMD delays based on an absolute
+ * number of PIs.
+ */
+void set_wcmd(uint8_t channel, uint32_t pi_count)
+{
+	uint32_t reg;
+	uint32_t msk;
+	uint32_t temp;
+
+	ENTERFN();
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * CMDPTRREG[11:08] (0x0-0xF)
+	 */
+	reg = CMDPTRREG + (channel * DDRIOCCC_CH_OFFSET);
+	msk = (BIT11 | BIT10 | BIT9 | BIT8);
+	temp = pi_count / HALF_CLK;
+	temp <<= 8;
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* Adjust PI_COUNT */
+	pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * CMDDLLPICODER0[29:24] -> CMDSLICE R3 (unused)
+	 * CMDDLLPICODER0[21:16] -> CMDSLICE L3 (unused)
+	 * CMDDLLPICODER0[13:08] -> CMDSLICE R2 (unused)
+	 * CMDDLLPICODER0[05:00] -> CMDSLICE L2 (unused)
+	 * CMDDLLPICODER1[29:24] -> CMDSLICE R1 (unused)
+	 * CMDDLLPICODER1[21:16] -> CMDSLICE L1 (0x00-0x3F)
+	 * CMDDLLPICODER1[13:08] -> CMDSLICE R0 (unused)
+	 * CMDDLLPICODER1[05:00] -> CMDSLICE L0 (unused)
+	 */
+	reg = CMDDLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET);
+
+	msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24 |
+		BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
+		BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
+		BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0);
+
+	temp = (pi_count << 24) | (pi_count << 16) |
+		(pi_count << 8) | (pi_count << 0);
+
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+	reg = CMDDLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET);	/* PO */
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/*
+	 * DEADBAND
+	 * CMDCFGREG0[17] (+1 select)
+	 * CMDCFGREG0[16] (enable)
+	 */
+	reg = CMDCFGREG0 + (channel * DDRIOCCC_CH_OFFSET);
+	msk = 0x00;
+	temp = 0x00;
+
+	/* enable */
+	msk |= BIT16;
+	if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
+		temp |= msk;
+
+	/* select */
+	msk |= BIT17;
+	if (pi_count < EARLY_DB)
+		temp |= msk;
+
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* error check */
+	if (pi_count > 0x3F)
+		mrc_post_code(0xEE, 0xE4);
+
+	LEAVEFN();
+}
+
+/*
+ * This function will return the amount of WCMD delay on the given
+ * channel as an absolute PI count.
+ */
+uint32_t get_wcmd(uint8_t channel)
+{
+	uint32_t reg;
+	uint32_t temp;
+	uint32_t pi_count;
+
+	ENTERFN();
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * CMDPTRREG[11:08] (0x0-0xF)
+	 */
+	reg = CMDPTRREG + (channel * DDRIOCCC_CH_OFFSET);
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= 8;
+	temp &= 0xF;
+
+	/* Adjust PI_COUNT */
+	pi_count = temp * HALF_CLK;
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * CMDDLLPICODER0[29:24] -> CMDSLICE R3 (unused)
+	 * CMDDLLPICODER0[21:16] -> CMDSLICE L3 (unused)
+	 * CMDDLLPICODER0[13:08] -> CMDSLICE R2 (unused)
+	 * CMDDLLPICODER0[05:00] -> CMDSLICE L2 (unused)
+	 * CMDDLLPICODER1[29:24] -> CMDSLICE R1 (unused)
+	 * CMDDLLPICODER1[21:16] -> CMDSLICE L1 (0x00-0x3F)
+	 * CMDDLLPICODER1[13:08] -> CMDSLICE R0 (unused)
+	 * CMDDLLPICODER1[05:00] -> CMDSLICE L0 (unused)
+	 */
+	reg = CMDDLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET);
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= 16;
+	temp &= 0x3F;
+
+	/* Adjust PI_COUNT */
+	pi_count += temp;
+
+	LEAVEFN();
+
+	return pi_count;
+}
+
+/*
+ * This function will program the WCLK delays based on an absolute
+ * number of PIs.
+ */
+void set_wclk(uint8_t channel, uint8_t rank, uint32_t pi_count)
+{
+	uint32_t reg;
+	uint32_t msk;
+	uint32_t temp;
+
+	ENTERFN();
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * CCPTRREG[15:12] -> CLK1 (0x0-0xF)
+	 * CCPTRREG[11:08] -> CLK0 (0x0-0xF)
+	 */
+	reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
+	msk = (BIT15 | BIT14 | BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8);
+	temp = ((pi_count / HALF_CLK) << 12) | ((pi_count / HALF_CLK) << 8);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* Adjust PI_COUNT */
+	pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * ECCB1DLLPICODER0[13:08] -> CLK0 (0x00-0x3F)
+	 * ECCB1DLLPICODER0[21:16] -> CLK1 (0x00-0x3F)
+	 */
+	reg = (rank) ? (ECCB1DLLPICODER0) : (ECCB1DLLPICODER0);
+	reg += (channel * DDRIOCCC_CH_OFFSET);
+	msk = (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
+		BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8);
+	temp = (pi_count << 16) | (pi_count << 8);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+	reg = (rank) ? (ECCB1DLLPICODER1) : (ECCB1DLLPICODER1);
+	reg += (channel * DDRIOCCC_CH_OFFSET);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+	reg = (rank) ? (ECCB1DLLPICODER2) : (ECCB1DLLPICODER2);
+	reg += (channel * DDRIOCCC_CH_OFFSET);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+	reg = (rank) ? (ECCB1DLLPICODER3) : (ECCB1DLLPICODER3);
+	reg += (channel * DDRIOCCC_CH_OFFSET);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/*
+	 * DEADBAND
+	 * CCCFGREG1[11:08] (+1 select)
+	 * CCCFGREG1[03:00] (enable)
+	 */
+	reg = CCCFGREG1 + (channel * DDRIOCCC_CH_OFFSET);
+	msk = 0x00;
+	temp = 0x00;
+
+	/* enable */
+	msk |= (BIT3 | BIT2 | BIT1 | BIT0);
+	if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
+		temp |= msk;
+
+	/* select */
+	msk |= (BIT11 | BIT10 | BIT9 | BIT8);
+	if (pi_count < EARLY_DB)
+		temp |= msk;
+
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* error check */
+	if (pi_count > 0x3F)
+		mrc_post_code(0xEE, 0xE5);
+
+	LEAVEFN();
+}
+
+/*
+ * This function will return the amount of WCLK delay on the given
+ * channel, rank as an absolute PI count.
+ */
+uint32_t get_wclk(uint8_t channel, uint8_t rank)
+{
+	uint32_t reg;
+	uint32_t temp;
+	uint32_t pi_count;
+
+	ENTERFN();
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * CCPTRREG[15:12] -> CLK1 (0x0-0xF)
+	 * CCPTRREG[11:08] -> CLK0 (0x0-0xF)
+	 */
+	reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= (rank) ? (12) : (8);
+	temp &= 0xF;
+
+	/* Adjust PI_COUNT */
+	pi_count = temp * HALF_CLK;
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * ECCB1DLLPICODER0[13:08] -> CLK0 (0x00-0x3F)
+	 * ECCB1DLLPICODER0[21:16] -> CLK1 (0x00-0x3F)
+	 */
+	reg = (rank) ? (ECCB1DLLPICODER0) : (ECCB1DLLPICODER0);
+	reg += (channel * DDRIOCCC_CH_OFFSET);
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= (rank) ? (16) : (8);
+	temp &= 0x3F;
+
+	pi_count += temp;
+
+	LEAVEFN();
+
+	return pi_count;
+}
+
+/*
+ * This function will program the WCTL delays based on an absolute
+ * number of PIs.
+ *
+ * (currently doesn't comprehend rank)
+ */
+void set_wctl(uint8_t channel, uint8_t rank, uint32_t pi_count)
+{
+	uint32_t reg;
+	uint32_t msk;
+	uint32_t temp;
+
+	ENTERFN();
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * CCPTRREG[31:28] (0x0-0xF)
+	 * CCPTRREG[27:24] (0x0-0xF)
+	 */
+	reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
+	msk = (BIT31 | BIT30 | BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24);
+	temp = ((pi_count / HALF_CLK) << 28) | ((pi_count / HALF_CLK) << 24);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* Adjust PI_COUNT */
+	pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
+	 * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
+	 */
+	reg = ECCB1DLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET);
+	msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24);
+	temp = (pi_count << 24);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+	reg = ECCB1DLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+	reg = ECCB1DLLPICODER2 + (channel * DDRIOCCC_CH_OFFSET);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+	reg = ECCB1DLLPICODER3 + (channel * DDRIOCCC_CH_OFFSET);
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/*
+	 * DEADBAND
+	 * CCCFGREG1[13:12] (+1 select)
+	 * CCCFGREG1[05:04] (enable)
+	 */
+	reg = CCCFGREG1 + (channel * DDRIOCCC_CH_OFFSET);
+	msk = 0x00;
+	temp = 0x00;
+
+	/* enable */
+	msk |= (BIT5 | BIT4);
+	if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
+		temp |= msk;
+
+	/* select */
+	msk |= (BIT13 | BIT12);
+	if (pi_count < EARLY_DB)
+		temp |= msk;
+
+	mrc_alt_write_mask(DDRPHY, reg, temp, msk);
+
+	/* error check */
+	if (pi_count > 0x3F)
+		mrc_post_code(0xEE, 0xE6);
+
+	LEAVEFN();
+}
+
+/*
+ * This function will return the amount of WCTL delay on the given
+ * channel, rank as an absolute PI count.
+ *
+ * (currently doesn't comprehend rank)
+ */
+uint32_t get_wctl(uint8_t channel, uint8_t rank)
+{
+	uint32_t reg;
+	uint32_t temp;
+	uint32_t pi_count;
+
+	ENTERFN();
+
+	/*
+	 * RDPTR (1/2 MCLK, 64 PIs)
+	 * CCPTRREG[31:28] (0x0-0xF)
+	 * CCPTRREG[27:24] (0x0-0xF)
+	 */
+	reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= 24;
+	temp &= 0xF;
+
+	/* Adjust PI_COUNT */
+	pi_count = temp * HALF_CLK;
+
+	/*
+	 * PI (1/64 MCLK, 1 PIs)
+	 * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
+	 * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
+	 */
+	reg = ECCB1DLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET);
+	temp = msg_port_alt_read(DDRPHY, reg);
+	temp >>= 24;
+	temp &= 0x3F;
+
+	/* Adjust PI_COUNT */
+	pi_count += temp;
+
+	LEAVEFN();
+
+	return pi_count;
+}
+
+/*
+ * This function will program the internal Vref setting in a given
+ * byte lane in a given channel.
+ */
+void set_vref(uint8_t channel, uint8_t byte_lane, uint32_t setting)
+{
+	uint32_t reg = (byte_lane & 0x1) ? (B1VREFCTL) : (B0VREFCTL);
+
+	ENTERFN();
+
+	DPF(D_TRN, "Vref ch%d ln%d : val=%03X\n",
+	    channel, byte_lane, setting);
+
+	mrc_alt_write_mask(DDRPHY, (reg + (channel * DDRIODQ_CH_OFFSET) +
+		((byte_lane >> 1) * DDRIODQ_BL_OFFSET)),
+		(vref_codes[setting] << 2),
+		(BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2));
+
+	/*
+	 * need to wait ~300ns for Vref to settle
+	 * (check that this is necessary)
+	 */
+	delay_n(300);
+
+	/* ??? may need to clear pointers ??? */
+
+	LEAVEFN();
+}
+
+/*
+ * This function will return the internal Vref setting for the given
+ * channel, byte_lane.
+ */
+uint32_t get_vref(uint8_t channel, uint8_t byte_lane)
+{
+	uint8_t j;
+	uint32_t ret_val = sizeof(vref_codes) / 2;
+	uint32_t reg = (byte_lane & 0x1) ? (B1VREFCTL) : (B0VREFCTL);
+	uint32_t temp;
+
+	ENTERFN();
+
+	temp = msg_port_alt_read(DDRPHY, (reg + (channel * DDRIODQ_CH_OFFSET) +
+		((byte_lane >> 1) * DDRIODQ_BL_OFFSET)));
+	temp >>= 2;
+	temp &= 0x3F;
+
+	for (j = 0; j < sizeof(vref_codes); j++) {
+		if (vref_codes[j] == temp) {
+			ret_val = j;
+			break;
+		}
+	}
+
+	LEAVEFN();
+
+	return ret_val;
+}
+
+/*
+ * This function will return a 32 bit address in the desired
+ * channel and rank.
+ */
+uint32_t get_addr(uint8_t channel, uint8_t rank)
+{
+	uint32_t offset = 0x02000000;	/* 32MB */
+
+	/* Begin product specific code */
+	if (channel > 0) {
+		DPF(D_ERROR, "ILLEGAL CHANNEL\n");
+		DEAD_LOOP();
+	}
+
+	if (rank > 1) {
+		DPF(D_ERROR, "ILLEGAL RANK\n");
+		DEAD_LOOP();
+	}
+
+	/* use 256MB lowest density as per DRP == 0x0003 */
+	offset += rank * (256 * 1024 * 1024);
+
+	return offset;
+}
+
+/*
+ * This function will sample the DQTRAINSTS registers in the given
+ * channel/rank SAMPLE_SIZE times looking for a valid '0' or '1'.
+ *
+ * It will return an encoded DWORD in which each bit corresponds to
+ * the sampled value on the byte lane.
+ */
+uint32_t sample_dqs(struct mrc_params *mrc_params, uint8_t channel,
+		    uint8_t rank, bool rcvn)
+{
+	uint8_t j;	/* just a counter */
+	uint8_t bl;	/* which BL in the module (always 2 per module) */
+	uint8_t bl_grp;	/* which BL module */
+	/* byte lane divisor */
+	uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
+	uint32_t msk[2];	/* BLx in module */
+	/* DQTRAINSTS register contents for each sample */
+	uint32_t sampled_val[SAMPLE_SIZE];
+	uint32_t num_0s;	/* tracks the number of '0' samples */
+	uint32_t num_1s;	/* tracks the number of '1' samples */
+	uint32_t ret_val = 0x00;	/* assume all '0' samples */
+	uint32_t address = get_addr(channel, rank);
+
+	/* initialise msk[] */
+	msk[0] = (rcvn) ? (BIT1) : (BIT9);	/* BL0 */
+	msk[1] = (rcvn) ? (BIT0) : (BIT8);	/* BL1 */
+
+	/* cycle through each byte lane group */
+	for (bl_grp = 0; bl_grp < (NUM_BYTE_LANES / bl_divisor) / 2; bl_grp++) {
+		/* take SAMPLE_SIZE samples */
+		for (j = 0; j < SAMPLE_SIZE; j++) {
+			hte_mem_op(address, mrc_params->first_run,
+				   rcvn ? 0 : 1);
+			mrc_params->first_run = 0;
+
+			/*
+			 * record the contents of the proper
+			 * DQTRAINSTS register
+			 */
+			sampled_val[j] = msg_port_alt_read(DDRPHY,
+				(DQTRAINSTS +
+				(bl_grp * DDRIODQ_BL_OFFSET) +
+				(channel * DDRIODQ_CH_OFFSET)));
+		}
+
+		/*
+		 * look for a majority value (SAMPLE_SIZE / 2) + 1
+		 * on the byte lane and set that value in the corresponding
+		 * ret_val bit
+		 */
+		for (bl = 0; bl < 2; bl++) {
+			num_0s = 0x00;	/* reset '0' tracker for byte lane */
+			num_1s = 0x00;	/* reset '1' tracker for byte lane */
+			for (j = 0; j < SAMPLE_SIZE; j++) {
+				if (sampled_val[j] & msk[bl])
+					num_1s++;
+				else
+					num_0s++;
+			}
+			if (num_1s > num_0s)
+				ret_val |= (1 << (bl + (bl_grp * 2)));
+		}
+	}
+
+	/*
+	 * "ret_val.0" contains the status of BL0
+	 * "ret_val.1" contains the status of BL1
+	 * "ret_val.2" contains the status of BL2
+	 * etc.
+	 */
+	return ret_val;
+}
+
+/* This function will find the rising edge transition on RCVN or WDQS */
+void find_rising_edge(struct mrc_params *mrc_params, uint32_t delay[],
+		      uint8_t channel, uint8_t rank, bool rcvn)
+{
+	bool all_edges_found;	/* determines stop condition */
+	bool direction[NUM_BYTE_LANES];	/* direction indicator */
+	uint8_t sample;	/* sample counter */
+	uint8_t bl;	/* byte lane counter */
+	/* byte lane divisor */
+	uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
+	uint32_t sample_result[SAMPLE_CNT];	/* results of sample_dqs() */
+	uint32_t temp;
+	uint32_t transition_pattern;
+
+	ENTERFN();
+
+	/* select hte and request initial configuration */
+	select_hte();
+	mrc_params->first_run = 1;
+
+	/* Take 3 sample points (T1,T2,T3) to obtain a transition pattern */
+	for (sample = 0; sample < SAMPLE_CNT; sample++) {
+		/* program the desired delays for sample */
+		for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+			/* increase sample delay by 26 PI (0.2 CLK) */
+			if (rcvn) {
+				set_rcvn(channel, rank, bl,
+					 delay[bl] + (sample * SAMPLE_DLY));
+			} else {
+				set_wdqs(channel, rank, bl,
+					 delay[bl] + (sample * SAMPLE_DLY));
+			}
+		}
+
+		/* take samples (Tsample_i) */
+		sample_result[sample] = sample_dqs(mrc_params,
+			channel, rank, rcvn);
+
+		DPF(D_TRN,
+		    "Find rising edge %s ch%d rnk%d: #%d dly=%d dqs=%02X\n",
+		    (rcvn ? "RCVN" : "WDQS"), channel, rank, sample,
+		    sample * SAMPLE_DLY, sample_result[sample]);
+	}
+
+	/*
+	 * This pattern will help determine where we landed and ultimately
+	 * how to place RCVEN/WDQS.
+	 */
+	for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+		/* build transition_pattern (MSB is 1st sample) */
+		transition_pattern = 0;
+		for (sample = 0; sample < SAMPLE_CNT; sample++) {
+			transition_pattern |=
+				((sample_result[sample] & (1 << bl)) >> bl) <<
+				(SAMPLE_CNT - 1 - sample);
+		}
+
+		DPF(D_TRN, "=== transition pattern %d\n", transition_pattern);
+
+		/*
+		 * set up to look for rising edge based on
+		 * transition_pattern
+		 */
+		switch (transition_pattern) {
+		case 0:	/* sampled 0->0->0 */
+			/* move forward from T3 looking for 0->1 */
+			delay[bl] += 2 * SAMPLE_DLY;
+			direction[bl] = FORWARD;
+			break;
+		case 1:	/* sampled 0->0->1 */
+		case 5:	/* sampled 1->0->1 (bad duty cycle) *HSD#237503* */
+			/* move forward from T2 looking for 0->1 */
+			delay[bl] += 1 * SAMPLE_DLY;
+			direction[bl] = FORWARD;
+			break;
+		case 2:	/* sampled 0->1->0 (bad duty cycle) *HSD#237503* */
+		case 3:	/* sampled 0->1->1 */
+			/* move forward from T1 looking for 0->1 */
+			delay[bl] += 0 * SAMPLE_DLY;
+			direction[bl] = FORWARD;
+			break;
+		case 4:	/* sampled 1->0->0 (assumes BL8, HSD#234975) */
+			/* move forward from T3 looking for 0->1 */
+			delay[bl] += 2 * SAMPLE_DLY;
+			direction[bl] = FORWARD;
+			break;
+		case 6:	/* sampled 1->1->0 */
+		case 7:	/* sampled 1->1->1 */
+			/* move backward from T1 looking for 1->0 */
+			delay[bl] += 0 * SAMPLE_DLY;
+			direction[bl] = BACKWARD;
+			break;
+		default:
+			mrc_post_code(0xEE, 0xEE);
+			break;
+		}
+
+		/* program delays */
+		if (rcvn)
+			set_rcvn(channel, rank, bl, delay[bl]);
+		else
+			set_wdqs(channel, rank, bl, delay[bl]);
+	}
+
+	/*
+	 * Based on the observed transition pattern on the byte lane,
+	 * begin looking for a rising edge with single PI granularity.
+	 */
+	do {
+		all_edges_found = true;	/* assume all byte lanes passed */
+		/* take a sample */
+		temp = sample_dqs(mrc_params, channel, rank, rcvn);
+		/* check each byte lane for proper edge */
+		for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+			if (temp & (1 << bl)) {
+				/* sampled "1" */
+				if (direction[bl] == BACKWARD) {
+					/*
+					 * keep looking for edge
+					 * on this byte lane
+					 */
+					all_edges_found = false;
+					delay[bl] -= 1;
+					if (rcvn) {
+						set_rcvn(channel, rank,
+							 bl, delay[bl]);
+					} else {
+						set_wdqs(channel, rank,
+							 bl, delay[bl]);
+					}
+				}
+			} else {
+				/* sampled "0" */
+				if (direction[bl] == FORWARD) {
+					/*
+					 * keep looking for edge
+					 * on this byte lane
+					 */
+					all_edges_found = false;
+					delay[bl] += 1;
+					if (rcvn) {
+						set_rcvn(channel, rank,
+							 bl, delay[bl]);
+					} else {
+						set_wdqs(channel, rank,
+							 bl, delay[bl]);
+					}
+				}
+			}
+		}
+	} while (!all_edges_found);
+
+	/* restore DDR idle state */
+	dram_init_command(DCMD_PREA(rank));
+
+	DPF(D_TRN, "Delay %03X %03X %03X %03X\n",
+	    delay[0], delay[1], delay[2], delay[3]);
+
+	LEAVEFN();
+}
+
+/*
+ * This function will return a 32 bit mask that will be used to
+ * check for byte lane failures.
+ */
+uint32_t byte_lane_mask(struct mrc_params *mrc_params)
+{
+	uint32_t j;
+	uint32_t ret_val = 0x00;
+
+	/*
+	 * set ret_val based on NUM_BYTE_LANES such that you will check
+	 * only BL0 in result
+	 *
+	 * (each bit in result represents a byte lane)
+	 */
+	for (j = 0; j < MAX_BYTE_LANES; j += NUM_BYTE_LANES)
+		ret_val |= (1 << ((j / NUM_BYTE_LANES) * NUM_BYTE_LANES));
+
+	/*
+	 * HSD#235037
+	 * need to adjust the mask for 16-bit mode
+	 */
+	if (mrc_params->channel_width == X16)
+		ret_val |= (ret_val << 2);
+
+	return ret_val;
+}
+
+/*
+ * Check memory executing simple write/read/verify at the specified address.
+ *
+ * Bits in the result indicate failure on specific byte lane.
+ */
+uint32_t check_rw_coarse(struct mrc_params *mrc_params, uint32_t address)
+{
+	uint32_t result = 0;
+	uint8_t first_run = 0;
+
+	if (mrc_params->hte_setup) {
+		mrc_params->hte_setup = 0;
+		first_run = 1;
+		select_hte();
+	}
+
+	result = hte_basic_write_read(mrc_params, address,
+				      first_run, WRITE_TRAIN);
+
+	DPF(D_TRN, "check_rw_coarse result is %x\n", result);
+
+	return result;
+}
+
+/*
+ * Check memory executing write/read/verify of many data patterns
+ * at the specified address. Bits in the result indicate failure
+ * on specific byte lane.
+ */
+uint32_t check_bls_ex(struct mrc_params *mrc_params, uint32_t address)
+{
+	uint32_t result;
+	uint8_t first_run = 0;
+
+	if (mrc_params->hte_setup) {
+		mrc_params->hte_setup = 0;
+		first_run = 1;
+		select_hte();
+	}
+
+	result = hte_write_stress_bit_lanes(mrc_params, address, first_run);
+
+	DPF(D_TRN, "check_bls_ex result is %x\n", result);
+
+	return result;
+}
+
+/*
+ * 32-bit LFSR with characteristic polynomial: X^32 + X^22 + X^2 + X^1
+ *
+ * The function takes pointer to previous 32 bit value and
+ * modifies it to next value.
+ */
+void lfsr32(uint32_t *lfsr_ptr)
+{
+	uint32_t bit;
+	uint32_t lfsr;
+	int i;
+
+	lfsr = *lfsr_ptr;
+
+	for (i = 0; i < 32; i++) {
+		bit = 1 ^ (lfsr & BIT0);
+		bit = bit ^ ((lfsr & BIT1) >> 1);
+		bit = bit ^ ((lfsr & BIT2) >> 2);
+		bit = bit ^ ((lfsr & BIT22) >> 22);
+
+		lfsr = ((lfsr >> 1) | (bit << 31));
+	}
+
+	*lfsr_ptr = lfsr;
+}
+
+/* Clear the pointers in every byte lane of every channel */
+void clear_pointers(void)
+{
+	uint8_t channel;
+	uint8_t bl;
+
+	ENTERFN();
+
+	for (channel = 0; channel < NUM_CHANNELS; channel++) {
+		for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
+			mrc_alt_write_mask(DDRPHY,
+					   (B01PTRCTL1 +
+					   (channel * DDRIODQ_CH_OFFSET) +
+					   ((bl >> 1) * DDRIODQ_BL_OFFSET)),
+					   ~BIT8, BIT8);
+
+			mrc_alt_write_mask(DDRPHY,
+					   (B01PTRCTL1 +
+					   (channel * DDRIODQ_CH_OFFSET) +
+					   ((bl >> 1) * DDRIODQ_BL_OFFSET)),
+					   BIT8, BIT8);
+		}
+	}
+
+	LEAVEFN();
+}
+
+void print_timings(struct mrc_params *mrc_params)
+{
+	uint8_t algo;
+	uint8_t channel;
+	uint8_t rank;
+	uint8_t bl;
+	uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
+
+	DPF(D_INFO, "\n---------------------------");
+	DPF(D_INFO, "\nALGO[CH:RK] BL0 BL1 BL2 BL3");
+	DPF(D_INFO, "\n===========================");
+
+	for (algo = 0; algo < MAX_ALGOS; algo++) {
+		for (channel = 0; channel < NUM_CHANNELS; channel++) {
+			if (mrc_params->channel_enables & (1 << channel)) {
+				for (rank = 0; rank < NUM_RANKS; rank++) {
+					if (mrc_params->rank_enables &
+						(1 << rank)) {
+						switch (algo) {
+						case RCVN:
+							DPF(D_INFO,
+							    "\nRCVN[%02d:%02d]",
+							    channel, rank);
+							break;
+						case WDQS:
+							DPF(D_INFO,
+							    "\nWDQS[%02d:%02d]",
+							    channel, rank);
+							break;
+						case WDQX:
+							DPF(D_INFO,
+							    "\nWDQx[%02d:%02d]",
+							    channel, rank);
+							break;
+						case RDQS:
+							DPF(D_INFO,
+							    "\nRDQS[%02d:%02d]",
+							    channel, rank);
+							break;
+						case VREF:
+							DPF(D_INFO,
+							    "\nVREF[%02d:%02d]",
+							    channel, rank);
+							break;
+						case WCMD:
+							DPF(D_INFO,
+							    "\nWCMD[%02d:%02d]",
+							    channel, rank);
+							break;
+						case WCTL:
+							DPF(D_INFO,
+							    "\nWCTL[%02d:%02d]",
+							    channel, rank);
+							break;
+						case WCLK:
+							DPF(D_INFO,
+							    "\nWCLK[%02d:%02d]",
+							    channel, rank);
+							break;
+						default:
+							break;
+						}
+
+						for (bl = 0;
+						     bl < (NUM_BYTE_LANES / bl_divisor);
+						     bl++) {
+							switch (algo) {
+							case RCVN:
+								DPF(D_INFO,
+								    " %03d",
+								    get_rcvn(channel, rank, bl));
+								break;
+							case WDQS:
+								DPF(D_INFO,
+								    " %03d",
+								    get_wdqs(channel, rank, bl));
+								break;
+							case WDQX:
+								DPF(D_INFO,
+								    " %03d",
+								    get_wdq(channel, rank, bl));
+								break;
+							case RDQS:
+								DPF(D_INFO,
+								    " %03d",
+								    get_rdqs(channel, rank, bl));
+								break;
+							case VREF:
+								DPF(D_INFO,
+								    " %03d",
+								    get_vref(channel, bl));
+								break;
+							case WCMD:
+								DPF(D_INFO,
+								    " %03d",
+								    get_wcmd(channel));
+								break;
+							case WCTL:
+								DPF(D_INFO,
+								    " %03d",
+								    get_wctl(channel, rank));
+								break;
+							case WCLK:
+								DPF(D_INFO,
+								    " %03d",
+								    get_wclk(channel, rank));
+								break;
+							default:
+								break;
+							}
+						}
+					}
+				}
+			}
+		}
+	}
+
+	DPF(D_INFO, "\n---------------------------");
+	DPF(D_INFO, "\n");
+}
diff --git a/arch/x86/cpu/quark/mrc_util.h b/arch/x86/cpu/quark/mrc_util.h
new file mode 100644
index 0000000..edbe219
--- /dev/null
+++ b/arch/x86/cpu/quark/mrc_util.h
@@ -0,0 +1,153 @@
+/*
+ * Copyright (C) 2013, Intel Corporation
+ * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
+ *
+ * Ported from Intel released Quark UEFI BIOS
+ * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
+ *
+ * SPDX-License-Identifier:	Intel
+ */
+
+#ifndef _MRC_UTIL_H_
+#define _MRC_UTIL_H_
+
+/* Turn on this macro to enable MRC debugging output */
+#undef  MRC_DEBUG
+
+/* MRC Debug Support */
+#define DPF		debug_cond
+
+/* debug print type */
+
+#ifdef MRC_DEBUG
+#define D_ERROR		0x0001
+#define D_INFO		0x0002
+#define D_REGRD		0x0004
+#define D_REGWR		0x0008
+#define D_FCALL		0x0010
+#define D_TRN		0x0020
+#define D_TIME		0x0040
+#else
+#define D_ERROR		0
+#define D_INFO		0
+#define D_REGRD		0
+#define D_REGWR		0
+#define D_FCALL		0
+#define D_TRN		0
+#define D_TIME		0
+#endif
+
+#define ENTERFN(...)	debug_cond(D_FCALL, "<%s>\n", __func__)
+#define LEAVEFN(...)	debug_cond(D_FCALL, "</%s>\n", __func__)
+#define REPORTFN(...)	debug_cond(D_FCALL, "<%s/>\n", __func__)
+
+/* Generic Register Bits */
+#define BIT0		0x00000001
+#define BIT1		0x00000002
+#define BIT2		0x00000004
+#define BIT3		0x00000008
+#define BIT4		0x00000010
+#define BIT5		0x00000020
+#define BIT6		0x00000040
+#define BIT7		0x00000080
+#define BIT8		0x00000100
+#define BIT9		0x00000200
+#define BIT10		0x00000400
+#define BIT11		0x00000800
+#define BIT12		0x00001000
+#define BIT13		0x00002000
+#define BIT14		0x00004000
+#define BIT15		0x00008000
+#define BIT16		0x00010000
+#define BIT17		0x00020000
+#define BIT18		0x00040000
+#define BIT19		0x00080000
+#define BIT20		0x00100000
+#define BIT21		0x00200000
+#define BIT22		0x00400000
+#define BIT23		0x00800000
+#define BIT24		0x01000000
+#define BIT25		0x02000000
+#define BIT26		0x04000000
+#define BIT27		0x08000000
+#define BIT28		0x10000000
+#define BIT29		0x20000000
+#define BIT30		0x40000000
+#define BIT31		0x80000000
+
+/* Message Bus Port */
+#define MEM_CTLR	0x01
+#define HOST_BRIDGE	0x03
+#define MEM_MGR		0x05
+#define HTE		0x11
+#define DDRPHY		0x12
+
+/* number of sample points */
+#define SAMPLE_CNT	3
+/* number of PIs to increment per sample */
+#define SAMPLE_DLY	26
+
+enum {
+	/* indicates to decrease delays when looking for edge */
+	BACKWARD,
+	/* indicates to increase delays when looking for edge */
+	FORWARD
+};
+
+enum {
+	RCVN,
+	WDQS,
+	WDQX,
+	RDQS,
+	VREF,
+	WCMD,
+	WCTL,
+	WCLK,
+	MAX_ALGOS,
+};
+
+void mrc_write_mask(u32 unit, u32 addr, u32 data, u32 mask);
+void mrc_alt_write_mask(u32 unit, u32 addr, u32 data, u32 mask);
+void mrc_post_code(uint8_t major, uint8_t minor);
+void delay_n(uint32_t ns);
+void delay_u(uint32_t ms);
+void select_mem_mgr(void);
+void select_hte(void);
+void dram_init_command(uint32_t data);
+void dram_wake_command(void);
+void training_message(uint8_t channel, uint8_t rank, uint8_t byte_lane);
+
+void set_rcvn(uint8_t channel, uint8_t rank,
+	      uint8_t byte_lane, uint32_t pi_count);
+uint32_t get_rcvn(uint8_t channel, uint8_t rank, uint8_t byte_lane);
+void set_rdqs(uint8_t channel, uint8_t rank,
+	      uint8_t byte_lane, uint32_t pi_count);
+uint32_t get_rdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane);
+void set_wdqs(uint8_t channel, uint8_t rank,
+	      uint8_t byte_lane, uint32_t pi_count);
+uint32_t get_wdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane);
+void set_wdq(uint8_t channel, uint8_t rank,
+	     uint8_t byte_lane, uint32_t pi_count);
+uint32_t get_wdq(uint8_t channel, uint8_t rank, uint8_t byte_lane);
+void set_wcmd(uint8_t channel, uint32_t pi_count);
+uint32_t get_wcmd(uint8_t channel);
+void set_wclk(uint8_t channel, uint8_t rank, uint32_t pi_count);
+uint32_t get_wclk(uint8_t channel, uint8_t rank);
+void set_wctl(uint8_t channel, uint8_t rank, uint32_t pi_count);
+uint32_t get_wctl(uint8_t channel, uint8_t rank);
+void set_vref(uint8_t channel, uint8_t byte_lane, uint32_t setting);
+uint32_t get_vref(uint8_t channel, uint8_t byte_lane);
+
+uint32_t get_addr(uint8_t channel, uint8_t rank);
+uint32_t sample_dqs(struct mrc_params *mrc_params, uint8_t channel,
+		    uint8_t rank, bool rcvn);
+void find_rising_edge(struct mrc_params *mrc_params, uint32_t delay[],
+		      uint8_t channel, uint8_t rank, bool rcvn);
+uint32_t byte_lane_mask(struct mrc_params *mrc_params);
+uint32_t check_rw_coarse(struct mrc_params *mrc_params, uint32_t address);
+uint32_t check_bls_ex(struct mrc_params *mrc_params, uint32_t address);
+void lfsr32(uint32_t *lfsr_ptr);
+void clear_pointers(void);
+void print_timings(struct mrc_params *mrc_params);
+
+#endif /* _MRC_UTIL_H_ */
-- 
1.8.2.1
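
For readers following sample_dqs() and find_rising_edge() in this patch, the coarse step can be sketched in isolation. This is a minimal host-side illustration, not the MRC code itself: majority_bit() and transition_pattern() are hypothetical helper names, and the sample words in the note below are made up. It shows how a majority vote is taken over a status bit and how the three coarse samples of one byte lane are packed into the 3-bit pattern that the switch statement decodes (first sample in the MSB).

```c
#include <stdint.h>

#define SAMPLE_CNT	3	/* number of coarse sample points, as in mrc_util.h */

/* Hypothetical helper: majority vote of one status bit over n samples */
static int majority_bit(const uint32_t *samples, int n, uint32_t mask)
{
	int ones = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (samples[i] & mask)
			ones++;
	}

	/* '1' wins only when it outnumbers the '0' samples */
	return ones > n - ones;
}

/*
 * Hypothetical helper: pack the bit of byte lane 'bl' from each coarse
 * sample into a pattern, with the first sample in the MSB.
 */
static uint32_t transition_pattern(const uint32_t *sample_result,
				   unsigned int bl)
{
	uint32_t pattern = 0;
	int s;

	for (s = 0; s < SAMPLE_CNT; s++)
		pattern |= ((sample_result[s] >> bl) & 1u) <<
			   (SAMPLE_CNT - 1 - s);

	return pattern;
}
```

With the sample words 0x2, 0x3, 0x1, byte lane 0 reads 0->1->1 (pattern 3) and byte lane 1 reads 1->1->0 (pattern 6), which correspond to the FORWARD and BACKWARD cases of the switch respectively.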

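The feedback used by lfsr32() in this patch can also be looked at one shift at a time. The sketch below is an illustration under assumptions: lfsr32_step() and lfsr32_next() are hypothetical names, and only the tap positions (bits 0, 1, 2 and 22) and the inverted feedback are taken from the loop body in the patch.

```c
#include <stdint.h>

/* Hypothetical single-step variant of the loop body in lfsr32() */
static uint32_t lfsr32_step(uint32_t lfsr)
{
	uint32_t bit;

	/* inverted XOR of the taps at bits 0, 1, 2 and 22 */
	bit = 1u ^ (lfsr & 1u);
	bit ^= (lfsr >> 1) & 1u;
	bit ^= (lfsr >> 2) & 1u;
	bit ^= (lfsr >> 22) & 1u;

	/* shift right and feed the new bit in at the top */
	return (lfsr >> 1) | (bit << 31);
}

/* Advance by 32 shifts, which is what one lfsr32() call does */
static uint32_t lfsr32_next(uint32_t lfsr)
{
	int i;

	for (i = 0; i < 32; i++)
		lfsr = lfsr32_step(lfsr);

	return lfsr;
}
```

Because the feedback is inverted, an all-zero seed is not a stuck state (one step from 0 gives 0x80000000), whereas 0xFFFFFFFF maps to itself and should be avoided as a seed.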

* [U-Boot] [RFC PATCH 5/9] x86: quark: Add System Memory Controller support
  2015-02-03 11:45 [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support Bin Meng
                   ` (3 preceding siblings ...)
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 4/9] x86: quark: Add utility codes needed for MRC Bin Meng
@ 2015-02-03 11:45 ` Bin Meng
  2015-02-04 16:24   ` Simon Glass
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build Bin Meng
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-03 11:45 UTC (permalink / raw)
  To: u-boot

This code performs the actual memory initialization.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>

---
This is some of the ugliest code I have ever seen ...
There are 252 warnings and 127 checks in this patch, such as:

check: arch/x86/cpu/quark/smc.c,1609: Alignment should match open parenthesis
warning: arch/x86/cpu/quark/smc.c,1610: line over 80 characters
warning: arch/x86/cpu/quark/smc.c,1633: Too many leading tabs - consider code refactoring
...

Fixing 'Too many leading tabs ...' would be very dangerous, as I don't have
all the details on how Intel's MRC code is written to work with the
hardware. Trying to refactor it may lead to non-working MRC code.
The 'line over 80 characters' warnings have to stay as they are for now,
since they follow from the 'Too many leading tabs ...' nesting. Fixing the
'Alignment should match open parenthesis' checks would likely add more 'line
over 80 characters' warnings, so we have to bear with them. Sigh.
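
As background for the DTR programming in smc.c: timing parameters are given in picoseconds and have to be rounded up to whole DRAM clocks. The sketch below assumes that MCEIL(), which is defined outside this excerpt, is a plain ceiling division. ps_to_clocks() and twr_clocks() are hypothetical names used only for illustration. The t_ck values and the 15000 ps tWR figure are copied from this patch.

```c
#include <stdint.h>

/* clock period in picoseconds for speed index 800, 1066, 1333 (from smc.c) */
static const uint32_t t_ck[3] = { 2500, 1875, 1500 };

/* Hypothetical helper; assumes MCEIL() behaves as a ceiling division */
static uint32_t ps_to_clocks(uint32_t ps, uint32_t tck_ps)
{
	return (ps + tck_ps - 1) / tck_ps;
}

/* tWR is 15000 ps, as noted in prog_ddr_timing_control() */
static uint32_t twr_clocks(unsigned int speed_idx)
{
	return ps_to_clocks(15000, t_ck[speed_idx]);
}
```

Under this assumption tWR comes out as 6, 8 and 10 clocks for speed index 0, 1 and 2, and any value that is not an exact multiple of the clock period is rounded up to the next clock.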

 arch/x86/cpu/quark/smc.c | 2764 ++++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/cpu/quark/smc.h |  446 ++++++++
 2 files changed, 3210 insertions(+)
 create mode 100644 arch/x86/cpu/quark/smc.c
 create mode 100644 arch/x86/cpu/quark/smc.h

diff --git a/arch/x86/cpu/quark/smc.c b/arch/x86/cpu/quark/smc.c
new file mode 100644
index 0000000..fb389cd
--- /dev/null
+++ b/arch/x86/cpu/quark/smc.c
@@ -0,0 +1,2764 @@
+/*
+ * Copyright (C) 2013, Intel Corporation
+ * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
+ *
+ * Ported from Intel released Quark UEFI BIOS
+ * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
+ *
+ * SPDX-License-Identifier:	Intel
+ */
+
+#include <common.h>
+#include <pci.h>
+#include <asm/arch/device.h>
+#include <asm/arch/mrc.h>
+#include <asm/arch/msg_port.h>
+#include "mrc_util.h"
+#include "hte.h"
+#include "smc.h"
+
+/* t_rfc values (in picoseconds) per density */
+static const uint32_t t_rfc[5] = {
+	90000,	/* 512Mb */
+	110000,	/* 1Gb */
+	160000,	/* 2Gb */
+	300000,	/* 4Gb */
+	350000,	/* 8Gb */
+};
+
+/* t_ck clock period in picoseconds per speed index 800, 1066, 1333 */
+static const uint32_t t_ck[3] = {
+	2500,
+	1875,
+	1500
+};
+
+/* Global variables */
+static const uint16_t ddr_wclk[] = {193, 158};
+static const uint16_t ddr_wctl[] = {1, 217};
+static const uint16_t ddr_wcmd[] = {1, 220};
+
+#ifdef BACKUP_RCVN
+static const uint16_t ddr_rcvn[] = {129, 498};
+#endif
+
+#ifdef BACKUP_WDQS
+static const uint16_t ddr_wdqs[] = {65, 289};
+#endif
+
+#ifdef BACKUP_RDQS
+static const uint8_t ddr_rdqs[] = {32, 24};
+#endif
+
+#ifdef BACKUP_WDQ
+static const uint16_t ddr_wdq[] = {32, 257};
+#endif
+
+/* Stop self refresh driven by MCU */
+void clear_self_refresh(struct mrc_params *mrc_params)
+{
+	ENTERFN();
+
+	/* clear the PMSTS Channel Self Refresh bits */
+	mrc_write_mask(MEM_CTLR, PMSTS, BIT0, BIT0);
+
+	LEAVEFN();
+}
+
+/* Initialise the timing registers in the MCU (DTR0..DTR4) */
+void prog_ddr_timing_control(struct mrc_params *mrc_params)
+{
+	uint8_t tcl, wl;
+	uint8_t trp, trcd, tras, twr, twtr, trrd, trtp, tfaw;
+	uint32_t tck;
+	u32 dtr0, dtr1, dtr2, dtr3, dtr4;
+	u32 tmp1, tmp2;
+
+	ENTERFN();
+
+	/* mcu_init starts */
+	mrc_post_code(0x02, 0x00);
+
+	dtr0 = msg_port_read(MEM_CTLR, DTR0);
+	dtr1 = msg_port_read(MEM_CTLR, DTR1);
+	dtr2 = msg_port_read(MEM_CTLR, DTR2);
+	dtr3 = msg_port_read(MEM_CTLR, DTR3);
+	dtr4 = msg_port_read(MEM_CTLR, DTR4);
+
+	tck = t_ck[mrc_params->ddr_speed];	/* Clock in picoseconds */
+	tcl = mrc_params->params.cl;		/* CAS latency in clocks */
+	trp = tcl;	/* Per CAT MRC */
+	trcd = tcl;	/* Per CAT MRC */
+	tras = MCEIL(mrc_params->params.ras, tck);
+
+	/* Per JEDEC: tWR=15000ps DDR2/3 from 800-1600 */
+	twr = MCEIL(15000, tck);
+
+	twtr = MCEIL(mrc_params->params.wtr, tck);
+	trrd = MCEIL(mrc_params->params.rrd, tck);
+	trtp = 4;	/* Valid for 800 and 1066, use 5 for 1333 */
+	tfaw = MCEIL(mrc_params->params.faw, tck);
+
+	wl = 5 + mrc_params->ddr_speed;
+
+	dtr0 &= ~(BIT0 | BIT1);
+	dtr0 |= mrc_params->ddr_speed;
+	dtr0 &= ~(BIT12 | BIT13 | BIT14);
+	tmp1 = tcl - 5;
+	dtr0 |= ((tcl - 5) << 12);
+	dtr0 &= ~(BIT4 | BIT5 | BIT6 | BIT7);
+	dtr0 |= ((trp - 5) << 4);	/* 5 bit DRAM Clock */
+	dtr0 &= ~(BIT8 | BIT9 | BIT10 | BIT11);
+	dtr0 |= ((trcd - 5) << 8);	/* 5 bit DRAM Clock */
+
+	dtr1 &= ~(BIT0 | BIT1 | BIT2);
+	tmp2 = wl - 3;
+	dtr1 |= (wl - 3);
+	dtr1 &= ~(BIT8 | BIT9 | BIT10 | BIT11);
+	dtr1 |= ((wl + 4 + twr - 14) << 8);	/* Change to tWTP */
+	dtr1 &= ~(BIT28 | BIT29 | BIT30);
+	dtr1 |= ((MMAX(trtp, 4) - 3) << 28);	/* 4 bit DRAM Clock */
+	dtr1 &= ~(BIT24 | BIT25);
+	dtr1 |= ((trrd - 4) << 24);		/* 4 bit DRAM Clock */
+	dtr1 &= ~(BIT4 | BIT5);
+	dtr1 |= (1 << 4);
+	dtr1 &= ~(BIT20 | BIT21 | BIT22 | BIT23);
+	dtr1 |= ((tras - 14) << 20);		/* 6 bit DRAM Clock */
+	dtr1 &= ~(BIT16 | BIT17 | BIT18 | BIT19);
+	dtr1 |= ((((tfaw + 1) >> 1) - 5) << 16);/* 4 bit DRAM Clock */
+	/* Set 4 Clock CAS to CAS delay (multi-burst) */
+	dtr1 &= ~(BIT12 | BIT13);
+
+	dtr2 &= ~(BIT0 | BIT1 | BIT2);
+	dtr2 |= 1;
+	dtr2 &= ~(BIT8 | BIT9 | BIT10);
+	dtr2 |= (2 << 8);
+	dtr2 &= ~(BIT16 | BIT17 | BIT18 | BIT19);
+	dtr2 |= (2 << 16);
+
+	dtr3 &= ~(BIT0 | BIT1 | BIT2);
+	dtr3 |= 2;
+	dtr3 &= ~(BIT4 | BIT5 | BIT6);
+	dtr3 |= (2 << 4);
+
+	dtr3 &= ~(BIT8 | BIT9 | BIT10 | BIT11);
+	if (mrc_params->ddr_speed == DDRFREQ_800) {
+		/* Extended RW delay (+1) */
+		dtr3 |= ((tcl - 5 + 1) << 8);
+	} else if (mrc_params->ddr_speed == DDRFREQ_1066) {
+		/* Extended RW delay (+1) */
+		dtr3 |= ((tcl - 5 + 1) << 8);
+	}
+
+	dtr3 &= ~(BIT13 | BIT14 | BIT15 | BIT16);
+	dtr3 |= ((4 + wl + twtr - 11) << 13);
+
+	dtr3 &= ~(BIT22 | BIT23);
+	if (mrc_params->ddr_speed == DDRFREQ_800)
+		dtr3 |= ((MMAX(0, 1 - 1)) << 22);
+	else
+		dtr3 |= ((MMAX(0, 2 - 1)) << 22);
+
+	dtr4 &= ~(BIT0 | BIT1);
+	dtr4 |= 1;
+	dtr4 &= ~(BIT4 | BIT5 | BIT6);
+	dtr4 |= (1 << 4);
+	dtr4 &= ~(BIT8 | BIT9 | BIT10);
+	dtr4 |= ((1 + tmp1 - tmp2 + 2) << 8);
+	dtr4 &= ~(BIT12 | BIT13 | BIT14);
+	dtr4 |= ((1 + tmp1 - tmp2 + 2) << 12);
+	dtr4 &= ~(BIT15 | BIT16);
+
+	msg_port_write(MEM_CTLR, DTR0, dtr0);
+	msg_port_write(MEM_CTLR, DTR1, dtr1);
+	msg_port_write(MEM_CTLR, DTR2, dtr2);
+	msg_port_write(MEM_CTLR, DTR3, dtr3);
+	msg_port_write(MEM_CTLR, DTR4, dtr4);
+
+	LEAVEFN();
+}
+
+/* Configure MCU before jedec init sequence */
+void prog_decode_before_jedec(struct mrc_params *mrc_params)
+{
+	u32 drp;
+	u32 drfc;
+	u32 dcal;
+	u32 dsch;
+	u32 dpmc0;
+
+	ENTERFN();
+
+	/* Disable power saving features */
+	dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
+	dpmc0 |= (BIT24 | BIT25);
+	dpmc0 &= ~(BIT16 | BIT17 | BIT18);
+	dpmc0 &= ~BIT23;
+	msg_port_write(MEM_CTLR, DPMC0, dpmc0);
+
+	/* Disable out of order transactions */
+	dsch = msg_port_read(MEM_CTLR, DSCH);
+	dsch |= (BIT8 | BIT12);
+	msg_port_write(MEM_CTLR, DSCH, dsch);
+
+	/* Disable issuing the REF command */
+	drfc = msg_port_read(MEM_CTLR, DRFC);
+	drfc &= ~(BIT12 | BIT13 | BIT14);
+	msg_port_write(MEM_CTLR, DRFC, drfc);
+
+	/* Disable ZQ calibration short */
+	dcal = msg_port_read(MEM_CTLR, DCAL);
+	dcal &= ~(BIT8 | BIT9 | BIT10);
+	dcal &= ~(BIT12 | BIT13);
+	msg_port_write(MEM_CTLR, DCAL, dcal);
+
+	/*
+	 * Training is performed in address mode 0; rank population has limited
+	 * impact, but the simulator complains if a nonexistent rank is enabled.
+	 */
+	drp = 0;
+	if (mrc_params->rank_enables & 1)
+		drp |= BIT0;
+	if (mrc_params->rank_enables & 2)
+		drp |= BIT1;
+	msg_port_write(MEM_CTLR, DRP, drp);
+
+	LEAVEFN();
+}
+
+/*
+ * After Cold Reset, BIOS should set COLDWAKE bit to 1 before
+ * sending the WAKE message to the Dunit.
+ *
+ * For Standby Exit, or any other mode in which the DRAM is in
+ * SR, this bit must be set to 0.
+ */
+void perform_ddr_reset(struct mrc_params *mrc_params)
+{
+	ENTERFN();
+
+	/* Set COLDWAKE bit before sending the WAKE message */
+	mrc_write_mask(MEM_CTLR, DRMC, BIT16, BIT16);
+
+	/* Send wake command to DUNIT (MUST be done before JEDEC) */
+	dram_wake_command();
+
+	/* Set default value */
+	msg_port_write(MEM_CTLR, DRMC,
+		       (mrc_params->rd_odt_value == 0 ? BIT12 : 0));
+
+	LEAVEFN();
+}
+
+
+/*
+ * This function performs some initialization on the DDRIO unit.
+ * This function is dependent on BOARD_ID, DDR_SPEED, and CHANNEL_ENABLES.
+ */
+void ddrphy_init(struct mrc_params *mrc_params)
+{
+	uint32_t temp;
+	uint8_t ch;	/* channel counter */
+	uint8_t rk;	/* rank counter */
+	uint8_t bl_grp;	/* byte lane group counter (2 BLs per module) */
+	uint8_t bl_divisor = 1;	/* byte lane divisor */
+	/* For DDR3 --> 0 == 800, 1 == 1066, 2 == 1333 */
+	uint8_t speed = mrc_params->ddr_speed & (BIT1 | BIT0);
+	uint8_t cas;
+	uint8_t cwl;
+
+	ENTERFN();
+
+	cas = mrc_params->params.cl;
+	cwl = 5 + mrc_params->ddr_speed;
+
+	/* ddrphy_init starts */
+	mrc_post_code(0x03, 0x00);
+
+	/*
+	 * HSD#231531
+	 * Make sure IOBUFACT is deasserted before initializing the DDR PHY
+	 *
+	 * HSD#234845
+	 * Make sure WRPTRENABLE is deasserted before initializing the DDR PHY
+	 */
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			/* Deassert DDRPHY Initialization Complete */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)),
+				~BIT20, BIT20);	/* SPID_INIT_COMPLETE=0 */
+			/* Deassert IOBUFACT */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
+				~BIT2, BIT2);	/* IOBUFACTRST_N=0 */
+			/* Disable WRPTR */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDPTRREG + (ch * DDRIOCCC_CH_OFFSET)),
+				~BIT0, BIT0);	/* WRPTRENABLE=0 */
+		}
+	}
+
+	/* Put PHY in reset */
+	mrc_alt_write_mask(DDRPHY, MASTERRSTN, 0, BIT0);
+
+	/* Initialize DQ01, DQ23, CMD, CLK-CTL, COMP modules */
+
+	/* STEP0 */
+	mrc_post_code(0x03, 0x10);
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			/* DQ01-DQ23 */
+			for (bl_grp = 0;
+			     bl_grp < ((NUM_BYTE_LANES / bl_divisor)/2);
+			     bl_grp++) {
+				/* Analog MUX select - IO2xCLKSEL */
+				mrc_alt_write_mask(DDRPHY,
+					(DQOBSCKEBBCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					((bl_grp) ? (0x00) : (BIT22)), (BIT22));
+
+				/* ODT Strength */
+				switch (mrc_params->rd_odt_value) {
+				case 1:
+					temp = 0x3;
+					break;	/* 60 ohm */
+				case 2:
+					temp = 0x3;
+					break;	/* 120 ohm */
+				case 3:
+					temp = 0x3;
+					break;	/* 180 ohm */
+				default:
+					temp = 0x3;
+					break;	/* 120 ohm */
+				}
+
+				/* ODT strength */
+				mrc_alt_write_mask(DDRPHY,
+					(B0RXIOBUFCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(temp << 5), (BIT6 | BIT5));
+				/* ODT strength */
+				mrc_alt_write_mask(DDRPHY,
+					(B1RXIOBUFCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(temp << 5), (BIT6 | BIT5));
+
+				/* Dynamic ODT/DIFFAMP */
+				temp = (((cas) << 24) | ((cas) << 16) |
+					((cas) << 8) | ((cas) << 0));
+				switch (speed) {
+				case 0:
+					temp -= 0x01010101;
+					break;	/* 800 */
+				case 1:
+					temp -= 0x02020202;
+					break;	/* 1066 */
+				case 2:
+					temp -= 0x03030303;
+					break;	/* 1333 */
+				case 3:
+					temp -= 0x04040404;
+					break;	/* 1600 */
+				}
+
+				/* Launch Time: ODT, DIFFAMP, ODT, DIFFAMP */
+				mrc_alt_write_mask(DDRPHY,
+					(B01LATCTL1 +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					temp,
+					(BIT28 | BIT27 | BIT26 | BIT25 | BIT24 |
+					BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
+					BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
+					BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
+				switch (speed) {
+				/* HSD#234715 */
+				case 0:
+					temp = ((0x06 << 16) | (0x07 << 8));
+					break;	/* 800 */
+				case 1:
+					temp = ((0x07 << 16) | (0x08 << 8));
+					break;	/* 1066 */
+				case 2:
+					temp = ((0x09 << 16) | (0x0A << 8));
+					break;	/* 1333 */
+				case 3:
+					temp = ((0x0A << 16) | (0x0B << 8));
+					break;	/* 1600 */
+				}
+
+				/* On Duration: ODT, DIFFAMP */
+				mrc_alt_write_mask(DDRPHY,
+					(B0ONDURCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					temp,
+					(BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
+					BIT16 | BIT13 | BIT12 | BIT11 | BIT10 |
+					BIT9 | BIT8));
+				/* On Duration: ODT, DIFFAMP */
+				mrc_alt_write_mask(DDRPHY,
+					(B1ONDURCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					temp,
+					(BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
+					BIT16 | BIT13 | BIT12 | BIT11 | BIT10 |
+					BIT9 | BIT8));
+
+				switch (mrc_params->rd_odt_value) {
+				case 0:
+					/* override DIFFAMP=on, ODT=off */
+					temp = ((0x3F << 16) | (0x3f << 10));
+					break;
+				default:
+					/* override DIFFAMP=on, ODT=on */
+					temp = ((0x3F << 16) | (0x2A << 10));
+					break;
+				}
+
+				/* Override: DIFFAMP, ODT */
+				mrc_alt_write_mask(DDRPHY,
+					(B0OVRCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					temp,
+					(BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
+					BIT16 | BIT15 | BIT14 | BIT13 | BIT12 |
+					BIT11 | BIT10));
+				/* Override: DIFFAMP, ODT */
+				mrc_alt_write_mask(DDRPHY,
+					(B1OVRCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					temp,
+					(BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
+					BIT16 | BIT15 | BIT14 | BIT13 | BIT12 |
+					BIT11 | BIT10));
+
+				/* DLL Setup */
+
+				/* 1xCLK Domain Timings: tEDP,RCVEN,WDQS (PO) */
+				mrc_alt_write_mask(DDRPHY,
+					(B0LATCTL0 +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(((cas + 7) << 16) | ((cas - 4) << 8) |
+					((cwl - 2) << 0)),
+					(BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
+					BIT16 | BIT12 | BIT11 | BIT10 | BIT9 |
+					BIT8 | BIT4 | BIT3 | BIT2 | BIT1 |
+					BIT0));
+				mrc_alt_write_mask(DDRPHY,
+					(B1LATCTL0 +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(((cas + 7) << 16) | ((cas - 4) << 8) |
+					((cwl - 2) << 0)),
+					(BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
+					BIT16 | BIT12 | BIT11 | BIT10 | BIT9 |
+					BIT8 | BIT4 | BIT3 | BIT2 | BIT1 |
+					BIT0));
+
+				/* RCVEN Bypass (PO) */
+				mrc_alt_write_mask(DDRPHY,
+					(B0RXIOBUFCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					((0x0 << 7) | (0x0 << 0)),
+					(BIT7 | BIT0));
+				mrc_alt_write_mask(DDRPHY,
+					(B1RXIOBUFCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					((0x0 << 7) | (0x0 << 0)),
+					(BIT7 | BIT0));
+
+				/* TX */
+				mrc_alt_write_mask(DDRPHY,
+					(DQCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(BIT16), (BIT16));
+				mrc_alt_write_mask(DDRPHY,
+					(B01PTRCTL1 +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(BIT8), (BIT8));
+
+				/* RX (PO) */
+				/* Internal Vref Code, Enable#, Ext_or_Int (1=Ext) */
+				mrc_alt_write_mask(DDRPHY,
+					(B0VREFCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					((0x03 << 2) | (0x0 << 1) | (0x0 << 0)),
+					(BIT7 | BIT6 | BIT5 | BIT4 | BIT3 |
+					BIT2 | BIT1 | BIT0));
+				/* Internal Vref Code, Enable#, Ext_or_Int (1=Ext) */
+				mrc_alt_write_mask(DDRPHY,
+					(B1VREFCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					((0x03 << 2) | (0x0 << 1) | (0x0 << 0)),
+					(BIT7 | BIT6 | BIT5 | BIT4 | BIT3 |
+					BIT2 | BIT1 | BIT0));
+				/* Per-Bit De-Skew Enable */
+				mrc_alt_write_mask(DDRPHY,
+					(B0RXIOBUFCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(0), (BIT4));
+				/* Per-Bit De-Skew Enable */
+				mrc_alt_write_mask(DDRPHY,
+					(B1RXIOBUFCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(0), (BIT4));
+			}
+
+			/* CLKEBB */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDOBSCKEBBCTL + (ch * DDRIOCCC_CH_OFFSET)),
+				0, (BIT23));
+
+			/* Enable tristate control of cmd/address bus */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
+				0, (BIT1 | BIT0));
+
+			/* ODT RCOMP */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDRCOMPODT + (ch * DDRIOCCC_CH_OFFSET)),
+				((0x03 << 5) | (0x03 << 0)),
+				(BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 |
+				BIT3 | BIT2 | BIT1 | BIT0));
+
+			/* CMDPM* registers must be programmed in this order */
+
+			/* Turn On Delays: SFR (regulator), MPLL */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDPMDLYREG4 + (ch * DDRIOCCC_CH_OFFSET)),
+				((0xFFFFU << 16) | (0xFFFF << 0)),
+				0xFFFFFFFF);
+			/*
+			 * Delays: ASSERT_IOBUFACT_to_ALLON0_for_PM_MSG_3,
+			 * VREG (MDLL) Turn On, ALLON0_to_DEASSERT_IOBUFACT
+			 * for_PM_MSG_gt0, MDLL Turn On
+			 */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDPMDLYREG3 + (ch * DDRIOCCC_CH_OFFSET)),
+				((0xFU << 28) | (0xFFF << 16) | (0xF << 12) |
+				(0x616 << 0)), 0xFFFFFFFF);
+			/* MPLL Divider Reset Delays */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDPMDLYREG2 + (ch * DDRIOCCC_CH_OFFSET)),
+				((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) |
+				(0xFF << 0)), 0xFFFFFFFF);
+			/* Turn Off Delays: VREG, Staggered MDLL, MDLL, PI */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDPMDLYREG1 + (ch * DDRIOCCC_CH_OFFSET)),
+				((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) |
+				(0xFF << 0)), 0xFFFFFFFF);
+			/* Turn On Delays: MPLL, Staggered MDLL, PI, IOBUFACT */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDPMDLYREG0 + (ch * DDRIOCCC_CH_OFFSET)),
+				((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) |
+				(0xFF << 0)), 0xFFFFFFFF);
+			/* Allow PUnit signals */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)),
+				((0x6 << 8) | BIT6 | (0x4 << 0)),
+				(BIT31 | BIT30 | BIT29 | BIT28 | BIT27 | BIT26 |
+				BIT25 | BIT24 | BIT23 | BIT22 | BIT21 | BIT11 |
+				BIT10 | BIT9 | BIT8 | BIT6 | BIT3 | BIT2 |
+				BIT1 | BIT0));
+			/* DLL_VREG Bias Trim, VREF Tuning for DLL_VREG */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
+				((0x3 << 4) | (0x7 << 0)),
+				(BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 |
+				BIT0));
+
+			/* CLK-CTL */
+			mrc_alt_write_mask(DDRPHY,
+				(CCOBSCKEBBCTL + (ch * DDRIOCCC_CH_OFFSET)),
+				0, BIT24);	/* CLKEBB */
+			/* Buffer Enable: CS,CKE,ODT,CLK */
+			mrc_alt_write_mask(DDRPHY,
+				(CCCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
+				((0x0 << 16) | (0x0 << 12) | (0x0 << 8) |
+				(0xF << 4) | BIT0),
+				(BIT19 | BIT18 | BIT17 | BIT16 | BIT15 | BIT14 |
+				BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
+				BIT7 | BIT6 | BIT5 | BIT4 | BIT0));
+			/* ODT RCOMP */
+			mrc_alt_write_mask(DDRPHY,
+				(CCRCOMPODT + (ch * DDRIOCCC_CH_OFFSET)),
+				((0x03 << 8) | (0x03 << 0)),
+				(BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | BIT4 |
+				BIT3 | BIT2 | BIT1 | BIT0));
+			/* DLL_VREG Bias Trim, VREF Tuning for DLL_VREG */
+			mrc_alt_write_mask(DDRPHY,
+				(CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
+				((0x3 << 4) | (0x7 << 0)),
+				(BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 |
+				BIT0));
+
+			/*
+			 * COMP (RON channel specific)
+			 * - DQ/DQS/DM RON: 32 Ohm
+			 * - CTRL/CMD RON: 27 Ohm
+			 * - CLK RON: 26 Ohm
+			 */
+			/* RCOMP Vref PU/PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				((0x08 << 24) | (0x03 << 16)),
+				(BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
+				BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
+				BIT17 | BIT16));
+			/* RCOMP Vref PU/PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				((0x0C << 24) | (0x03 << 16)),
+				(BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
+				BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
+				BIT17 | BIT16));
+			/* RCOMP Vref PU/PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				((0x0F << 24) | (0x03 << 16)),
+				(BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
+				BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
+				BIT17 | BIT16));
+			/* RCOMP Vref PU/PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				((0x08 << 24) | (0x03 << 16)),
+				(BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
+				BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
+				BIT17 | BIT16));
+			/* RCOMP Vref PU/PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CTLVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				((0x0C << 24) | (0x03 << 16)),
+				(BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
+				BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
+				BIT17 | BIT16));
+
+			/* DQS Swapped Input Enable */
+			mrc_alt_write_mask(DDRPHY,
+				(COMPEN1CH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT19 | BIT17),
+				(BIT31 | BIT30 | BIT19 | BIT17 |
+				BIT15 | BIT14));
+
+			/* ODT VREF = 1.5 x 274/(360+274) = 0.65V (code of ~50) */
+			/* ODT Vref PU/PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				((0x32 << 8) | (0x03 << 0)),
+				(BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
+				BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
+			/* ODT Vref PU/PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				((0x32 << 8) | (0x03 << 0)),
+				(BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
+				BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
+			/* ODT Vref PU/PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				((0x0E << 8) | (0x05 << 0)),
+				(BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
+				BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
+
+			/*
+			 * Slew rate settings are frequency specific,
+			 * numbers below are for 800MHz (speed == 0)
+			 * - DQ/DQS/DM/CLK SR: 4V/ns,
+			 * - CTRL/CMD SR: 1.5V/ns
+			 */
+			temp = (0x0E << 16) | (0x0E << 12) | (0x08 << 8) |
+				(0x0B << 4) | (0x0B << 0);
+			/* DCOMP Delay Select: CTL,CMD,CLK,DQS,DQ */
+			mrc_alt_write_mask(DDRPHY,
+				(DLYSELCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				temp,
+				(BIT19 | BIT18 | BIT17 | BIT16 | BIT15 |
+				BIT14 | BIT13 | BIT12 | BIT11 | BIT10 |
+				BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 |
+				BIT3 | BIT2 | BIT1 | BIT0));
+			/* TCO Vref CLK,DQS,DQ */
+			mrc_alt_write_mask(DDRPHY,
+				(TCOVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				((0x05 << 16) | (0x05 << 8) | (0x05 << 0)),
+				(BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
+				BIT16 | BIT13 | BIT12 | BIT11 | BIT10 |
+				BIT9 | BIT8 | BIT5 | BIT4 | BIT3 | BIT2 |
+				BIT1 | BIT0));
+			/* ODTCOMP CMD/CTL PU/PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CCBUFODTCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				((0x03 << 8) | (0x03 << 0)),
+				(BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
+				BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
+			/* COMP */
+			mrc_alt_write_mask(DDRPHY,
+				(COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)),
+				0, (BIT31 | BIT30 | BIT8));
+
+#ifdef BACKUP_COMPS
+			/* DQ COMP Overrides */
+			/* RCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(DQDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0A << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* RCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0A << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* DCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(DQDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x10 << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* DCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x10 << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* ODTCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(DQODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0B << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* ODTCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0B << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* TCOCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(DQTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31), (BIT31));
+			/* TCOCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31), (BIT31));
+
+			/* DQS COMP Overrides */
+			/* RCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0A << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* RCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0A << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* DCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x10 << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* DCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x10 << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* ODTCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0B << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* ODTCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0B << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* TCOCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31), (BIT31));
+			/* TCOCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31), (BIT31));
+
+			/* CLK COMP Overrides */
+			/* RCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0C << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* RCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0C << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* DCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x07 << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* DCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x07 << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* ODTCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0B << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* ODTCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0B << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* TCOCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31), (BIT31));
+			/* TCOCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31), (BIT31));
+
+			/* CMD COMP Overrides */
+			/* RCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0D << 16)),
+				(BIT31 | BIT21 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* RCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0D << 16)),
+				(BIT31 | BIT21 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* DCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0A << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* DCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0A << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+
+			/* CTL COMP Overrides */
+			/* RCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(CTLDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0D << 16)),
+				(BIT31 | BIT21 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* RCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CTLDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0D << 16)),
+				(BIT31 | BIT21 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* DCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(CTLDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0A << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* DCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CTLDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x0A << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+#else
+			/* DQ TCOCOMP Overrides */
+			/* TCOCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(DQTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x1F << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* TCOCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x1F << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+
+			/* DQS TCOCOMP Overrides */
+			/* TCOCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x1F << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* TCOCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(DQSTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x1F << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+
+			/* CLK TCOCOMP Overrides */
+			/* TCOCOMP PU */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x1F << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+			/* TCOCOMP PD */
+			mrc_alt_write_mask(DDRPHY,
+				(CLKTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
+				(BIT31 | (0x1F << 16)),
+				(BIT31 | BIT20 | BIT19 |
+				BIT18 | BIT17 | BIT16));
+#endif
+
+			/* program STATIC delays */
+#ifdef BACKUP_WCMD
+			set_wcmd(ch, ddr_wcmd[PLATFORM_ID]);
+#else
+			set_wcmd(ch, ddr_wclk[PLATFORM_ID] + HALF_CLK);
+#endif
+
+			for (rk = 0; rk < NUM_RANKS; rk++) {
+				if (mrc_params->rank_enables & (1 << rk)) {
+					set_wclk(ch, rk, ddr_wclk[PLATFORM_ID]);
+#ifdef BACKUP_WCTL
+					set_wctl(ch, rk, ddr_wctl[PLATFORM_ID]);
+#else
+					set_wctl(ch, rk, ddr_wclk[PLATFORM_ID] + HALF_CLK);
+#endif
+				}
+			}
+		}
+	}
+
+	/* COMP (non channel specific) */
+	/* RCOMP: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (DQANADRVPUCTL), (BIT30), (BIT30));
+	/* RCOMP: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (DQANADRVPDCTL), (BIT30), (BIT30));
+	/* RCOMP: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (CMDANADRVPUCTL), (BIT30), (BIT30));
+	/* RCOMP: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (CMDANADRVPDCTL), (BIT30), (BIT30));
+	/* RCOMP: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (CLKANADRVPUCTL), (BIT30), (BIT30));
+	/* RCOMP: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (CLKANADRVPDCTL), (BIT30), (BIT30));
+	/* RCOMP: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (DQSANADRVPUCTL), (BIT30), (BIT30));
+	/* RCOMP: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (DQSANADRVPDCTL), (BIT30), (BIT30));
+	/* RCOMP: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (CTLANADRVPUCTL), (BIT30), (BIT30));
+	/* RCOMP: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (CTLANADRVPDCTL), (BIT30), (BIT30));
+	/* ODT: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (DQANAODTPUCTL), (BIT30), (BIT30));
+	/* ODT: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (DQANAODTPDCTL), (BIT30), (BIT30));
+	/* ODT: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (CLKANAODTPUCTL), (BIT30), (BIT30));
+	/* ODT: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (CLKANAODTPDCTL), (BIT30), (BIT30));
+	/* ODT: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (DQSANAODTPUCTL), (BIT30), (BIT30));
+	/* ODT: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (DQSANAODTPDCTL), (BIT30), (BIT30));
+	/* DCOMP: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (DQANADLYPUCTL), (BIT30), (BIT30));
+	/* DCOMP: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (DQANADLYPDCTL), (BIT30), (BIT30));
+	/* DCOMP: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (CMDANADLYPUCTL), (BIT30), (BIT30));
+	/* DCOMP: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (CMDANADLYPDCTL), (BIT30), (BIT30));
+	/* DCOMP: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (CLKANADLYPUCTL), (BIT30), (BIT30));
+	/* DCOMP: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (CLKANADLYPDCTL), (BIT30), (BIT30));
+	/* DCOMP: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (DQSANADLYPUCTL), (BIT30), (BIT30));
+	/* DCOMP: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (DQSANADLYPDCTL), (BIT30), (BIT30));
+	/* DCOMP: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (CTLANADLYPUCTL), (BIT30), (BIT30));
+	/* DCOMP: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (CTLANADLYPDCTL), (BIT30), (BIT30));
+	/* TCO: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (DQANATCOPUCTL), (BIT30), (BIT30));
+	/* TCO: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (DQANATCOPDCTL), (BIT30), (BIT30));
+	/* TCO: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (CLKANATCOPUCTL), (BIT30), (BIT30));
+	/* TCO: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (CLKANATCOPDCTL), (BIT30), (BIT30));
+	/* TCO: Dither PU Enable */
+	mrc_alt_write_mask(DDRPHY, (DQSANATCOPUCTL), (BIT30), (BIT30));
+	/* TCO: Dither PD Enable */
+	mrc_alt_write_mask(DDRPHY, (DQSANATCOPDCTL), (BIT30), (BIT30));
+	/* TCOCOMP: Pulse Count */
+	mrc_alt_write_mask(DDRPHY, (TCOCNTCTRL), (0x1 << 0), (BIT1 | BIT0));
+	/* ODT: CMD/CTL PD/PU */
+	mrc_alt_write_mask(DDRPHY,
+		(CHNLBUFSTATIC), ((0x03 << 24) | (0x03 << 16)),
+		(BIT28 | BIT27 | BIT26 | BIT25 | BIT24 |
+		BIT20 | BIT19 | BIT18 | BIT17 | BIT16));
+	/* Set 1us counter */
+	mrc_alt_write_mask(DDRPHY,
+		(MSCNTR), (0x64 << 0),
+		(BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
+	mrc_alt_write_mask(DDRPHY,
+		(LATCH1CTL), (0x1 << 28),
+		(BIT30 | BIT29 | BIT28));
+
+	/* Release PHY from reset */
+	mrc_alt_write_mask(DDRPHY, MASTERRSTN, BIT0, BIT0);
+
+	/* STEP1 */
+	mrc_post_code(0x03, 0x11);
+
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			/* DQ01-DQ23 */
+			for (bl_grp = 0;
+			     bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
+			     bl_grp++) {
+				mrc_alt_write_mask(DDRPHY,
+					(DQMDLLCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(BIT13),
+					(BIT13));	/* Enable VREG */
+				delay_n(3);
+			}
+
+			/* ECC */
+			mrc_alt_write_mask(DDRPHY, (ECCMDLLCTL),
+				(BIT13), (BIT13));	/* Enable VREG */
+			delay_n(3);
+			/* CMD */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
+				(BIT13), (BIT13));	/* Enable VREG */
+			delay_n(3);
+			/* CLK-CTL */
+			mrc_alt_write_mask(DDRPHY,
+				(CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
+				(BIT13), (BIT13));	/* Enable VREG */
+			delay_n(3);
+		}
+	}
+
+	/* STEP2 */
+	mrc_post_code(0x03, 0x12);
+	delay_n(200);
+
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			/* DQ01-DQ23 */
+			for (bl_grp = 0;
+			     bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
+			     bl_grp++) {
+				mrc_alt_write_mask(DDRPHY,
+					(DQMDLLCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(BIT17),
+					(BIT17));	/* Enable MCDLL */
+				delay_n(50);
+			}
+
+			/* ECC */
+			mrc_alt_write_mask(DDRPHY, (ECCMDLLCTL),
+				(BIT17), (BIT17));	/* Enable MCDLL */
+			delay_n(50);
+			/* CMD */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
+				(BIT18), (BIT18));	/* Enable MCDLL */
+			delay_n(50);
+			/* CLK-CTL */
+			mrc_alt_write_mask(DDRPHY,
+				(CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
+				(BIT18), (BIT18));	/* Enable MCDLL */
+			delay_n(50);
+		}
+	}
+
+	/* STEP3 */
+	mrc_post_code(0x03, 0x13);
+	delay_n(100);
+
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			/* DQ01-DQ23 */
+			for (bl_grp = 0;
+			     bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
+			     bl_grp++) {
+#ifdef FORCE_16BIT_DDRIO
+				temp = ((bl_grp) &&
+					(mrc_params->channel_width == X16)) ?
+					((0x1 << 12) | (0x1 << 8) |
+					(0xF << 4) | (0xF << 0)) :
+					((0xF << 12) | (0xF << 8) |
+					(0xF << 4) | (0xF << 0));
+#else
+				temp = ((0xF << 12) | (0xF << 8) |
+					(0xF << 4) | (0xF << 0));
+#endif
+				/* Enable TXDLL */
+				mrc_alt_write_mask(DDRPHY,
+					(DQDLLTXCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					temp, 0xFFFF);
+				delay_n(3);
+				/* Enable RXDLL */
+				mrc_alt_write_mask(DDRPHY,
+					(DQDLLRXCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(BIT3 | BIT2 | BIT1 | BIT0),
+					(BIT3 | BIT2 | BIT1 | BIT0));
+				delay_n(3);
+				/* Enable RXDLL Overrides BL0 */
+				mrc_alt_write_mask(DDRPHY,
+					(B0OVRCTL +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(BIT3 | BIT2 | BIT1 | BIT0),
+					(BIT3 | BIT2 | BIT1 | BIT0));
+			}
+
+			/* ECC */
+			temp = ((0xF << 12) | (0xF << 8) |
+				(0xF << 4) | (0xF << 0));
+			mrc_alt_write_mask(DDRPHY, (ECCDLLTXCTL),
+				temp, 0xFFFF);
+			delay_n(3);
+
+			/* CMD (PO) */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDDLLTXCTL + (ch * DDRIOCCC_CH_OFFSET)),
+				temp, 0xFFFF);
+			delay_n(3);
+		}
+	}
+
+	/* STEP4 */
+	mrc_post_code(0x03, 0x14);
+
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			/* Host To Memory Clock Alignment (HMC) for 800/1066 */
+			for (bl_grp = 0;
+			     bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
+			     bl_grp++) {
+				/* CLK_ALIGN_MOD_ID */
+				mrc_alt_write_mask(DDRPHY,
+					(DQCLKALIGNREG2 +
+					(bl_grp * DDRIODQ_BL_OFFSET) +
+					(ch * DDRIODQ_CH_OFFSET)),
+					(bl_grp) ? (0x3) : (0x1),
+					(BIT3 | BIT2 | BIT1 | BIT0));
+			}
+
+			mrc_alt_write_mask(DDRPHY,
+				(ECCCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)),
+				0x2,
+				(BIT3 | BIT2 | BIT1 | BIT0));
+			mrc_alt_write_mask(DDRPHY,
+				(CMDCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)),
+				0x0,
+				(BIT3 | BIT2 | BIT1 | BIT0));
+			mrc_alt_write_mask(DDRPHY,
+				(CCCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)),
+				0x2,
+				(BIT3 | BIT2 | BIT1 | BIT0));
+			mrc_alt_write_mask(DDRPHY,
+				(CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET)),
+				(0x2 << 4), (BIT5 | BIT4));
+			/*
+			 * NUM_SAMPLES, MAX_SAMPLES,
+			 * MACRO_PI_STEP, MICRO_PI_STEP
+			 */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDCLKALIGNREG1 + (ch * DDRIOCCC_CH_OFFSET)),
+				((0x18 << 16) | (0x10 << 8) |
+				(0x8 << 2) | (0x1 << 0)),
+				(BIT22 | BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
+				BIT16 | BIT14 | BIT13 | BIT12 | BIT11 | BIT10 |
+				BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 |
+				BIT2 | BIT1 | BIT0));
+			/* TOTAL_NUM_MODULES, FIRST_U_PARTITION */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDCLKALIGNREG2 + (ch * DDRIOCCC_CH_OFFSET)),
+				((0x10 << 16) | (0x4 << 8) | (0x2 << 4)),
+				(BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
+				BIT11 | BIT10 | BIT9 | BIT8 | BIT7 | BIT6 |
+				BIT5 | BIT4));
+#ifdef HMC_TEST
+			/* START_CLK_ALIGN=1 */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET)),
+				BIT24, BIT24);
+			while (msg_port_alt_read(DDRPHY,
+				(CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET))) &
+				BIT24)
+				;	/* wait for START_CLK_ALIGN=0 */
+#endif
+
+			/* Set RD/WR Pointer Separation & COUNTEN & FIFOPTREN */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDPTRREG + (ch * DDRIOCCC_CH_OFFSET)),
+				BIT0, BIT0);	/* WRPTRENABLE=1 */
+
+			/* COMP initial */
+			/* enable bypass for CLK buffer (PO) */
+			mrc_alt_write_mask(DDRPHY,
+				(COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)),
+				BIT5, BIT5);
+			/* Initial COMP Enable */
+			mrc_alt_write_mask(DDRPHY, (CMPCTRL),
+				(BIT0), (BIT0));
+			/* wait for Initial COMP Enable = 0 */
+			while (msg_port_alt_read(DDRPHY, (CMPCTRL)) & BIT0)
+				;
+			/* disable bypass for CLK buffer (PO) */
+			mrc_alt_write_mask(DDRPHY,
+				(COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)),
+				~BIT5, BIT5);
+
+			/* IOBUFACT */
+
+			/* STEP4a */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
+				BIT2, BIT2);	/* IOBUFACTRST_N=1 */
+
+			/* DDRPHY initialisation complete */
+			mrc_alt_write_mask(DDRPHY,
+				(CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)),
+				BIT20, BIT20);	/* SPID_INIT_COMPLETE=1 */
+		}
+	}
+
+	LEAVEFN();
+}
+
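A note on the helper used throughout the PHY setup above: every call passes a data word and a mask. The following is a minimal sketch of the read-modify-write this implies, assuming only the bits set in the mask are replaced. write_mask() is a hypothetical stand-in that operates on a plain value; it is not the actual mrc_alt_write_mask() implementation, which goes through the message port.

```c
#include <stdint.h>

/*
 * Hypothetical model of a masked register update (assumption: bits
 * outside 'mask' keep their old value, bits inside take 'data').
 */
static uint32_t write_mask(uint32_t old, uint32_t data, uint32_t mask)
{
	return (old & ~mask) | (data & mask);
}

/* Build a field value the way the calls above do: (value << shift). */
static uint32_t field(uint32_t value, unsigned int shift)
{
	return value << shift;
}
```

Under that assumption, the "disable bypass for CLK buffer" call, which passes ~BIT5 as data and BIT5 as mask, clears bit 5 only: write_mask(0xFFFFFFFF, ~(1u << 5), 1u << 5) gives 0xFFFFFFDF. Passing 0 as data with the same mask would have the same effect.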
+/* This function performs JEDEC initialisation on all enabled channels */
+void perform_jedec_init(struct mrc_params *mrc_params)
+{
+	uint8_t twr, wl, rank;
+	uint32_t tck;
+	u32 dtr0;
+	u32 drp;
+	u32 drmc;
+	u32 mrs0_cmd = 0;
+	u32 emrs1_cmd = 0;
+	u32 emrs2_cmd = 0;
+	u32 emrs3_cmd = 0;
+
+	ENTERFN();
+
+	/* jedec_init starts */
+	mrc_post_code(0x04, 0x00);
+
+	/* DDR3_RESET_SET=0, DDR3_RESET_RESET=1 */
+	mrc_alt_write_mask(DDRPHY, CCDDR3RESETCTL, BIT1, (BIT8 | BIT1));
+
+	/* Assert RESET# for 200us */
+	delay_u(200);
+
+	/* DDR3_RESET_SET=1, DDR3_RESET_RESET=0 */
+	mrc_alt_write_mask(DDRPHY, CCDDR3RESETCTL, BIT8, (BIT8 | BIT1));
+
+	dtr0 = msg_port_read(MEM_CTLR, DTR0);
+
+	/*
+	 * Set CKEVAL for populated ranks
+	 * then send NOP to each rank (#4550197)
+	 */
+
+	drp = msg_port_read(MEM_CTLR, DRP);
+	drp &= 0x3;
+
+	drmc = msg_port_read(MEM_CTLR, DRMC);
+	drmc &= 0xFFFFFFFC;
+	drmc |= (BIT4 | drp);
+
+	msg_port_write(MEM_CTLR, DRMC, drmc);
+
+	for (rank = 0; rank < NUM_RANKS; rank++) {
+		/* Skip to next populated rank */
+		if ((mrc_params->rank_enables & (1 << rank)) == 0)
+			continue;
+
+		dram_init_command(DCMD_NOP(rank));
+	}
+
+	msg_port_write(MEM_CTLR, DRMC,
+		(mrc_params->rd_odt_value == 0 ? BIT12 : 0));
+
+	/*
+	 * setup for emrs 2
+	 * BIT[15:11] --> Always "0"
+	 * BIT[10:09] --> Rtt_WR: want "Dynamic ODT Off" (0)
+	 * BIT[08]    --> Always "0"
+	 * BIT[07]    --> SRT: use sr_temp_range
+	 * BIT[06]    --> ASR: want "Manual SR Reference" (0)
+	 * BIT[05:03] --> CWL: use oem_tCWL
+	 * BIT[02:00] --> PASR: want "Full Array" (0)
+	 */
+	emrs2_cmd |= (2 << 3);
+	wl = 5 + mrc_params->ddr_speed;
+	emrs2_cmd |= ((wl - 5) << 9);
+	emrs2_cmd |= (mrc_params->sr_temp_range << 13);
+
+	/*
+	 * setup for emrs 3
+	 * BIT[15:03] --> Always "0"
+	 * BIT[02]    --> MPR: want "Normal Operation" (0)
+	 * BIT[01:00] --> MPR_Loc: want "Predefined Pattern" (0)
+	 */
+	emrs3_cmd |= (3 << 3);
+
+	/*
+	 * setup for emrs 1
+	 * BIT[15:13]     --> Always "0"
+	 * BIT[12:12]     --> Qoff: want "Output Buffer Enabled" (0)
+	 * BIT[11:11]     --> TDQS: want "Disabled" (0)
+	 * BIT[10:10]     --> Always "0"
+	 * BIT[09,06,02]  --> Rtt_nom: use rtt_nom_value
+	 * BIT[08]        --> Always "0"
+	 * BIT[07]        --> WR_LVL: want "Disabled" (0)
+	 * BIT[05,01]     --> DIC: use ron_value
+	 * BIT[04:03]     --> AL: additive latency want "0" (0)
+	 * BIT[00]        --> DLL: want "Enable" (0)
+	 *
+	 * (BIT5|BIT1) set Ron value
+	 * 00 --> RZQ/6 (40ohm)
+	 * 01 --> RZQ/7 (34ohm)
+	 * 1* --> RESERVED
+	 *
+	 * (BIT9|BIT6|BIT2) set Rtt_nom value
+	 * 000 --> Disabled
+	 * 001 --> RZQ/4 ( 60ohm)
+	 * 010 --> RZQ/2 (120ohm)
+	 * 011 --> RZQ/6 ( 40ohm)
+	 * 1** --> RESERVED
+	 */
+	emrs1_cmd |= (1 << 3);
+	emrs1_cmd &= ~BIT6;
+
+	if (mrc_params->ron_value == 0)
+		emrs1_cmd |= BIT7;
+	else
+		emrs1_cmd &= ~BIT7;
+
+	if (mrc_params->rtt_nom_value == 0)
+		emrs1_cmd |= (DDR3_EMRS1_RTTNOM_40 << 6);
+	else if (mrc_params->rtt_nom_value == 1)
+		emrs1_cmd |= (DDR3_EMRS1_RTTNOM_60 << 6);
+	else if (mrc_params->rtt_nom_value == 2)
+		emrs1_cmd |= (DDR3_EMRS1_RTTNOM_120 << 6);
+
+	/* save MRS1 value (excluding control fields) */
+	mrc_params->mrs1 = emrs1_cmd >> 6;
+
+	/*
+	 * setup for mrs 0
+	 * BIT[15:13]     --> Always "0"
+	 * BIT[12]        --> PPD: for Quark (1)
+	 * BIT[11:09]     --> WR: use oem_tWR
+	 * BIT[08]        --> DLL: want "Reset" (1, self clearing)
+	 * BIT[07]        --> MODE: want "Normal" (0)
+	 * BIT[06:04,02]  --> CL: use oem_tCAS
+	 * BIT[03]        --> RD_BURST_TYPE: want "Interleave" (1)
+	 * BIT[01:00]     --> BL: want "8 Fixed" (0)
+	 * WR:
+	 * 0 --> 16
+	 * 1 --> 5
+	 * 2 --> 6
+	 * 3 --> 7
+	 * 4 --> 8
+	 * 5 --> 10
+	 * 6 --> 12
+	 * 7 --> 14
+	 * CL:
+	 * BIT[02:02] "0" if oem_tCAS <= 11 (1866?)
+	 * BIT[06:04] use oem_tCAS-4
+	 */
+	mrs0_cmd |= BIT14;
+	mrs0_cmd |= BIT18;
+	mrs0_cmd |= ((((dtr0 >> 12) & 7) + 1) << 10);
+
+	tck = t_ck[mrc_params->ddr_speed];
+	/* Per JEDEC: tWR=15000ps DDR2/3 from 800-1600 */
+	twr = MCEIL(15000, tck);
+	mrs0_cmd |= ((twr - 4) << 15);
+
+	for (rank = 0; rank < NUM_RANKS; rank++) {
+		/* Skip to next populated rank */
+		if ((mrc_params->rank_enables & (1 << rank)) == 0)
+			continue;
+
+		emrs2_cmd |= (rank << 22);
+		dram_init_command(emrs2_cmd);
+
+		emrs3_cmd |= (rank << 22);
+		dram_init_command(emrs3_cmd);
+
+		emrs1_cmd |= (rank << 22);
+		dram_init_command(emrs1_cmd);
+
+		mrs0_cmd |= (rank << 22);
+		dram_init_command(mrs0_cmd);
+
+		dram_init_command(DCMD_ZQCL(rank));
+	}
+
+	LEAVEFN();
+}
+
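For reviewers checking the MRS0 write recovery field above: a small sketch of the tWR arithmetic, assuming MCEIL() is a plain ceiling division and that the clock periods are the standard 2500 ps (DDR3-800) and 1875 ps (DDR3-1066). The function names are hypothetical and only mirror the expressions used in perform_jedec_init().

```c
#include <stdint.h>

/* Ceiling division; assumed to match what MCEIL() does in the patch. */
static uint32_t ceil_div(uint32_t num, uint32_t den)
{
	return (num + den - 1) / den;
}

/* tWR in clocks for a 15000 ps write recovery time and tCK in ps. */
static uint32_t twr_clocks(uint32_t tck_ps)
{
	return ceil_div(15000, tck_ps);
}

/* WR field value as computed above: (twr - 4), before the shift. */
static uint32_t mrs0_wr_field(uint32_t tck_ps)
{
	return twr_clocks(tck_ps) - 4;
}
```

At 2500 ps this gives tWR = 6 clocks and a field value of 2, and at 1875 ps it gives 8 clocks and a field value of 4. Both agree with the WR table in the comment (2 --> 6, 4 --> 8).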
+/*
+ * Dunit Initialisation Complete
+ *
+ * Indicates that initialisation of the Dunit has completed.
+ *
+ * Memory accesses are permitted and maintenance operation begins.
+ * Until this bit is set to a 1, the memory controller will not accept
+ * DRAM requests from the MEMORY_MANAGER or HTE.
+ */
+void set_ddr_init_complete(struct mrc_params *mrc_params)
+{
+	u32 dco;
+
+	ENTERFN();
+
+	dco = msg_port_read(MEM_CTLR, DCO);
+	dco &= ~BIT28;
+	dco |= BIT31;
+	msg_port_write(MEM_CTLR, DCO, dco);
+
+	LEAVEFN();
+}
+
+/*
+ * This function restores previously saved timing data
+ *
+ * Reusing this data on subsequent boots speeds up boot times
+ * and is required for Suspend To RAM capabilities.
+ */
+void restore_timings(struct mrc_params *mrc_params)
+{
+	uint8_t ch, rk, bl;
+	const struct mrc_timings *mt = &mrc_params->timings;
+
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		for (rk = 0; rk < NUM_RANKS; rk++) {
+			for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
+				set_rcvn(ch, rk, bl, mt->rcvn[ch][rk][bl]);
+				set_rdqs(ch, rk, bl, mt->rdqs[ch][rk][bl]);
+				set_wdqs(ch, rk, bl, mt->wdqs[ch][rk][bl]);
+				set_wdq(ch, rk, bl, mt->wdq[ch][rk][bl]);
+				if (rk == 0) {
+					/* VREF (RANK0 only) */
+					set_vref(ch, bl, mt->vref[ch][bl]);
+				}
+			}
+			set_wctl(ch, rk, mt->wctl[ch][rk]);
+		}
+		set_wcmd(ch, mt->wcmd[ch]);
+	}
+}
+
+/*
+ * Configure default settings normally set as part of read training
+ *
+ * Some defaults have to be set earlier as they may affect earlier
+ * training steps.
+ */
+void default_timings(struct mrc_params *mrc_params)
+{
+	uint8_t ch, rk, bl;
+
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		for (rk = 0; rk < NUM_RANKS; rk++) {
+			for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
+				set_rdqs(ch, rk, bl, 24);
+				if (rk == 0) {
+					/* VREF (RANK0 only) */
+					set_vref(ch, bl, 32);
+				}
+			}
+		}
+	}
+}
+
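The receive enable calibration in rcvn_cal() works in PI units. This is a sketch of the delay arithmetic its comments describe, assuming one full clock is 128 PI and a quarter clock is 32 PI as those comments state. The SKETCH_* constants and function names are hypothetical and are not the patch's FULL_CLK/QRTR_CLK definitions.

```c
#include <stdint.h>

/* Assumed PI counts, taken from the comments in rcvn_cal(). */
#define SKETCH_FULL_CLK	128u
#define SKETCH_QRTR_CLK	(SKETCH_FULL_CLK / 4)

/* Starting delay: (4 + 1) full clocks, as set before the edge search. */
static uint32_t start_delay(void)
{
	return (4 + 1) * SKETCH_FULL_CLK;
}

/*
 * Step a delay back by one full clock. Returns 0 and updates *delay on
 * success, or -1 (leaving *delay unchanged) when there is not enough
 * delay left, which is the case the calibration reports as an error.
 */
static int step_back_one_clk(uint32_t *delay)
{
	if (*delay < SKETCH_FULL_CLK)
		return -1;
	*delay -= SKETCH_FULL_CLK;
	return 0;
}
```

With these assumptions the starting delay is 640 PI. A lane sitting at 160 PI can be stepped back once to 32 PI, and a second step is refused.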
+/*
+ * This function will perform our RCVEN Calibration Algorithm.
+ * We will only use the 2xCLK domain timings to perform RCVEN Calibration.
+ * All byte lanes will be calibrated "simultaneously" per channel per rank.
+ */
+void rcvn_cal(struct mrc_params *mrc_params)
+{
+	uint8_t ch;	/* channel counter */
+	uint8_t rk;	/* rank counter */
+	uint8_t bl;	/* byte lane counter */
+	uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
+
+#ifdef R2R_SHARING
+	/* used to find placement for rank2rank sharing configs */
+	uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
+#ifndef BACKUP_RCVN
+	/* used to find placement for rank2rank sharing configs */
+	uint32_t num_ranks_enabled = 0;
+#endif
+#endif
+
+#ifdef BACKUP_RCVN
+#else
+	uint32_t temp;
+	/* absolute PI value to be programmed on the byte lane */
+	uint32_t delay[NUM_BYTE_LANES];
+	u32 dtr1, dtr1_save;
+#endif
+
+	ENTERFN();
+
+	/* rcvn_cal starts */
+	mrc_post_code(0x05, 0x00);
+
+#ifndef BACKUP_RCVN
+	/* need separate burst to sample DQS preamble */
+	dtr1 = msg_port_read(MEM_CTLR, DTR1);
+	dtr1_save = dtr1;
+	dtr1 |= BIT12;
+	msg_port_write(MEM_CTLR, DTR1, dtr1);
+#endif
+
+#ifdef R2R_SHARING
+	/* need to set "final_delay[][]" elements to "0" */
+	memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
+#endif
+
+	/* loop through each enabled channel */
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			/* perform RCVEN Calibration on a per rank basis */
+			for (rk = 0; rk < NUM_RANKS; rk++) {
+				if (mrc_params->rank_enables & (1 << rk)) {
+					/*
+					 * POST_CODE here indicates the current
+					 * channel and rank being calibrated
+					 */
+					mrc_post_code(0x05, (0x10 + ((ch << 4) | rk)));
+
+#ifdef BACKUP_RCVN
+					/* set hard-coded timing values */
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++)
+						set_rcvn(ch, rk, bl, ddr_rcvn[PLATFORM_ID]);
+#else
+					/* enable FIFORST */
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl += 2) {
+						mrc_alt_write_mask(DDRPHY,
+							(B01PTRCTL1 +
+							((bl >> 1) * DDRIODQ_BL_OFFSET) +
+							(ch * DDRIODQ_CH_OFFSET)),
+							0, BIT8);
+					}
+					/* initialize the starting delay to 128 PI (cas +1 CLK) */
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						/* 1x CLK domain timing is cas-4 */
+						delay[bl] = (4 + 1) * FULL_CLK;
+
+						set_rcvn(ch, rk, bl, delay[bl]);
+					}
+
+					/* now find the rising edge */
+					find_rising_edge(mrc_params, delay, ch, rk, true);
+
+					/* Now increase delay by 32 PI (1/4 CLK) to place in center of high pulse */
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						delay[bl] += QRTR_CLK;
+						set_rcvn(ch, rk, bl, delay[bl]);
+					}
+					/* Now decrement delay by 128 PI (1 CLK) until we sample a "0" */
+					do {
+						temp = sample_dqs(mrc_params, ch, rk, true);
+						for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+							if (temp & (1 << bl)) {
+								if (delay[bl] >= FULL_CLK) {
+									delay[bl] -= FULL_CLK;
+									set_rcvn(ch, rk, bl, delay[bl]);
+								} else {
+									/* not enough delay */
+									training_message(ch, rk, bl);
+									mrc_post_code(0xEE, 0x50);
+								}
+							}
+						}
+					} while (temp & 0xFF);
+
+#ifdef R2R_SHARING
+					/* increment "num_ranks_enabled" */
+					num_ranks_enabled++;
+					/* Finally increment delay by 32 PI (1/4 CLK) to place in center of preamble */
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						delay[bl] += QRTR_CLK;
+						/* add "delay[]" values to "final_delay[][]" for rolling average */
+						final_delay[ch][bl] += delay[bl];
+						/* set timing based on rolling average values */
+						set_rcvn(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled));
+					}
+#else
+					/* Finally increment delay by 32 PI (1/4 CLK) to place in center of preamble */
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						delay[bl] += QRTR_CLK;
+						set_rcvn(ch, rk, bl, delay[bl]);
+					}
+#endif
+
+					/* disable FIFORST */
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl += 2) {
+						mrc_alt_write_mask(DDRPHY,
+							(B01PTRCTL1 +
+							((bl >> 1) * DDRIODQ_BL_OFFSET) +
+							(ch * DDRIODQ_CH_OFFSET)),
+							BIT8, BIT8);
+					}
+#endif
+				}
+			}
+		}
+	}
+
+#ifndef BACKUP_RCVN
+	/* restore original */
+	msg_port_write(MEM_CTLR, DTR1, dtr1_save);
+#endif
+
+	LEAVEFN();
+}
+
+/*
+ * This function will perform the Write Levelling algorithm
+ * (align WCLK and WDQS).
+ *
+ * This algorithm will act on each rank in each channel separately.
+ */
+void wr_level(struct mrc_params *mrc_params)
+{
+	uint8_t ch;	/* channel counter */
+	uint8_t rk;	/* rank counter */
+	uint8_t bl;	/* byte lane counter */
+	uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
+
+#ifdef R2R_SHARING
+	/* used to find placement for rank2rank sharing configs */
+	uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
+#ifndef BACKUP_WDQS
+	/* used to find placement for rank2rank sharing configs */
+	uint32_t num_ranks_enabled = 0;
+#endif
+#endif
+
+#ifdef BACKUP_WDQS
+#else
+	/* determines stop condition for CRS_WR_LVL */
+	bool all_edges_found;
+	/* absolute PI value to be programmed on the byte lane */
+	uint32_t delay[NUM_BYTE_LANES];
+	/*
+	 * static makes it so the data is loaded in the heap once by shadow(),
+	 * where non-static copies the data onto the stack every time this
+	 * function is called
+	 */
+	uint32_t address;	/* address to be checked during COARSE_WR_LVL */
+	u32 dtr4, dtr4_save;
+#endif
+
+	ENTERFN();
+
+	/* wr_level starts */
+	mrc_post_code(0x06, 0x00);
+
+#ifdef R2R_SHARING
+	/* need to set "final_delay[][]" elements to "0" */
+	memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
+#endif
+
+	/* loop through each enabled channel */
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			/* perform WRITE LEVELING algorithm on a per rank basis */
+			for (rk = 0; rk < NUM_RANKS; rk++) {
+				if (mrc_params->rank_enables & (1 << rk)) {
+					/*
+					 * POST_CODE here indicates the current
+					 * rank and channel being calibrated
+					 */
+					mrc_post_code(0x06, (0x10 + ((ch << 4) | rk)));
+
+#ifdef BACKUP_WDQS
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						set_wdqs(ch, rk, bl, ddr_wdqs[PLATFORM_ID]);
+						set_wdq(ch, rk, bl, (ddr_wdqs[PLATFORM_ID] - QRTR_CLK));
+					}
+#else
+					/*
+					 * perform a single PRECHARGE_ALL command to
+					 * make DRAM state machine go to IDLE state
+					 */
+					dram_init_command(DCMD_PREA(rk));
+
+					/*
+					 * enable Write Levelling Mode
+					 * (EMRS1 w/ Write Levelling Mode Enable)
+					 */
+					dram_init_command(DCMD_MRS1(rk, 0x0082));
+
+					/*
+					 * set ODT DRAM Full Time Termination
+					 * disable in MCU
+					 */
+
+					dtr4 = msg_port_read(MEM_CTLR, DTR4);
+					dtr4_save = dtr4;
+					dtr4 |= BIT15;
+					msg_port_write(MEM_CTLR, DTR4, dtr4);
+
+					for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) {
+						/*
+						 * Enable Sandy Bridge Mode (WDQ Tri-State) &
+						 * Ensure 5 WDQS pulses during Write Leveling
+						 */
+						mrc_alt_write_mask(DDRPHY,
+							DQCTL + (DDRIODQ_BL_OFFSET * bl) + (DDRIODQ_CH_OFFSET * ch),
+							(BIT28 | BIT8 | BIT6 | BIT4 | BIT2),
+							(BIT28 | BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2));
+					}
+
+					/* Write Leveling Mode enabled in IO */
+					mrc_alt_write_mask(DDRPHY,
+						CCDDR3RESETCTL + (DDRIOCCC_CH_OFFSET * ch),
+						BIT16, BIT16);
+
+					/* Initialize the starting delay to WCLK */
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						/*
+						 * CLK0 --> RK0
+						 * CLK1 --> RK1
+						 */
+						delay[bl] = get_wclk(ch, rk);
+
+						set_wdqs(ch, rk, bl, delay[bl]);
+					}
+
+					/* now find the rising edge */
+					find_rising_edge(mrc_params, delay, ch, rk, false);
+
+					/* disable Write Levelling Mode */
+					mrc_alt_write_mask(DDRPHY,
+						CCDDR3RESETCTL + (DDRIOCCC_CH_OFFSET * ch),
+						0, BIT16);
+
+					for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) {
+						/* Disable Sandy Bridge Mode & Ensure 4 WDQS pulses during normal operation */
+						mrc_alt_write_mask(DDRPHY,
+							DQCTL + (DDRIODQ_BL_OFFSET * bl) + (DDRIODQ_CH_OFFSET * ch),
+							(BIT8 | BIT6 | BIT4 | BIT2),
+							(BIT28 | BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2));
+					}
+
+					/* restore original DTR4 */
+					msg_port_write(MEM_CTLR, DTR4, dtr4_save);
+
+					/*
+					 * restore original value
+					 * (Write Levelling Mode Disable)
+					 */
+					dram_init_command(DCMD_MRS1(rk, mrc_params->mrs1));
+
+					/*
+					 * perform a single PRECHARGE_ALL command to
+					 * make DRAM state machine go to IDLE state
+					 */
+					dram_init_command(DCMD_PREA(rk));
+
+					mrc_post_code(0x06, (0x30 + ((ch << 4) | rk)));
+
+					/*
+					 * COARSE WRITE LEVEL:
+					 * check that we're on the correct clock edge
+					 */
+
+					/* hte reconfiguration request */
+					mrc_params->hte_setup = 1;
+
+					/* start CRS_WR_LVL with WDQS = WDQS + 128 PI */
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						delay[bl] = get_wdqs(ch, rk, bl) + FULL_CLK;
+						set_wdqs(ch, rk, bl, delay[bl]);
+						/*
+						 * program WDQ timings based on WDQS
+						 * (WDQ = WDQS - 32 PI)
+						 */
+						set_wdq(ch, rk, bl, (delay[bl] - QRTR_CLK));
+					}
+
+					/* get an address in the targeted channel/rank */
+					address = get_addr(ch, rk);
+					do {
+						uint32_t coarse_result = 0x00;
+						uint32_t coarse_result_mask = byte_lane_mask(mrc_params);
+						/* assume pass */
+						all_edges_found = true;
+
+						mrc_params->hte_setup = 1;
+						coarse_result = check_rw_coarse(mrc_params, address);
+
+						/* check for failures and margin the byte lane back 128 PI (1 CLK) */
+						for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+							if (coarse_result & (coarse_result_mask << bl)) {
+								all_edges_found = false;
+								delay[bl] -= FULL_CLK;
+								set_wdqs(ch, rk, bl, delay[bl]);
+								/* program WDQ timings based on WDQS (WDQ = WDQS - 32 PI) */
+								set_wdq(ch, rk, bl, (delay[bl] - QRTR_CLK));
+							}
+						}
+					} while (!all_edges_found);
+
+#ifdef R2R_SHARING
+					/* increment "num_ranks_enabled" */
+					num_ranks_enabled++;
+					/* accumulate "final_delay[][]" values from "delay[]" values for rolling average */
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						final_delay[ch][bl] += delay[bl];
+						set_wdqs(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled));
+						/* program WDQ timings based on WDQS (WDQ = WDQS - 32 PI) */
+						set_wdq(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled) - QRTR_CLK);
+					}
+#endif
+#endif
+				}
+			}
+		}
+	}
+
+	LEAVEFN();
+}
+
+void prog_page_ctrl(struct mrc_params *mrc_params)
+{
+	u32 dpmc0;
+
+	ENTERFN();
+
+	dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
+	dpmc0 &= ~(BIT16 | BIT17 | BIT18);
+	dpmc0 |= (4 << 16);
+	dpmc0 |= BIT21;
+	msg_port_write(MEM_CTLR, DPMC0, dpmc0);
+}
+
+/*
+ * This function will perform the READ TRAINING Algorithm on all
+ * channels/ranks/byte_lanes simultaneously to minimize execution time.
+ *
+ * The idea here is to train the VREF and RDQS (and eventually RDQ) values
+ * to achieve maximum READ margins. The algorithm will first determine the
+ * X coordinate (RDQS setting). This is done by collapsing the VREF eye
+ * until we find a minimum required RDQS eye for VREF_MIN and VREF_MAX.
+ * Then we take the averages of the RDQS eye at VREF_MIN and VREF_MAX,
+ * then average those; this will be the final X coordinate. The algorithm
+ * will then determine the Y coordinate (VREF setting). This is done by
+ * collapsing the RDQS eye until we find a minimum required VREF eye for
+ * RDQS_MIN and RDQS_MAX. Then we take the averages of the VREF eye at
+ * RDQS_MIN and RDQS_MAX, then average those; this will be the final Y
+ * coordinate.
+ *
+ * NOTE: this algorithm assumes the eye curves have a one-to-one relationship,
+ * meaning for each X the curve has only one Y and vice versa.
+ */
+void rd_train(struct mrc_params *mrc_params)
+{
+	uint8_t ch;	/* channel counter */
+	uint8_t rk;	/* rank counter */
+	uint8_t bl;	/* byte lane counter */
+	uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
+#ifdef BACKUP_RDQS
+#else
+	uint8_t side_x;	/* tracks LEFT/RIGHT approach vectors */
+	uint8_t side_y;	/* tracks BOTTOM/TOP approach vectors */
+	/* X coordinate data (passing RDQS values) for approach vectors */
+	uint8_t x_coordinate[2][2][NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
+	/* Y coordinate data (passing VREF values) for approach vectors */
+	uint8_t y_coordinate[2][2][NUM_CHANNELS][NUM_BYTE_LANES];
+	/* centered X (RDQS) */
+	uint8_t x_center[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
+	/* centered Y (VREF) */
+	uint8_t y_center[NUM_CHANNELS][NUM_BYTE_LANES];
+	uint32_t address;	/* target address for check_bls_ex() */
+	uint32_t result;	/* result of check_bls_ex() */
+	uint32_t bl_mask;	/* byte lane mask for result checking */
+#ifdef R2R_SHARING
+	/* used to find placement for rank2rank sharing configs */
+	uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
+	/* used to find placement for rank2rank sharing configs */
+	uint32_t num_ranks_enabled = 0;
+#endif
+#endif
+
+	/* rd_train starts */
+	mrc_post_code(0x07, 0x00);
+
+	ENTERFN();
+
+#ifdef BACKUP_RDQS
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			for (rk = 0; rk < NUM_RANKS; rk++) {
+				if (mrc_params->rank_enables & (1 << rk)) {
+					for (bl = 0;
+					     bl < (NUM_BYTE_LANES / bl_divisor);
+					     bl++) {
+						set_rdqs(ch, rk, bl, ddr_rdqs[PLATFORM_ID]);
+					}
+				}
+			}
+		}
+	}
+#else
+	/* initialise x/y_coordinate arrays */
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			for (rk = 0; rk < NUM_RANKS; rk++) {
+				if (mrc_params->rank_enables & (1 << rk)) {
+					for (bl = 0;
+					     bl < (NUM_BYTE_LANES / bl_divisor);
+					     bl++) {
+						/* x_coordinate */
+						x_coordinate[L][B][ch][rk][bl] = RDQS_MIN;
+						x_coordinate[R][B][ch][rk][bl] = RDQS_MAX;
+						x_coordinate[L][T][ch][rk][bl] = RDQS_MIN;
+						x_coordinate[R][T][ch][rk][bl] = RDQS_MAX;
+						/* y_coordinate */
+						y_coordinate[L][B][ch][bl] = VREF_MIN;
+						y_coordinate[R][B][ch][bl] = VREF_MIN;
+						y_coordinate[L][T][ch][bl] = VREF_MAX;
+						y_coordinate[R][T][ch][bl] = VREF_MAX;
+					}
+				}
+			}
+		}
+	}
+
+	/* initialize other variables */
+	bl_mask = byte_lane_mask(mrc_params);
+	address = get_addr(0, 0);
+
+#ifdef R2R_SHARING
+	/* need to set "final_delay[][]" elements to "0" */
+	memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
+#endif
+
+	/* look for passing coordinates */
+	for (side_y = B; side_y <= T; side_y++) {
+		for (side_x = L; side_x <= R; side_x++) {
+			mrc_post_code(0x07, (0x10 + (side_y * 2) + (side_x)));
+
+			/* find passing values */
+			for (ch = 0; ch < NUM_CHANNELS; ch++) {
+				if (mrc_params->channel_enables & (0x1 << ch)) {
+					for (rk = 0; rk < NUM_RANKS; rk++) {
+						if (mrc_params->rank_enables &
+							(0x1 << rk)) {
+							/* set x/y_coordinate search starting settings */
+							for (bl = 0;
+							     bl < (NUM_BYTE_LANES / bl_divisor);
+							     bl++) {
+								set_rdqs(ch, rk, bl,
+									 x_coordinate[side_x][side_y][ch][rk][bl]);
+								set_vref(ch, bl,
+									 y_coordinate[side_x][side_y][ch][bl]);
+							}
+
+							/* get an address in the target channel/rank */
+							address = get_addr(ch, rk);
+
+							/* request HTE reconfiguration */
+							mrc_params->hte_setup = 1;
+
+							/* test the settings */
+							do {
+								/* result[07:00] == failing byte lane (MAX 8) */
+								result = check_bls_ex(mrc_params, address);
+
+								/* check for failures */
+								if (result & 0xFF) {
+									/* at least 1 byte lane failed */
+									for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+										if (result &
+											(bl_mask << bl)) {
+											/* adjust the RDQS values accordingly */
+											if (side_x == L)
+												x_coordinate[L][side_y][ch][rk][bl] += RDQS_STEP;
+											else
+												x_coordinate[R][side_y][ch][rk][bl] -= RDQS_STEP;
+
+											/* check that we haven't closed the RDQS_EYE too much */
+											if ((x_coordinate[L][side_y][ch][rk][bl] > (RDQS_MAX - MIN_RDQS_EYE)) ||
+												(x_coordinate[R][side_y][ch][rk][bl] < (RDQS_MIN + MIN_RDQS_EYE)) ||
+												(x_coordinate[L][side_y][ch][rk][bl] ==
+												x_coordinate[R][side_y][ch][rk][bl])) {
+												/*
+												 * not enough RDQS margin available at this VREF
+												 * update VREF values accordingly
+												 */
+												if (side_y == B)
+													y_coordinate[side_x][B][ch][bl] += VREF_STEP;
+												else
+													y_coordinate[side_x][T][ch][bl] -= VREF_STEP;
+
+												/* check that we haven't closed the VREF_EYE too much */
+												if ((y_coordinate[side_x][B][ch][bl] > (VREF_MAX - MIN_VREF_EYE)) ||
+													(y_coordinate[side_x][T][ch][bl] < (VREF_MIN + MIN_VREF_EYE)) ||
+													(y_coordinate[side_x][B][ch][bl] == y_coordinate[side_x][T][ch][bl])) {
+													/* VREF_EYE collapsed below MIN_VREF_EYE */
+													training_message(ch, rk, bl);
+													mrc_post_code(0xEE, (0x70 + (side_y * 2) + (side_x)));
+												} else {
+													/* update the VREF setting */
+													set_vref(ch, bl, y_coordinate[side_x][side_y][ch][bl]);
+													/* reset the X coordinate to begin the search at the new VREF */
+													x_coordinate[side_x][side_y][ch][rk][bl] =
+														(side_x == L) ? (RDQS_MIN) : (RDQS_MAX);
+												}
+											}
+
+											/* update the RDQS setting */
+											set_rdqs(ch, rk, bl, x_coordinate[side_x][side_y][ch][rk][bl]);
+										}
+									}
+								}
+							} while (result & 0xFF);
+						}
+					}
+				}
+			}
+		}
+	}
+
+	mrc_post_code(0x07, 0x20);
+
+	/* find final RDQS (X coordinate) & final VREF (Y coordinate) */
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			for (rk = 0; rk < NUM_RANKS; rk++) {
+				if (mrc_params->rank_enables & (1 << rk)) {
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						uint32_t temp1;
+						uint32_t temp2;
+
+						/* x_coordinate */
+						DPF(D_INFO,
+						    "RDQS T/B eye rank%d lane%d : %d-%d %d-%d\n",
+						    rk, bl,
+						    x_coordinate[L][T][ch][rk][bl],
+						    x_coordinate[R][T][ch][rk][bl],
+						    x_coordinate[L][B][ch][rk][bl],
+						    x_coordinate[R][B][ch][rk][bl]);
+
+						/* average the TOP side LEFT & RIGHT values */
+						temp1 = (x_coordinate[R][T][ch][rk][bl] + x_coordinate[L][T][ch][rk][bl]) / 2;
+						/* average the BOTTOM side LEFT & RIGHT values */
+						temp2 = (x_coordinate[R][B][ch][rk][bl] + x_coordinate[L][B][ch][rk][bl]) / 2;
+						/* average the above averages */
+						x_center[ch][rk][bl] = (uint8_t) ((temp1 + temp2) / 2);
+
+						/* y_coordinate */
+						DPF(D_INFO,
+						    "VREF R/L eye lane%d : %d-%d %d-%d\n",
+						    bl,
+						    y_coordinate[R][B][ch][bl],
+						    y_coordinate[R][T][ch][bl],
+						    y_coordinate[L][B][ch][bl],
+						    y_coordinate[L][T][ch][bl]);
+
+						/* average the RIGHT side TOP & BOTTOM values */
+						temp1 = (y_coordinate[R][T][ch][bl] + y_coordinate[R][B][ch][bl]) / 2;
+						/* average the LEFT side TOP & BOTTOM values */
+						temp2 = (y_coordinate[L][T][ch][bl] + y_coordinate[L][B][ch][bl]) / 2;
+						/* average the above averages */
+						y_center[ch][bl] = (uint8_t) ((temp1 + temp2) / 2);
+					}
+				}
+			}
+		}
+	}
+
+#ifdef RX_EYE_CHECK
+	/* perform an eye check */
+	for (side_y = B; side_y <= T; side_y++) {
+		for (side_x = L; side_x <= R; side_x++) {
+			mrc_post_code(0x07, (0x30 + (side_y * 2) + (side_x)));
+
+			/* update the settings for the eye check */
+			for (ch = 0; ch < NUM_CHANNELS; ch++) {
+				if (mrc_params->channel_enables & (1 << ch)) {
+					for (rk = 0; rk < NUM_RANKS; rk++) {
+						if (mrc_params->rank_enables & (1 << rk)) {
+							for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+								if (side_x == L)
+									set_rdqs(ch, rk, bl, (x_center[ch][rk][bl] - (MIN_RDQS_EYE / 2)));
+								else
+									set_rdqs(ch, rk, bl, (x_center[ch][rk][bl] + (MIN_RDQS_EYE / 2)));
+
+								if (side_y == B)
+									set_vref(ch, bl, (y_center[ch][bl] - (MIN_VREF_EYE / 2)));
+								else
+									set_vref(ch, bl, (y_center[ch][bl] + (MIN_VREF_EYE / 2)));
+							}
+						}
+					}
+				}
+			}
+
+			/* request HTE reconfiguration */
+			mrc_params->hte_setup = 1;
+
+			/* check the eye */
+			if (check_bls_ex(mrc_params, address) & 0xFF) {
+				/* one or more byte lanes failed */
+				mrc_post_code(0xEE, (0x74 + (side_x * 2) + (side_y)));
+			}
+		}
+	}
+#endif
+
+	mrc_post_code(0x07, 0x40);
+
+	/* set final placements */
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			for (rk = 0; rk < NUM_RANKS; rk++) {
+				if (mrc_params->rank_enables & (1 << rk)) {
+#ifdef R2R_SHARING
+					/* increment "num_ranks_enabled" */
+					num_ranks_enabled++;
+#endif
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						/* x_coordinate */
+#ifdef R2R_SHARING
+						final_delay[ch][bl] += x_center[ch][rk][bl];
+						set_rdqs(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled));
+#else
+						set_rdqs(ch, rk, bl, x_center[ch][rk][bl]);
+#endif
+						/* y_coordinate */
+						set_vref(ch, bl, y_center[ch][bl]);
+					}
+				}
+			}
+		}
+	}
+#endif
+
+	LEAVEFN();
+}
+
+/*
+ * This function will perform the WRITE TRAINING Algorithm on all
+ * channels/ranks/byte_lanes simultaneously to minimize execution time.
+ *
+ * The idea here is to train the WDQ timings to achieve maximum WRITE margins.
+ * The algorithm will start with WDQ at the current WDQ setting (tracks WDQS
+ * in WR_LVL) +/- 32 PIs (+/- 1/4 CLK) and collapse the eye until all data
+ * patterns pass. This is because WDQS will be aligned to WCLK by the
+ * Write Leveling algorithm and WDQ will only ever have a 1/2 CLK window
+ * of validity.
+ */
+void wr_train(struct mrc_params *mrc_params)
+{
+	uint8_t ch;	/* channel counter */
+	uint8_t rk;	/* rank counter */
+	uint8_t bl;	/* byte lane counter */
+	uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
+#ifdef BACKUP_WDQ
+#else
+	uint8_t side;		/* LEFT/RIGHT side indicator (0=L, 1=R) */
+	uint32_t temp;		/* temporary DWORD */
+	/* 2 arrays, for L & R side passing delays */
+	uint32_t delay[2][NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
+	uint32_t address;	/* target address for check_bls_ex() */
+	uint32_t result;	/* result of check_bls_ex() */
+	uint32_t bl_mask;	/* byte lane mask for result checking */
+#ifdef R2R_SHARING
+	/* used to find placement for rank2rank sharing configs */
+	uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
+	/* used to find placement for rank2rank sharing configs */
+	uint32_t num_ranks_enabled = 0;
+#endif
+#endif
+
+	/* wr_train starts */
+	mrc_post_code(0x08, 0x00);
+
+	ENTERFN();
+
+#ifdef BACKUP_WDQ
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			for (rk = 0; rk < NUM_RANKS; rk++) {
+				if (mrc_params->rank_enables & (1 << rk)) {
+					for (bl = 0;
+					     bl < (NUM_BYTE_LANES / bl_divisor);
+					     bl++) {
+						set_wdq(ch, rk, bl, ddr_wdq[PLATFORM_ID]);
+					}
+				}
+			}
+		}
+	}
+#else
+	/* initialise "delay" */
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			for (rk = 0; rk < NUM_RANKS; rk++) {
+				if (mrc_params->rank_enables & (1 << rk)) {
+					for (bl = 0;
+					     bl < (NUM_BYTE_LANES / bl_divisor);
+					     bl++) {
+						/*
+						 * want to start with
+						 * WDQ = (WDQS - QRTR_CLK)
+						 * +/- QRTR_CLK
+						 */
+						temp = get_wdqs(ch, rk, bl) - QRTR_CLK;
+						delay[L][ch][rk][bl] = temp - QRTR_CLK;
+						delay[R][ch][rk][bl] = temp + QRTR_CLK;
+					}
+				}
+			}
+		}
+	}
+
+	/* initialise other variables */
+	bl_mask = byte_lane_mask(mrc_params);
+	address = get_addr(0, 0);
+
+#ifdef R2R_SHARING
+	/* need to set "final_delay[][]" elements to "0" */
+	memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
+#endif
+
+	/*
+	 * start algorithm on the LEFT side and train each channel/bl
+	 * until no failures are observed, then repeat for the RIGHT side.
+	 */
+	for (side = L; side <= R; side++) {
+		mrc_post_code(0x08, (0x10 + (side)));
+
+		/* set starting values */
+		for (ch = 0; ch < NUM_CHANNELS; ch++) {
+			if (mrc_params->channel_enables & (1 << ch)) {
+				for (rk = 0; rk < NUM_RANKS; rk++) {
+					if (mrc_params->rank_enables &
+						(1 << rk)) {
+						for (bl = 0;
+						     bl < (NUM_BYTE_LANES / bl_divisor);
+						     bl++) {
+							set_wdq(ch, rk, bl, delay[side][ch][rk][bl]);
+						}
+					}
+				}
+			}
+		}
+
+		/* find passing values */
+		for (ch = 0; ch < NUM_CHANNELS; ch++) {
+			if (mrc_params->channel_enables & (1 << ch)) {
+				for (rk = 0; rk < NUM_RANKS; rk++) {
+					if (mrc_params->rank_enables &
+						(1 << rk)) {
+						/* get an address in the target channel/rank */
+						address = get_addr(ch, rk);
+
+						/* request HTE reconfiguration */
+						mrc_params->hte_setup = 1;
+
+						/* check the settings */
+						do {
+							/* result[07:00] == failing byte lane (MAX 8) */
+							result = check_bls_ex(mrc_params, address);
+							/* check for failures */
+							if (result & 0xFF) {
+								/* at least 1 byte lane failed */
+								for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+									if (result &
+										(bl_mask << bl)) {
+										if (side == L)
+											delay[L][ch][rk][bl] += WDQ_STEP;
+										else
+											delay[R][ch][rk][bl] -= WDQ_STEP;
+
+										/* check for algorithm failure */
+										if (delay[L][ch][rk][bl] != delay[R][ch][rk][bl]) {
+											/*
+											 * margin available
+											 * update delay setting
+											 */
+											set_wdq(ch, rk, bl,
+												delay[side][ch][rk][bl]);
+										} else {
+											/*
+											 * no margin available
+											 * notify the user and halt
+											 */
+											training_message(ch, rk, bl);
+											mrc_post_code(0xEE, (0x80 + side));
+										}
+									}
+								}
+							}
+						/* stop when all byte lanes pass */
+						} while (result & 0xFF);
+					}
+				}
+			}
+		}
+	}
+
+	/* program WDQ to the middle of passing window */
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		if (mrc_params->channel_enables & (1 << ch)) {
+			for (rk = 0; rk < NUM_RANKS; rk++) {
+				if (mrc_params->rank_enables & (1 << rk)) {
+#ifdef R2R_SHARING
+					/* increment "num_ranks_enabled" */
+					num_ranks_enabled++;
+#endif
+					for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
+						DPF(D_INFO,
+						    "WDQ eye rank%d lane%d : %d-%d\n",
+						    rk, bl,
+						    delay[L][ch][rk][bl],
+						    delay[R][ch][rk][bl]);
+
+						temp = (delay[R][ch][rk][bl] + delay[L][ch][rk][bl]) / 2;
+
+#ifdef R2R_SHARING
+						final_delay[ch][bl] += temp;
+						set_wdq(ch, rk, bl,
+							((final_delay[ch][bl]) / num_ranks_enabled));
+#else
+						set_wdq(ch, rk, bl, temp);
+#endif
+					}
+				}
+			}
+		}
+	}
+#endif
+
+	LEAVEFN();
+}
+
+/*
+ * This function will store relevant timing data
+ *
+ * This data will be used on subsequent boots to speed up boot times
+ * and is required for Suspend To RAM capabilities.
+ */
+void store_timings(struct mrc_params *mrc_params)
+{
+	uint8_t ch, rk, bl;
+	struct mrc_timings *mt = &mrc_params->timings;
+
+	for (ch = 0; ch < NUM_CHANNELS; ch++) {
+		for (rk = 0; rk < NUM_RANKS; rk++) {
+			for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
+				mt->rcvn[ch][rk][bl] = get_rcvn(ch, rk, bl);
+				mt->rdqs[ch][rk][bl] = get_rdqs(ch, rk, bl);
+				mt->wdqs[ch][rk][bl] = get_wdqs(ch, rk, bl);
+				mt->wdq[ch][rk][bl] = get_wdq(ch, rk, bl);
+
+				if (rk == 0)
+					mt->vref[ch][bl] = get_vref(ch, bl);
+			}
+
+			mt->wctl[ch][rk] = get_wctl(ch, rk);
+		}
+
+		mt->wcmd[ch] = get_wcmd(ch);
+	}
+
+	/* save DDR speed in case the frequency changes after a warm reset */
+	mt->ddr_speed = mrc_params->ddr_speed;
+}
+
+/*
+ * The purpose of this function is to ensure the SEC comes out of reset
+ * and IA initiates the SEC enabling Memory Scrambling.
+ */
+void enable_scrambling(struct mrc_params *mrc_params)
+{
+	uint32_t lfsr = 0;
+	uint8_t i;
+
+	if (mrc_params->scrambling_enables == 0)
+		return;
+
+	ENTERFN();
+
+	/* 32 bit seed is always stored in BIOS NVM */
+	lfsr = mrc_params->timings.scrambler_seed;
+
+	if (mrc_params->boot_mode == BM_COLD) {
+		/*
+		 * factory value is 0 and in first boot,
+		 * a clock based seed is loaded.
+		 */
+		if (lfsr == 0) {
+			/*
+			 * get seed from system clock
+			 * and make sure it is not all 1's
+			 */
+			lfsr = rdtsc() & 0x0FFFFFFF;
+		} else {
+			/*
+			 * Need to replace scrambler
+			 *
+			 * get next 32bit LFSR 16 times which is the last
+			 * part of the previous scrambler vector
+			 */
+			for (i = 0; i < 16; i++)
+				lfsr32(&lfsr);
+		}
+
+		/* save new seed */
+		mrc_params->timings.scrambler_seed = lfsr;
+	}
+
+	/*
+	 * In warm boot or S3 exit, we have the previous seed.
+	 * In cold boot, we have the last 32bit LFSR which is the new seed.
+	 */
+	lfsr32(&lfsr);	/* shift to next value */
+	msg_port_write(MEM_CTLR, SCRMSEED, (lfsr & 0x0003FFFF));
+
+	for (i = 0; i < 2; i++)
+		msg_port_write(MEM_CTLR, SCRMLO + i, (lfsr & 0xAAAAAAAA));
+
+	LEAVEFN();
+}
+
+/*
+ * Configure MCU Power Management Control Register
+ * and Scheduler Control Register
+ */
+void prog_ddr_control(struct mrc_params *mrc_params)
+{
+	u32 dsch;
+	u32 dpmc0;
+
+	ENTERFN();
+
+	dsch = msg_port_read(MEM_CTLR, DSCH);
+	dsch &= ~(BIT8 | BIT9 | BIT12);
+	msg_port_write(MEM_CTLR, DSCH, dsch);
+
+	dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
+	dpmc0 &= ~BIT25;
+	dpmc0 |= (mrc_params->power_down_disable << 25);
+	dpmc0 &= ~BIT24;
+	dpmc0 &= ~(BIT16 | BIT17 | BIT18);
+	dpmc0 |= (4 << 16);
+	dpmc0 |= BIT21;
+	msg_port_write(MEM_CTLR, DPMC0, dpmc0);
+
+	/* CMDTRIST = 2h - CMD/ADDR are tristated when no valid command */
+	mrc_write_mask(MEM_CTLR, DPMC1, 2 << 4, BIT4 | BIT5);
+
+	LEAVEFN();
+}
+
+/*
+ * After training completes, configure the MCU Rank Population Register,
+ * specifying: ranks enabled, device width, density, address mode
+ */
+void prog_dra_drb(struct mrc_params *mrc_params)
+{
+	u32 drp;
+	u32 dco;
+	u8 density = mrc_params->params.density;
+
+	ENTERFN();
+
+	dco = msg_port_read(MEM_CTLR, DCO);
+	dco &= ~BIT31;
+	msg_port_write(MEM_CTLR, DCO, dco);
+
+	drp = 0;
+	if (mrc_params->rank_enables & 1)
+		drp |= BIT0;
+	if (mrc_params->rank_enables & 2)
+		drp |= BIT1;
+	if (mrc_params->dram_width == X16) {
+		drp |= (1 << 4);
+		drp |= (1 << 9);
+	}
+
+	/*
+	 * Density encoding in struct dram_params: 0=512Mb, 1=1Gb, 2=2Gb, 3=4Gb
+	 * has to be mapped to the RANKDENSx encoding (0=1Gb)
+	 */
+	if (density == 0)
+		density = 4;
+
+	drp |= ((density - 1) << 6);
+	drp |= ((density - 1) << 11);
+
+	/* Address mode can be overwritten if ECC enabled */
+	drp |= (mrc_params->address_mode << 14);
+
+	msg_port_write(MEM_CTLR, DRP, drp);
+
+	dco &= ~BIT28;
+	dco |= BIT31;
+	msg_port_write(MEM_CTLR, DCO, dco);
+
+	LEAVEFN();
+}
+
+/* Send DRAM wake command */
+void perform_wake(struct mrc_params *mrc_params)
+{
+	ENTERFN();
+
+	dram_wake_command();
+
+	LEAVEFN();
+}
+
+/*
+ * Configure refresh rate and short ZQ calibration interval
+ * Activate dynamic self refresh
+ */
+void change_refresh_period(struct mrc_params *mrc_params)
+{
+	u32 drfc;
+	u32 dcal;
+	u32 dpmc0;
+
+	ENTERFN();
+
+	drfc = msg_port_read(MEM_CTLR, DRFC);
+	drfc &= ~(BIT12 | BIT13 | BIT14);
+	drfc |= (mrc_params->refresh_rate << 12);
+	drfc |= BIT21;
+	msg_port_write(MEM_CTLR, DRFC, drfc);
+
+	dcal = msg_port_read(MEM_CTLR, DCAL);
+	dcal &= ~(BIT8 | BIT9 | BIT10);
+	dcal |= (3 << 8);	/* 63ms */
+	msg_port_write(MEM_CTLR, DCAL, dcal);
+
+	dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
+	dpmc0 |= (BIT23 | BIT29);
+	msg_port_write(MEM_CTLR, DPMC0, dpmc0);
+
+	LEAVEFN();
+}
+
+/*
+ * Configure DDRPHY for Auto-Refresh, Periodic Compensations,
+ * Dynamic Diff-Amp, ZQSPERIOD, Auto-Precharge, CKE Power-Down
+ */
+void set_auto_refresh(struct mrc_params *mrc_params)
+{
+	uint32_t channel;
+	uint32_t rank;
+	uint32_t bl;
+	uint32_t bl_divisor = 1;
+	uint32_t temp;
+
+	ENTERFN();
+
+	/*
+	 * Enable Auto-Refresh, Periodic Compensations, Dynamic Diff-Amp,
+	 * ZQSPERIOD, Auto-Precharge, CKE Power-Down
+	 */
+	for (channel = 0; channel < NUM_CHANNELS; channel++) {
+		if (mrc_params->channel_enables & (1 << channel)) {
+			/* Enable Periodic RCOMPS */
+			mrc_alt_write_mask(DDRPHY, CMPCTRL, BIT1, BIT1);
+
+			/* Enable Dynamic DiffAmp & Set Read ODT Value */
+			switch (mrc_params->rd_odt_value) {
+			case 0:
+				temp = 0x3F;	/* OFF */
+				break;
+			default:
+				temp = 0x00;	/* Auto */
+				break;
+			}
+
+			for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) {
+				/* Override: DIFFAMP, ODT */
+				mrc_alt_write_mask(DDRPHY,
+					(B0OVRCTL + (bl * DDRIODQ_BL_OFFSET) +
+					(channel * DDRIODQ_CH_OFFSET)),
+					(0x00 << 16) | (temp << 10),
+					(BIT21 | BIT20 | BIT19 | BIT18 |
+					 BIT17 | BIT16 | BIT15 | BIT14 |
+					 BIT13 | BIT12 | BIT11 | BIT10));
+
+				/* Override: DIFFAMP, ODT */
+				mrc_alt_write_mask(DDRPHY,
+					(B1OVRCTL + (bl * DDRIODQ_BL_OFFSET) +
+					(channel * DDRIODQ_CH_OFFSET)),
+					(0x00 << 16) | (temp << 10),
+					(BIT21 | BIT20 | BIT19 | BIT18 |
+					 BIT17 | BIT16 | BIT15 | BIT14 |
+					 BIT13 | BIT12 | BIT11 | BIT10));
+			}
+
+			/* Issue ZQCS command */
+			for (rank = 0; rank < NUM_RANKS; rank++) {
+				if (mrc_params->rank_enables & (1 << rank))
+					dram_init_command(DCMD_ZQCS(rank));
+			}
+		}
+	}
+
+	clear_pointers();
+
+	LEAVEFN();
+}
+
+/*
+ * Enable ECC support, depending on configuration
+ *
+ * Available memory size is decreased, and memory is filled with 0s
+ * in order to clear error status. Address mode 2 is forced.
+ */
+void ecc_enable(struct mrc_params *mrc_params)
+{
+	u32 drp;
+	u32 dsch;
+	u32 ecc_ctrl;
+
+	if (mrc_params->ecc_enables == 0)
+		return;
+
+	ENTERFN();
+
+	/* Configuration required in ECC mode */
+	drp = msg_port_read(MEM_CTLR, DRP);
+	drp &= ~(BIT14 | BIT15);
+	drp |= BIT15;
+	drp |= BIT13;
+	msg_port_write(MEM_CTLR, DRP, drp);
+
+	/* Disable new request bypass */
+	dsch = msg_port_read(MEM_CTLR, DSCH);
+	dsch |= BIT12;
+	msg_port_write(MEM_CTLR, DSCH, dsch);
+
+	/* Enable ECC */
+	ecc_ctrl = (BIT0 | BIT1 | BIT17);
+	msg_port_write(MEM_CTLR, DECCCTRL, ecc_ctrl);
+
+	/* Assume 8 bank memory, one bank is gone for ECC */
+	mrc_params->mem_size -= mrc_params->mem_size / 8;
+
+	/* For S3 resume memory content has to be preserved */
+	if (mrc_params->boot_mode != BM_S3) {
+		select_hte();
+		hte_mem_init(mrc_params, MRC_MEM_INIT);
+		select_mem_mgr();
+	}
+
+	LEAVEFN();
+}
+
+/*
+ * Execute memory test.
+ * If an error is detected, it is indicated in mrc_params->status.
+ */
+void memory_test(struct mrc_params *mrc_params)
+{
+	uint32_t result = 0;
+
+	ENTERFN();
+
+	select_hte();
+	result = hte_mem_init(mrc_params, MRC_MEM_TEST);
+	select_mem_mgr();
+
+	DPF(D_INFO, "Memory test result %x\n", result);
+	mrc_params->status = ((result == 0) ? MRC_SUCCESS : MRC_E_MEMTEST);
+	LEAVEFN();
+}
+
+/* Lock MCU registers at the end of initialization sequence */
+void lock_registers(struct mrc_params *mrc_params)
+{
+	u32 dco;
+
+	ENTERFN();
+
+	dco = msg_port_read(MEM_CTLR, DCO);
+	dco &= ~(BIT28 | BIT29);
+	dco |= (BIT0 | BIT8);
+	msg_port_write(MEM_CTLR, DCO, dco);
+
+	LEAVEFN();
+}
diff --git a/arch/x86/cpu/quark/smc.h b/arch/x86/cpu/quark/smc.h
new file mode 100644
index 0000000..f774cb3
--- /dev/null
+++ b/arch/x86/cpu/quark/smc.h
@@ -0,0 +1,446 @@
+/*
+ * Copyright (C) 2013, Intel Corporation
+ * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
+ *
+ * Ported from Intel released Quark UEFI BIOS
+ * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
+ *
+ * SPDX-License-Identifier:	Intel
+ */
+
+#ifndef _SMC_H_
+#define _SMC_H_
+
+/* System Memory Controller Register Defines */
+
+/* Memory Controller Message Bus Registers Offsets */
+#define DRP			0x00
+#define DTR0			0x01
+#define DTR1			0x02
+#define DTR2			0x03
+#define DTR3			0x04
+#define DTR4			0x05
+#define DPMC0			0x06
+#define DPMC1			0x07
+#define DRFC			0x08
+#define DSCH			0x09
+#define DCAL			0x0A
+#define DRMC			0x0B
+#define PMSTS			0x0C
+#define DCO			0x0F
+#define DSTAT			0x20
+#define SSKPD0			0x4A
+#define SSKPD1			0x4B
+#define DECCCTRL		0x60
+#define DECCSTAT		0x61
+#define DECCSBECNT		0x62
+#define DECCSBECA		0x68
+#define DECCSBECS		0x69
+#define DECCDBECA		0x6A
+#define DECCDBECS		0x6B
+#define DFUSESTAT		0x70
+#define SCRMSEED		0x80
+#define SCRMLO			0x81
+#define SCRMHI			0x82
+
+/* DRAM init command */
+#define DCMD_MRS1(rnk, dat)	(0 | ((rnk) << 22) | (1 << 3) | ((dat) << 6))
+#define DCMD_REF(rnk)		(1 | ((rnk) << 22))
+#define DCMD_PRE(rnk)		(2 | ((rnk) << 22))
+#define DCMD_PREA(rnk)		(2 | ((rnk) << 22) | (BIT10 << 6))
+#define DCMD_ACT(rnk, row)	(3 | ((rnk) << 22) | ((row) << 6))
+#define DCMD_WR(rnk, col)	(4 | ((rnk) << 22) | ((col) << 6))
+#define DCMD_RD(rnk, col)	(5 | ((rnk) << 22) | ((col) << 6))
+#define DCMD_ZQCS(rnk)		(6 | ((rnk) << 22))
+#define DCMD_ZQCL(rnk)		(6 | ((rnk) << 22) | (BIT10 << 6))
+#define DCMD_NOP(rnk)		(7 | ((rnk) << 22))
+
+#define DDR3_EMRS1_DIC_40	(0)
+#define DDR3_EMRS1_DIC_34	(1)
+
+#define DDR3_EMRS1_RTTNOM_0	(0)
+#define DDR3_EMRS1_RTTNOM_60	(BIT2)
+#define DDR3_EMRS1_RTTNOM_120	(BIT6)
+#define DDR3_EMRS1_RTTNOM_40	(BIT6 | BIT2)
+#define DDR3_EMRS1_RTTNOM_20	(BIT9)
+#define DDR3_EMRS1_RTTNOM_30	(BIT9 | BIT2)
+
+#define DDR3_EMRS2_RTTWR_60	(BIT9)
+#define DDR3_EMRS2_RTTWR_120	(BIT10)
+
+/* BEGIN DDRIO Registers */
+
+/* DDR IOs & COMPs */
+#define DDRIODQ_BL_OFFSET	0x0800
+#define DDRIODQ_CH_OFFSET	((NUM_BYTE_LANES / 2) * DDRIODQ_BL_OFFSET)
+#define DDRIOCCC_CH_OFFSET	0x0800
+#define DDRCOMP_CH_OFFSET	0x0100
+
+/* CH0-BL01-DQ */
+#define DQOBSCKEBBCTL		0x0000
+#define DQDLLTXCTL		0x0004
+#define DQDLLRXCTL		0x0008
+#define DQMDLLCTL		0x000C
+#define B0RXIOBUFCTL		0x0010
+#define B0VREFCTL		0x0014
+#define B0RXOFFSET1		0x0018
+#define B0RXOFFSET0		0x001C
+#define B1RXIOBUFCTL		0x0020
+#define B1VREFCTL		0x0024
+#define B1RXOFFSET1		0x0028
+#define B1RXOFFSET0		0x002C
+#define DQDFTCTL		0x0030
+#define DQTRAINSTS		0x0034
+#define B1DLLPICODER0		0x0038
+#define B0DLLPICODER0		0x003C
+#define B1DLLPICODER1		0x0040
+#define B0DLLPICODER1		0x0044
+#define B1DLLPICODER2		0x0048
+#define B0DLLPICODER2		0x004C
+#define B1DLLPICODER3		0x0050
+#define B0DLLPICODER3		0x0054
+#define B1RXDQSPICODE		0x0058
+#define B0RXDQSPICODE		0x005C
+#define B1RXDQPICODER32		0x0060
+#define B1RXDQPICODER10		0x0064
+#define B0RXDQPICODER32		0x0068
+#define B0RXDQPICODER10		0x006C
+#define B01PTRCTL0		0x0070
+#define B01PTRCTL1		0x0074
+#define B01DBCTL0		0x0078
+#define B01DBCTL1		0x007C
+#define B0LATCTL0		0x0080
+#define B1LATCTL0		0x0084
+#define B01LATCTL1		0x0088
+#define B0ONDURCTL		0x008C
+#define B1ONDURCTL		0x0090
+#define B0OVRCTL		0x0094
+#define B1OVRCTL		0x0098
+#define DQCTL			0x009C
+#define B0RK2RKCHGPTRCTRL	0x00A0
+#define B1RK2RKCHGPTRCTRL	0x00A4
+#define DQRK2RKCTL		0x00A8
+#define DQRK2RKPTRCTL		0x00AC
+#define B0RK2RKLAT		0x00B0
+#define B1RK2RKLAT		0x00B4
+#define DQCLKALIGNREG0		0x00B8
+#define DQCLKALIGNREG1		0x00BC
+#define DQCLKALIGNREG2		0x00C0
+#define DQCLKALIGNSTS0		0x00C4
+#define DQCLKALIGNSTS1		0x00C8
+#define DQCLKGATE		0x00CC
+#define B0COMPSLV1		0x00D0
+#define B1COMPSLV1		0x00D4
+#define B0COMPSLV2		0x00D8
+#define B1COMPSLV2		0x00DC
+#define B0COMPSLV3		0x00E0
+#define B1COMPSLV3		0x00E4
+#define DQVISALANECR0TOP	0x00E8
+#define DQVISALANECR1TOP	0x00EC
+#define DQVISACONTROLCRTOP	0x00F0
+#define DQVISALANECR0BL		0x00F4
+#define DQVISALANECR1BL		0x00F8
+#define DQVISACONTROLCRBL	0x00FC
+#define DQTIMINGCTRL		0x010C
+
+/* CH0-ECC */
+#define ECCDLLTXCTL		0x2004
+#define ECCDLLRXCTL		0x2008
+#define ECCMDLLCTL		0x200C
+#define ECCB1DLLPICODER0	0x2038
+#define ECCB1DLLPICODER1	0x2040
+#define ECCB1DLLPICODER2	0x2048
+#define ECCB1DLLPICODER3	0x2050
+#define ECCB01DBCTL0		0x2078
+#define ECCB01DBCTL1		0x207C
+#define ECCCLKALIGNREG0		0x20B8
+#define ECCCLKALIGNREG1		0x20BC
+#define ECCCLKALIGNREG2		0x20C0
+
+/* CH0-CMD */
+#define CMDOBSCKEBBCTL		0x4800
+#define CMDDLLTXCTL		0x4808
+#define CMDDLLRXCTL		0x480C
+#define CMDMDLLCTL		0x4810
+#define CMDRCOMPODT		0x4814
+#define CMDDLLPICODER0		0x4820
+#define CMDDLLPICODER1		0x4824
+#define CMDCFGREG0		0x4840
+#define CMDPTRREG		0x4844
+#define CMDCLKALIGNREG0		0x4850
+#define CMDCLKALIGNREG1		0x4854
+#define CMDCLKALIGNREG2		0x4858
+#define CMDPMCONFIG0		0x485C
+#define CMDPMDLYREG0		0x4860
+#define CMDPMDLYREG1		0x4864
+#define CMDPMDLYREG2		0x4868
+#define CMDPMDLYREG3		0x486C
+#define CMDPMDLYREG4		0x4870
+#define CMDCLKALIGNSTS0		0x4874
+#define CMDCLKALIGNSTS1		0x4878
+#define CMDPMSTS0		0x487C
+#define CMDPMSTS1		0x4880
+#define CMDCOMPSLV		0x4884
+#define CMDBONUS0		0x488C
+#define CMDBONUS1		0x4890
+#define CMDVISALANECR0		0x4894
+#define CMDVISALANECR1		0x4898
+#define CMDVISACONTROLCR	0x489C
+#define CMDCLKGATE		0x48A0
+#define CMDTIMINGCTRL		0x48A4
+
+/* CH0-CLK-CTL */
+#define CCOBSCKEBBCTL		0x5800
+#define CCRCOMPIO		0x5804
+#define CCDLLTXCTL		0x5808
+#define CCDLLRXCTL		0x580C
+#define CCMDLLCTL		0x5810
+#define CCRCOMPODT		0x5814
+#define CCDLLPICODER0		0x5820
+#define CCDLLPICODER1		0x5824
+#define CCDDR3RESETCTL		0x5830
+#define CCCFGREG0		0x5838
+#define CCCFGREG1		0x5840
+#define CCPTRREG		0x5844
+#define CCCLKALIGNREG0		0x5850
+#define CCCLKALIGNREG1		0x5854
+#define CCCLKALIGNREG2		0x5858
+#define CCPMCONFIG0		0x585C
+#define CCPMDLYREG0		0x5860
+#define CCPMDLYREG1		0x5864
+#define CCPMDLYREG2		0x5868
+#define CCPMDLYREG3		0x586C
+#define CCPMDLYREG4		0x5870
+#define CCCLKALIGNSTS0		0x5874
+#define CCCLKALIGNSTS1		0x5878
+#define CCPMSTS0		0x587C
+#define CCPMSTS1		0x5880
+#define CCCOMPSLV1		0x5884
+#define CCCOMPSLV2		0x5888
+#define CCCOMPSLV3		0x588C
+#define CCBONUS0		0x5894
+#define CCBONUS1		0x5898
+#define CCVISALANECR0		0x589C
+#define CCVISALANECR1		0x58A0
+#define CCVISACONTROLCR		0x58A4
+#define CCCLKGATE		0x58A8
+#define CCTIMINGCTL		0x58AC
+
+/* COMP */
+#define CMPCTRL			0x6800
+#define SOFTRSTCNTL		0x6804
+#define MSCNTR			0x6808
+#define NMSCNTRL		0x680C
+#define LATCH1CTL		0x6814
+#define COMPVISALANECR0		0x681C
+#define COMPVISALANECR1		0x6820
+#define COMPVISACONTROLCR	0x6824
+#define COMPBONUS0		0x6830
+#define TCOCNTCTRL		0x683C
+#define DQANAODTPUCTL		0x6840
+#define DQANAODTPDCTL		0x6844
+#define DQANADRVPUCTL		0x6848
+#define DQANADRVPDCTL		0x684C
+#define DQANADLYPUCTL		0x6850
+#define DQANADLYPDCTL		0x6854
+#define DQANATCOPUCTL		0x6858
+#define DQANATCOPDCTL		0x685C
+#define CMDANADRVPUCTL		0x6868
+#define CMDANADRVPDCTL		0x686C
+#define CMDANADLYPUCTL		0x6870
+#define CMDANADLYPDCTL		0x6874
+#define CLKANAODTPUCTL		0x6880
+#define CLKANAODTPDCTL		0x6884
+#define CLKANADRVPUCTL		0x6888
+#define CLKANADRVPDCTL		0x688C
+#define CLKANADLYPUCTL		0x6890
+#define CLKANADLYPDCTL		0x6894
+#define CLKANATCOPUCTL		0x6898
+#define CLKANATCOPDCTL		0x689C
+#define DQSANAODTPUCTL		0x68A0
+#define DQSANAODTPDCTL		0x68A4
+#define DQSANADRVPUCTL		0x68A8
+#define DQSANADRVPDCTL		0x68AC
+#define DQSANADLYPUCTL		0x68B0
+#define DQSANADLYPDCTL		0x68B4
+#define DQSANATCOPUCTL		0x68B8
+#define DQSANATCOPDCTL		0x68BC
+#define CTLANADRVPUCTL		0x68C8
+#define CTLANADRVPDCTL		0x68CC
+#define CTLANADLYPUCTL		0x68D0
+#define CTLANADLYPDCTL		0x68D4
+#define CHNLBUFSTATIC		0x68F0
+#define COMPOBSCNTRL		0x68F4
+#define COMPBUFFDBG0		0x68F8
+#define COMPBUFFDBG1		0x68FC
+#define CFGMISCCH0		0x6900
+#define COMPEN0CH0		0x6904
+#define COMPEN1CH0		0x6908
+#define COMPEN2CH0		0x690C
+#define STATLEGEN0CH0		0x6910
+#define STATLEGEN1CH0		0x6914
+#define DQVREFCH0		0x6918
+#define CMDVREFCH0		0x691C
+#define CLKVREFCH0		0x6920
+#define DQSVREFCH0		0x6924
+#define CTLVREFCH0		0x6928
+#define TCOVREFCH0		0x692C
+#define DLYSELCH0		0x6930
+#define TCODRAMBUFODTCH0	0x6934
+#define CCBUFODTCH0		0x6938
+#define RXOFFSETCH0		0x693C
+#define DQODTPUCTLCH0		0x6940
+#define DQODTPDCTLCH0		0x6944
+#define DQDRVPUCTLCH0		0x6948
+#define DQDRVPDCTLCH0		0x694C
+#define DQDLYPUCTLCH0		0x6950
+#define DQDLYPDCTLCH0		0x6954
+#define DQTCOPUCTLCH0		0x6958
+#define DQTCOPDCTLCH0		0x695C
+#define CMDDRVPUCTLCH0		0x6968
+#define CMDDRVPDCTLCH0		0x696C
+#define CMDDLYPUCTLCH0		0x6970
+#define CMDDLYPDCTLCH0		0x6974
+#define CLKODTPUCTLCH0		0x6980
+#define CLKODTPDCTLCH0		0x6984
+#define CLKDRVPUCTLCH0		0x6988
+#define CLKDRVPDCTLCH0		0x698C
+#define CLKDLYPUCTLCH0		0x6990
+#define CLKDLYPDCTLCH0		0x6994
+#define CLKTCOPUCTLCH0		0x6998
+#define CLKTCOPDCTLCH0		0x699C
+#define DQSODTPUCTLCH0		0x69A0
+#define DQSODTPDCTLCH0		0x69A4
+#define DQSDRVPUCTLCH0		0x69A8
+#define DQSDRVPDCTLCH0		0x69AC
+#define DQSDLYPUCTLCH0		0x69B0
+#define DQSDLYPDCTLCH0		0x69B4
+#define DQSTCOPUCTLCH0		0x69B8
+#define DQSTCOPDCTLCH0		0x69BC
+#define CTLDRVPUCTLCH0		0x69C8
+#define CTLDRVPDCTLCH0		0x69CC
+#define CTLDLYPUCTLCH0		0x69D0
+#define CTLDLYPDCTLCH0		0x69D4
+#define FNLUPDTCTLCH0		0x69F0
+
+/* PLL */
+#define MPLLCTRL0		0x7800
+#define MPLLCTRL1		0x7808
+#define MPLLCSR0		0x7810
+#define MPLLCSR1		0x7814
+#define MPLLCSR2		0x7820
+#define MPLLDFT			0x7828
+#define MPLLMON0CTL		0x7830
+#define MPLLMON1CTL		0x7838
+#define MPLLMON2CTL		0x783C
+#define SFRTRIM			0x7850
+#define MPLLDFTOUT0		0x7858
+#define MPLLDFTOUT1		0x785C
+#define MASTERRSTN		0x7880
+#define PLLLOCKDEL		0x7884
+#define SFRDEL			0x7888
+#define CRUVISALANECR0		0x78F0
+#define CRUVISALANECR1		0x78F4
+#define CRUVISACONTROLCR	0x78F8
+#define IOSFVISALANECR0		0x78FC
+#define IOSFVISALANECR1		0x7900
+#define IOSFVISACONTROLCR	0x7904
+
+/* END DDRIO Registers */
+
+/* DRAM Specific Message Bus OpCodes */
+#define MSG_OP_DRAM_INIT	0x68
+#define MSG_OP_DRAM_WAKE	0xCA
+
+#define SAMPLE_SIZE		6
+
+/* must be less than this number to enable early deadband */
+#define EARLY_DB		0x12
+/* must be greater than this number to enable late deadband */
+#define LATE_DB			0x34
+
+#define CHX_REGS		(11 * 4)
+#define FULL_CLK		128
+#define HALF_CLK		64
+#define QRTR_CLK		32
+
+#define MCEIL(num, den)		((uint8_t)(((num) + (den) - 1) / (den)))
+#define MMAX(a, b)		((a) > (b) ? (a) : (b))
+#define DEAD_LOOP()		for (;;);
+
+#define MIN_RDQS_EYE		10	/* in PI Codes */
+#define MIN_VREF_EYE		10	/* in VREF Codes */
+/* how many RDQS codes to jump while margining */
+#define RDQS_STEP		1
+/* how many VREF codes to jump while margining */
+#define VREF_STEP		1
+/* offset into "vref_codes[]" for minimum allowed VREF setting */
+#define VREF_MIN		0x00
+/* offset into "vref_codes[]" for maximum allowed VREF setting */
+#define VREF_MAX		0x3F
+#define RDQS_MIN		0x00	/* minimum RDQS delay value */
+#define RDQS_MAX		0x3F	/* maximum RDQS delay value */
+
+/* how many WDQ codes to jump while margining */
+#define WDQ_STEP		1
+
+enum {
+	B,	/* BOTTOM VREF */
+	T	/* TOP VREF */
+};
+
+enum {
+	L,	/* LEFT RDQS */
+	R	/* RIGHT RDQS */
+};
+
+/* Memory Options */
+
+/* enable STATIC timing settings for RCVN (BACKUP_MODE) */
+#undef BACKUP_RCVN
+/* enable STATIC timing settings for WDQS (BACKUP_MODE) */
+#undef BACKUP_WDQS
+/* enable STATIC timing settings for RDQS (BACKUP_MODE) */
+#undef BACKUP_RDQS
+/* enable STATIC timing settings for WDQ (BACKUP_MODE) */
+#undef BACKUP_WDQ
+/* enable *COMP overrides (BACKUP_MODE) */
+#undef BACKUP_COMPS
+/* enable the RD_TRAIN eye check */
+#undef RX_EYE_CHECK
+
+/* enable Host to Memory Clock Alignment */
+#define HMC_TEST
+/* enable multi-rank support via rank2rank sharing */
+#define R2R_SHARING
+/* disable signals not used in 16bit mode of DDRIO */
+#define FORCE_16BIT_DDRIO
+
+#define PLATFORM_ID		1
+
+void clear_self_refresh(struct mrc_params *mrc_params);
+void prog_ddr_timing_control(struct mrc_params *mrc_params);
+void prog_decode_before_jedec(struct mrc_params *mrc_params);
+void perform_ddr_reset(struct mrc_params *mrc_params);
+void ddrphy_init(struct mrc_params *mrc_params);
+void perform_jedec_init(struct mrc_params *mrc_params);
+void set_ddr_init_complete(struct mrc_params *mrc_params);
+void restore_timings(struct mrc_params *mrc_params);
+void default_timings(struct mrc_params *mrc_params);
+void rcvn_cal(struct mrc_params *mrc_params);
+void wr_level(struct mrc_params *mrc_params);
+void prog_page_ctrl(struct mrc_params *mrc_params);
+void rd_train(struct mrc_params *mrc_params);
+void wr_train(struct mrc_params *mrc_params);
+void store_timings(struct mrc_params *mrc_params);
+void enable_scrambling(struct mrc_params *mrc_params);
+void prog_ddr_control(struct mrc_params *mrc_params);
+void prog_dra_drb(struct mrc_params *mrc_params);
+void perform_wake(struct mrc_params *mrc_params);
+void change_refresh_period(struct mrc_params *mrc_params);
+void set_auto_refresh(struct mrc_params *mrc_params);
+void ecc_enable(struct mrc_params *mrc_params);
+void memory_test(struct mrc_params *mrc_params);
+void lock_registers(struct mrc_params *mrc_params);
+
+#endif /* _SMC_H_ */
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build
  2015-02-03 11:45 [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support Bin Meng
                   ` (4 preceding siblings ...)
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 5/9] x86: quark: Add System Memory Controller support Bin Meng
@ 2015-02-03 11:45 ` Bin Meng
  2015-02-04 16:25   ` Simon Glass
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 7/9] fdtdec: Add compatible id and string for Intel Quark MRC Bin Meng
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-03 11:45 UTC (permalink / raw)
  To: u-boot

Turn on the Memory Reference Code (MRC) build in the Quark Makefile.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
---

 arch/x86/cpu/quark/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/cpu/quark/Makefile b/arch/x86/cpu/quark/Makefile
index 168c1e6..e87b424 100644
--- a/arch/x86/cpu/quark/Makefile
+++ b/arch/x86/cpu/quark/Makefile
@@ -5,4 +5,5 @@
 #
 
 obj-y += car.o dram.o msg_port.o quark.o
+obj-y += mrc.o mrc_util.o hte.o smc.o
 obj-$(CONFIG_PCI) += pci.o
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 7/9] fdtdec: Add compatible id and string for Intel Quark MRC
  2015-02-03 11:45 [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support Bin Meng
                   ` (5 preceding siblings ...)
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build Bin Meng
@ 2015-02-03 11:45 ` Bin Meng
  2015-02-04 16:25   ` Simon Glass
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 8/9] dt-bindings: Add Intel Quark MRC bindings Bin Meng
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 9/9] x86: quark: Call MRC in dram_init() Bin Meng
  8 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-03 11:45 UTC (permalink / raw)
  To: u-boot

Add COMPAT_INTEL_QRK_MRC and "intel,quark-mrc" so that fdtdec can
decode the Intel Quark MRC node.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
---
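For readers new to fdtdec, here is a minimal standalone sketch of the
id/string table pattern that this patch extends. All demo_* and DEMO_*
names are hypothetical stand-ins and are not part of U-Boot; only the
two compatible strings are taken from this series.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for enum fdt_compat_id */
enum demo_compat_id {
	DEMO_COMPAT_UNKNOWN,
	DEMO_COMPAT_INTEL_ICH_SPI,
	DEMO_COMPAT_INTEL_QRK_MRC,

	DEMO_COMPAT_COUNT,
};

/* One compatible string per id, indexed by the enum value */
static const char * const demo_compat_names[DEMO_COMPAT_COUNT] = {
	[DEMO_COMPAT_UNKNOWN]       = "<none>",
	[DEMO_COMPAT_INTEL_ICH_SPI] = "intel,ich-spi",
	[DEMO_COMPAT_INTEL_QRK_MRC] = "intel,quark-mrc",
};

/* Map a compatible string back to its id, or UNKNOWN if not listed */
static enum demo_compat_id demo_lookup_compat(const char *name)
{
	int id;

	for (id = DEMO_COMPAT_UNKNOWN + 1; id < DEMO_COMPAT_COUNT; id++) {
		if (!strcmp(name, demo_compat_names[id]))
			return (enum demo_compat_id)id;
	}

	return DEMO_COMPAT_UNKNOWN;
}
```

The enum and the string table must be kept in step, which is why the
patch touches both include/fdtdec.h and lib/fdtdec.c.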

 include/fdtdec.h | 1 +
 lib/fdtdec.c     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/fdtdec.h b/include/fdtdec.h
index 8c2bd21..a0cd886 100644
--- a/include/fdtdec.h
+++ b/include/fdtdec.h
@@ -174,6 +174,7 @@ enum fdt_compat_id {
 	COMPAT_INTEL_GMA,		/* Intel Graphics Media Accelerator */
 	COMPAT_AMS_AS3722,		/* AMS AS3722 PMIC */
 	COMPAT_INTEL_ICH_SPI,		/* Intel ICH7/9 SPI controller */
+	COMPAT_INTEL_QRK_MRC,		/* Intel Quark MRC */
 
 	COMPAT_COUNT,
 };
diff --git a/lib/fdtdec.c b/lib/fdtdec.c
index e989241..1fef9af 100644
--- a/lib/fdtdec.c
+++ b/lib/fdtdec.c
@@ -84,6 +84,7 @@ static const char * const compat_names[COMPAT_COUNT] = {
 	COMPAT(INTEL_GMA, "intel,gma"),
 	COMPAT(AMS_AS3722, "ams,as3722"),
 	COMPAT(INTEL_ICH_SPI, "intel,ich-spi"),
+	COMPAT(INTEL_QRK_MRC, "intel,quark-mrc"),
 };
 
 const char *fdtdec_get_compatible(enum fdt_compat_id id)
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 8/9] dt-bindings: Add Intel Quark MRC bindings
  2015-02-03 11:45 [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support Bin Meng
                   ` (6 preceding siblings ...)
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 7/9] fdtdec: Add compatible id and string for Intel Quark MRC Bin Meng
@ 2015-02-03 11:45 ` Bin Meng
  2015-02-04 16:25   ` Simon Glass
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 9/9] x86: quark: Call MRC in dram_init() Bin Meng
  8 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-03 11:45 UTC (permalink / raw)
  To: u-boot

Add standard dt-bindings macros to be used by the Intel Quark MRC node.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
---
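As an illustration of how the flag and mask macros are meant to be
consumed, here is a small standalone sketch. The macro values are copied
from the header added by this patch; the demo_* helpers are hypothetical
and do not exist in the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from include/dt-bindings/mrc/quark.h in this patch */
#define MRC_FLAG_ECC_EN		0x00000001
#define MRC_FLAG_SCRAMBLE_EN	0x00000002
#define MRC_FLAG_MEMTEST_EN	0x00000004
#define DRAM_RANK(n)		(1 << (n))

/* Hypothetical helper: is the given flag set in a "flags" cell? */
static bool demo_flag_set(uint32_t flags, uint32_t flag)
{
	return (flags & flag) != 0;
}

/* Hypothetical helper: count the ranks enabled in a "rank-mask" cell */
static int demo_count_ranks(uint32_t rank_mask, int num_ranks)
{
	int rank, count = 0;

	for (rank = 0; rank < num_ranks; rank++) {
		if (rank_mask & DRAM_RANK(rank))
			count++;
	}

	return count;
}
```

Flags can be OR-ed together in the device tree cell, and rank/channel
masks are built the same way from DRAM_RANK(n) and DRAM_CHANNEL(n).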

 include/dt-bindings/mrc/quark.h | 83 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)
 create mode 100644 include/dt-bindings/mrc/quark.h

diff --git a/include/dt-bindings/mrc/quark.h b/include/dt-bindings/mrc/quark.h
new file mode 100644
index 0000000..e3ca8a2
--- /dev/null
+++ b/include/dt-bindings/mrc/quark.h
@@ -0,0 +1,83 @@
+/*
+ * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ *
+ * Intel Quark MRC bindings include several properties
+ * as part of an Intel Quark MRC node. In most cases,
+ * the value of these properties uses the standard values
+ * defined in this header.
+ */
+
+#ifndef _DT_BINDINGS_QRK_MRC_H_
+#define _DT_BINDINGS_QRK_MRC_H_
+
+/* MRC platform data flags */
+#define MRC_FLAG_ECC_EN		0x00000001
+#define MRC_FLAG_SCRAMBLE_EN	0x00000002
+#define MRC_FLAG_MEMTEST_EN	0x00000004
+/* 0: DDR "fly-by" topology, 1: DDR "tree" topology */
+#define MRC_FLAG_TOP_TREE_EN	0x00000008
+/* If set, ODT signal is asserted to DRAM devices on writes */
+#define MRC_FLAG_WR_ODT_EN	0x00000010
+
+/* DRAM width */
+#define DRAM_WIDTH_X8		0
+#define DRAM_WIDTH_X16		1
+#define DRAM_WIDTH_X32		2
+
+/* DRAM speed */
+#define DRAM_FREQ_800		0
+#define DRAM_FREQ_1066		1
+
+/* DRAM type */
+#define DRAM_TYPE_DDR3		0
+#define DRAM_TYPE_DDR3L		1
+
+/* DRAM rank mask */
+#define DRAM_RANK(n)		(1 << (n))
+
+/* DRAM channel mask */
+#define DRAM_CHANNEL(n)		(1 << (n))
+
+/* DRAM channel width */
+#define DRAM_CHANNEL_WIDTH_X8	0
+#define DRAM_CHANNEL_WIDTH_X16	1
+#define DRAM_CHANNEL_WIDTH_X32	2
+
+/* DRAM address mode */
+#define DRAM_ADDR_MODE0		0
+#define DRAM_ADDR_MODE1		1
+#define DRAM_ADDR_MODE2		2
+
+/* DRAM refresh rate */
+#define DRAM_REFRESH_RATE_195US	1
+#define DRAM_REFRESH_RATE_39US	2
+#define DRAM_REFRESH_RATE_785US	3
+
+/* DRAM SR temperature range */
+#define DRAM_SRT_RANGE_NORMAL	0
+#define DRAM_SRT_RANGE_EXTENDED	1
+
+/* DRAM ron value */
+#define DRAM_RON_34OHM		0
+#define DRAM_RON_40OHM		1
+
+/* DRAM rtt nom value */
+#define DRAM_RTT_NOM_40OHM	0
+#define DRAM_RTT_NOM_60OHM	1
+#define DRAM_RTT_NOM_120OHM	2
+
+/* DRAM rd odt value */
+#define DRAM_RD_ODT_OFF		0
+#define DRAM_RD_ODT_60OHM	1
+#define DRAM_RD_ODT_120OHM	2
+#define DRAM_RD_ODT_180OHM	3
+
+/* DRAM density */
+#define DRAM_DENSITY_512M	0
+#define DRAM_DENSITY_1G		1
+#define DRAM_DENSITY_2G		2
+#define DRAM_DENSITY_4G		3
+
+#endif /* _DT_BINDINGS_QRK_MRC_H_ */
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 9/9] x86: quark: Call MRC in dram_init()
  2015-02-03 11:45 [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support Bin Meng
                   ` (7 preceding siblings ...)
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 8/9] dt-bindings: Add Intel Quark MRC bindings Bin Meng
@ 2015-02-03 11:45 ` Bin Meng
  2015-02-04 16:25   ` Simon Glass
  8 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-03 11:45 UTC (permalink / raw)
  To: u-boot

Now that we have added the Quark MRC code, call MRC in dram_init() so
that DRAM can be initialized on a Quark-based board.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>

---
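A note on the timing cells in the galileo.dts hunk: the sketch below
assumes that dram-ras, dram-wtr, dram-rrd and dram-faw are given in
picoseconds and that DDR3-800 runs a 400 MHz clock (tCK = 2500 ps).
Both points are assumptions for illustration, and the demo_* / DEMO_*
names are hypothetical. It shows how such values would round up to
whole DRAM clocks.

```c
#include <stdint.h>

/* Assumed DDR3-800 clock period: 400 MHz, i.e. 2500 ps */
#define DEMO_TCK_800_PS		2500

/* Ceiling division with fully parenthesized arguments */
#define DEMO_CEIL_DIV(num, den)	(((num) + (den) - 1) / (den))

/* Convert a timing in picoseconds to whole clocks, rounding up */
static uint32_t demo_ps_to_clocks(uint32_t ps, uint32_t tck_ps)
{
	return DEMO_CEIL_DIV(ps, tck_ps);
}
```

Under these assumptions dram-ras = <0x0000927c> (37500) corresponds to
15 clocks, and dram-faw = <0x00009c40> (40000) to 16 clocks.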

 arch/x86/cpu/quark/dram.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++-
 arch/x86/dts/galileo.dts  | 25 ++++++++++++
 2 files changed, 120 insertions(+), 2 deletions(-)

diff --git a/arch/x86/cpu/quark/dram.c b/arch/x86/cpu/quark/dram.c
index fbdc3cd..3ed1d20 100644
--- a/arch/x86/cpu/quark/dram.c
+++ b/arch/x86/cpu/quark/dram.c
@@ -5,15 +5,108 @@
  */
 
 #include <common.h>
+#include <errno.h>
+#include <fdtdec.h>
 #include <asm/post.h>
+#include <asm/arch/mrc.h>
 #include <asm/arch/quark.h>
 
 DECLARE_GLOBAL_DATA_PTR;
 
+static int mrc_configure_params(struct mrc_params *mrc_params)
+{
+	const void *blob = gd->fdt_blob;
+	int node;
+	int mrc_flags;
+
+	node = fdtdec_next_compatible(blob, 0, COMPAT_INTEL_QRK_MRC);
+	if (node < 0) {
+		debug("%s: Cannot find MRC node\n", __func__);
+		return -EINVAL;
+	}
+
+	/*
+	 * TODO:
+	 *
+	 * We need to support fast boot (MRC cache) in the future.
+	 *
+	 * Set boot mode to cold boot for now
+	 */
+	mrc_params->boot_mode = BM_COLD;
+
+	/*
+	 * TODO:
+	 *
+	 * We need to determine ECC by pin strap state
+	 *
+	 * Disable ECC by default for now
+	 */
+	mrc_params->ecc_enables = 0;
+
+	mrc_flags = fdtdec_get_int(blob, node, "flags", 0);
+	if (mrc_flags & MRC_FLAG_SCRAMBLE_EN)
+		mrc_params->scrambling_enables = 1;
+	else
+		mrc_params->scrambling_enables = 0;
+
+	mrc_params->dram_width = fdtdec_get_int(blob, node, "dram-width", 0);
+	mrc_params->ddr_speed = fdtdec_get_int(blob, node, "dram-speed", 0);
+	mrc_params->ddr_type = fdtdec_get_int(blob, node, "dram-type", 0);
+
+	mrc_params->rank_enables = fdtdec_get_int(blob, node, "rank-mask", 0);
+	mrc_params->channel_enables = fdtdec_get_int(blob, node,
+		"chan-mask", 0);
+	mrc_params->channel_width = fdtdec_get_int(blob, node,
+		"chan-width", 0);
+	mrc_params->address_mode = fdtdec_get_int(blob, node, "addr-mode", 0);
+
+	mrc_params->refresh_rate = fdtdec_get_int(blob, node,
+		"refresh-rate", 0);
+	mrc_params->sr_temp_range = fdtdec_get_int(blob, node,
+		"sr-temp-range", 0);
+	mrc_params->ron_value = fdtdec_get_int(blob, node,
+		"ron-value", 0);
+	mrc_params->rtt_nom_value = fdtdec_get_int(blob, node,
+		"rtt-nom-value", 0);
+	mrc_params->rd_odt_value = fdtdec_get_int(blob, node,
+		"rd-odt-value", 0);
+
+	mrc_params->params.density = fdtdec_get_int(blob, node,
+		"dram-density", 0);
+	mrc_params->params.cl = fdtdec_get_int(blob, node, "dram-cl", 0);
+	mrc_params->params.ras = fdtdec_get_int(blob, node, "dram-ras", 0);
+	mrc_params->params.wtr = fdtdec_get_int(blob, node, "dram-wtr", 0);
+	mrc_params->params.rrd = fdtdec_get_int(blob, node, "dram-rrd", 0);
+	mrc_params->params.faw = fdtdec_get_int(blob, node, "dram-faw", 0);
+
+	debug("MRC dram_width %d\n", mrc_params->dram_width);
+	debug("MRC rank_enables %d\n", mrc_params->rank_enables);
+	debug("MRC ddr_speed %d\n", mrc_params->ddr_speed);
+	debug("MRC flags: %s\n",
+	      (mrc_params->scrambling_enables) ? "SCRAMBLE_EN" : "");
+
+	debug("MRC density=%d tCL=%d tRAS=%d tWTR=%d tRRD=%d tFAW=%d\n",
+	      mrc_params->params.density, mrc_params->params.cl,
+	      mrc_params->params.ras, mrc_params->params.wtr,
+	      mrc_params->params.rrd, mrc_params->params.faw);
+
+	return 0;
+}
+
 int dram_init(void)
 {
-	/* hardcode the DRAM size for now */
-	gd->ram_size = DRAM_MAX_SIZE;
+	struct mrc_params mrc_params;
+	int ret;
+
+	memset(&mrc_params, 0, sizeof(struct mrc_params));
+	ret = mrc_configure_params(&mrc_params);
+	if (ret)
+		return ret;
+
+	/* Call MRC */
+	mrc(&mrc_params);
+
+	gd->ram_size = mrc_params.mem_size;
 	post_code(POST_DRAM);
 
 	return 0;
diff --git a/arch/x86/dts/galileo.dts b/arch/x86/dts/galileo.dts
index 14a19c3..d462221 100644
--- a/arch/x86/dts/galileo.dts
+++ b/arch/x86/dts/galileo.dts
@@ -6,6 +6,8 @@
 
 /dts-v1/;
 
+#include <dt-bindings/mrc/quark.h>
+
 /include/ "skeleton.dtsi"
 
 / {
@@ -20,6 +22,29 @@
 		stdout-path = &pciuart0;
 	};
 
+	mrc {
+		compatible = "intel,quark-mrc";
+		flags = <MRC_FLAG_SCRAMBLE_EN>;
+		dram-width = <DRAM_WIDTH_X8>;
+		dram-speed = <DRAM_FREQ_800>;
+		dram-type = <DRAM_TYPE_DDR3>;
+		rank-mask = <DRAM_RANK(0)>;
+		chan-mask = <DRAM_CHANNEL(0)>;
+		chan-width = <DRAM_CHANNEL_WIDTH_X16>;
+		addr-mode = <DRAM_ADDR_MODE0>;
+		refresh-rate = <DRAM_REFRESH_RATE_785US>;
+		sr-temp-range = <DRAM_SRT_RANGE_NORMAL>;
+		ron-value = <DRAM_RON_34OHM>;
+		rtt-nom-value = <DRAM_RTT_NOM_120OHM>;
+		rd-odt-value = <DRAM_RD_ODT_OFF>;
+		dram-density = <DRAM_DENSITY_1G>;
+		dram-cl = <6>;
+		dram-ras = <0x0000927c>;
+		dram-wtr = <0x00002710>;
+		dram-rrd = <0x00002710>;
+		dram-faw = <0x00009c40>;
+	};
+
 	pci {
 		#address-cells = <3>;
 		#size-cells = <2>;
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 1/9] x86: Allow overriding TSC_FREQ_IN_MHZ
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 1/9] x86: Allow overriding TSC_FREQ_IN_MHZ Bin Meng
@ 2015-02-04 16:24   ` Simon Glass
  0 siblings, 0 replies; 29+ messages in thread
From: Simon Glass @ 2015-02-04 16:24 UTC (permalink / raw)
  To: u-boot

On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> We should allow the value of TSC_FREQ_IN_MHZ to be overridden by
> the one in arch/cpu/<xxx>/Kconfig.
>
> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
> ---
>
>  arch/x86/Kconfig | 40 ++++++++++++++++++++--------------------
>  1 file changed, 20 insertions(+), 20 deletions(-)
>

Acked-by: Simon Glass <sjg@chromium.org>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 2/9] x86: quark: Bypass TSC calibration
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 2/9] x86: quark: Bypass TSC calibration Bin Meng
@ 2015-02-04 16:24   ` Simon Glass
  0 siblings, 0 replies; 29+ messages in thread
From: Simon Glass @ 2015-02-04 16:24 UTC (permalink / raw)
  To: u-boot

On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> For some unknown reason, the TSC calibration via PIT does not work on
> Quark. Enable bypassing TSC calibration and override TSC_FREQ_IN_MHZ
> to 400 per Quark datasheet in the Kconfig.
>
> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
> ---
>
>  arch/x86/cpu/quark/Kconfig | 5 +++++
>  1 file changed, 5 insertions(+)

Acked-by: Simon Glass <sjg@chromium.org>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 3/9] x86: quark: Add Memory Reference Code (MRC) main routines
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 3/9] x86: quark: Add Memory Reference Code (MRC) main routines Bin Meng
@ 2015-02-04 16:24   ` Simon Glass
  2015-02-05  8:45     ` Bin Meng
  0 siblings, 1 reply; 29+ messages in thread
From: Simon Glass @ 2015-02-04 16:24 UTC (permalink / raw)
  To: u-boot

Hi Bin,

On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> Add the main routines for Quark Memory Reference Code (MRC).
>
> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>
> ---
> There are 24 checkpatch warnings in this patch, which are:
>
> warning: arch/x86/cpu/quark/mrc.c,43: line over 80 characters
> ...
>
> I intentionally leave them as is for now, as fixing these warnings
> makes the mrc initialization table a little bit harder to read.
>
>  arch/x86/cpu/quark/mrc.c              | 206 ++++++++++++++++++++++++++++++++++
>  arch/x86/include/asm/arch-quark/mrc.h | 189 +++++++++++++++++++++++++++++++
>  2 files changed, 395 insertions(+)
>  create mode 100644 arch/x86/cpu/quark/mrc.c
>  create mode 100644 arch/x86/include/asm/arch-quark/mrc.h
>
> diff --git a/arch/x86/cpu/quark/mrc.c b/arch/x86/cpu/quark/mrc.c
> new file mode 100644
> index 0000000..6a82519
> --- /dev/null
> +++ b/arch/x86/cpu/quark/mrc.c
> @@ -0,0 +1,206 @@
> +/*
> + * Copyright (C) 2013, Intel Corporation
> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
> + *
> + * Ported from Intel released Quark UEFI BIOS
> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
> + *
> + * SPDX-License-Identifier:    Intel
> + */
> +
> +/*
> + * This is the main Quark Memory Reference Code (MRC)
> + *
> + * These functions are generic and should work for any Quark based board.

Quark-based

> + *
> + * MRC requires two data structures to be passed in which are initialized by
> + * mrc_adjust_params().
> + *
> + * The basic flow is as follows:
> + * 01) Check for supported DDR speed configuration
> + * 02) Set up Memory Manager buffer as pass-through (POR)
> + * 03) Set Channel Interleaving Mode and Channel Stride to the most aggressive
> + *     setting possible
> + * 04) Set up the Memory Controller logic
> + * 05) Set up the DDR_PHY logic
> + * 06) Initialise the DRAMs (JEDEC)
> + * 07) Perform the Receive Enable Calibration algorithm
> + * 08) Perform the Write Leveling algorithm
> + * 09) Perform the Read Training algorithm (includes internal Vref)
> + * 10) Perform the Write Training algorithm
> + * 11) Set Channel Interleaving Mode and Channel Stride to the desired settings
> + *
> + * Dunit configuration based on Valleyview MRC.

What is Dunit?

> + */
> +
> +#include <common.h>
> +#include <asm/arch/mrc.h>
> +#include <asm/arch/msg_port.h>
> +#include "mrc_util.h"
> +#include "smc.h"
> +
> +static const struct mem_init init[] = {
> +       { 0x0101, BM_COLD | BM_FAST | BM_WARM | BM_S3, clear_self_refresh       },
> +       { 0x0200, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_ddr_timing_control  },
> +       { 0x0103, BM_COLD | BM_FAST                  , prog_decode_before_jedec },
> +       { 0x0104, BM_COLD | BM_FAST                  , perform_ddr_reset        },
> +       { 0x0300, BM_COLD | BM_FAST           | BM_S3, ddrphy_init              },
> +       { 0x0400, BM_COLD | BM_FAST                  , perform_jedec_init       },
> +       { 0x0105, BM_COLD | BM_FAST                  , set_ddr_init_complete    },
> +       { 0x0106,           BM_FAST | BM_WARM | BM_S3, restore_timings          },
> +       { 0x0106, BM_COLD                            , default_timings          },
> +       { 0x0500, BM_COLD                            , rcvn_cal                 },
> +       { 0x0600, BM_COLD                            , wr_level                 },
> +       { 0x0120, BM_COLD                            , prog_page_ctrl           },
> +       { 0x0700, BM_COLD                            , rd_train                 },
> +       { 0x0800, BM_COLD                            , wr_train                 },
> +       { 0x010B, BM_COLD                            , store_timings            },
> +       { 0x010C, BM_COLD | BM_FAST | BM_WARM | BM_S3, enable_scrambling        },
> +       { 0x010D, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_ddr_control         },
> +       { 0x010E, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_dra_drb             },
> +       { 0x010F,                     BM_WARM | BM_S3, perform_wake             },
> +       { 0x0110, BM_COLD | BM_FAST | BM_WARM | BM_S3, change_refresh_period    },
> +       { 0x0111, BM_COLD | BM_FAST | BM_WARM | BM_S3, set_auto_refresh         },
> +       { 0x0112, BM_COLD | BM_FAST | BM_WARM | BM_S3, ecc_enable               },
> +       { 0x0113, BM_COLD | BM_FAST                  , memory_test              },
> +       { 0x0114, BM_COLD | BM_FAST | BM_WARM | BM_S3, lock_registers           }

What are the hex codes at the start? Ah I see they are post codes (we
don't particularly need them, I'm just asking). Should there be
#defines for these? Also how come they use all 16 bits?
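
To illustrate the #define question, here is a minimal sketch rather than the
patch's code. The POST_* names, struct step, and the helper functions are
hypothetical. Only the two post-code values, the BM_* masks, and the split
into a major (high) byte and a minor (low) byte are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Boot mode bit masks, with the values used in mrc.h in this patch */
enum { BM_COLD = 1, BM_FAST = 2, BM_S3 = 4, BM_WARM = 8 };

/* Hypothetical names for two of the post codes in init[] */
#define POST_CLEAR_SELF_REFRESH	0x0101
#define POST_RCVN_CAL		0x0500

/* Hypothetical cut-down version of struct mem_init (no init_fn) */
struct step {
	uint16_t post_code;
	uint16_t boot_path;
};

static const struct step steps[] = {
	{ POST_CLEAR_SELF_REFRESH, BM_COLD | BM_FAST | BM_WARM | BM_S3 },
	{ POST_RCVN_CAL, BM_COLD },
};

/* The high byte of a post code is the major number */
static uint8_t post_major(uint16_t code)
{
	return code >> 8;
}

/* The low byte of a post code is the minor number */
static uint8_t post_minor(uint16_t code)
{
	return code & 0xff;
}

/* Count the steps whose boot_path mask includes the given boot mode */
static int steps_for_mode(uint32_t boot_mode)
{
	size_t i;
	int count = 0;

	for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++) {
		if (boot_mode & steps[i].boot_path)
			count++;
	}

	return count;
}
```

With names like these the table reads as step identifiers rather than magic
numbers, and the 16-bit value is visibly a major/minor pair.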

> +};
> +
> +/* Adjust configuration parameters before initialization sequence */
> +static void mrc_adjust_params(struct mrc_params *mrc_params)
> +{
> +       const struct dram_params *dram_params;
> +       uint8_t dram_width;
> +       uint32_t rank_enables;
> +       uint32_t channel_width;
> +
> +       ENTERFN();

What is this?

> +
> +       /* initially expect success */
> +       mrc_params->status = MRC_SUCCESS;
> +
> +       dram_width = mrc_params->dram_width;
> +       rank_enables = mrc_params->rank_enables;
> +       channel_width = mrc_params->channel_width;
> +
> +       /*
> +        * Setup board layout (must be reviewed as is selecting static timings)
> +        * 0 == R0 (DDR3 x16), 1 == R1 (DDR3 x16),
> +        * 2 == DV (DDR3 x8), 3 == SV (DDR3 x8).
> +        */
> +       if (dram_width == X8)
> +               mrc_params->board_id = 2;       /* select x8 layout */
> +       else
> +               mrc_params->board_id = 0;       /* select x16 layout */
> +
> +       /* initially no memory */
> +       mrc_params->mem_size = 0;
> +
> +       /* begin of channel settings */
> +       dram_params = &mrc_params->params;
> +
> +       /*
> +        * Determine Column Bits:
> +        *
> +        * Column: 11 for 8Gbx8, else 10
> +        */
> +       mrc_params->column_bits[0] =
> +               ((dram_params[0].density == 4) &&
> +               (dram_width == X8)) ? (11) : (10);
> +
> +       /*
> +        * Determine Row Bits:

Can we capitalise only the first word in these comments?

> +        *
> +        * 512Mbx16=12 512Mbx8=13
> +        * 1Gbx16=13   1Gbx8=14
> +        * 2Gbx16=14   2Gbx8=15
> +        * 4Gbx16=15   4Gbx8=16
> +        * 8Gbx16=16   8Gbx8=16
> +        */
> +       mrc_params->row_bits[0] = 12 + (dram_params[0].density) +
> +               (((dram_params[0].density < 4) &&
> +               (dram_width == X8)) ? (1) : (0));
> +
> +       /*
> +        * Determine Per Channel Memory Size:

per-channel

> +        *
> +        * (For 2 RANKs, multiply by 2)
> +        * (For 16 bit data bus, divide by 2)
> +        *
> +        * DENSITY WIDTH MEM_AVAILABLE
> +        * 512Mb   x16   0x008000000 ( 128MB)
> +        * 512Mb   x8    0x010000000 ( 256MB)
> +        * 1Gb     x16   0x010000000 ( 256MB)
> +        * 1Gb     x8    0x020000000 ( 512MB)
> +        * 2Gb     x16   0x020000000 ( 512MB)
> +        * 2Gb     x8    0x040000000 (1024MB)
> +        * 4Gb     x16   0x040000000 (1024MB)
> +        * 4Gb     x8    0x080000000 (2048MB)
> +        */
> +       mrc_params->channel_size[0] = (1 << dram_params[0].density);
> +       mrc_params->channel_size[0] *= (dram_width == X8) ? (2) : (1);
> +       mrc_params->channel_size[0] *= (rank_enables == 0x3) ? (2) : (1);
> +       mrc_params->channel_size[0] *= (channel_width == X16) ? (1) : (2);

Remove () around 2 and 1.
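
A minimal sketch of the same arithmetic without the extra parentheses,
pulled out into a standalone function so it can be checked against the
table in the comment. The function name channel_size_bytes() is
hypothetical, and the enum is repeated here only to keep the example
self-contained.

```c
#include <stdint.h>

enum { X8, X16, X32 };	/* same order as the enum in mrc.h */

/*
 * Per-channel memory size in bytes, following the steps in
 * mrc_adjust_params(). The function name is hypothetical.
 */
static uint32_t channel_size_bytes(uint8_t density, uint8_t dram_width,
				   uint32_t rank_enables,
				   uint32_t channel_width)
{
	uint32_t units = 1 << density;	/* number of 64MB (512Mb) units */

	units *= dram_width == X8 ? 2 : 1;
	units *= rank_enables == 0x3 ? 2 : 1;
	units *= channel_width == X16 ? 1 : 2;

	/* convert 64MB units to bytes */
	return units << 26;
}
```

For example, density 0 (512Mb) with x16 devices, one rank and a 32-bit
channel gives 0x08000000 (128MB), which matches the first row of the table.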

> +
> +       /* Determine memory size (convert number of 64MB/512Mb units) */
> +       mrc_params->mem_size += mrc_params->channel_size[0] << 26;
> +
> +       LEAVEFN();

?

> +}
> +
> +static void mrc_init(struct mrc_params *mrc_params)
> +{
> +       int i;
> +
> +       ENTERFN();
> +
> +       DPF(D_INFO, "mrc_init build %s %s\n", __DATE__, __TIME__);

debug() I think, and below.

> +
> +       /* MRC started */
> +       mrc_post_code(0x01, 0x00);
> +
> +       if (mrc_params->boot_mode != BM_COLD) {
> +               if (mrc_params->ddr_speed != mrc_params->timings.ddr_speed) {
> +                       /* full training required as frequency changed */
> +                       mrc_params->boot_mode = BM_COLD;
> +               }
> +       }
> +
> +       for (i = 0; i < ARRAY_SIZE(init); i++) {
> +               uint64_t my_tsc;
> +
> +               if (mrc_params->boot_mode & init[i].boot_path) {
> +                       uint8_t major = init[i].post_code >> 8 & 0xFF;
> +                       uint8_t minor = init[i].post_code >> 0 & 0xFF;

Can we stick with lower case hex, and below?

> +                       mrc_post_code(major, minor);
> +
> +                       my_tsc = rdtsc();
> +                       init[i].init_fn(mrc_params);
> +                       DPF(D_TIME, "Execution time %llx", rdtsc() - my_tsc);
> +               }
> +       }
> +
> +       /* display the timings */
> +       print_timings(mrc_params);
> +
> +       /* MRC complete */
> +       mrc_post_code(0x01, 0xFF);
> +
> +       LEAVEFN();
> +}
> +
> +void mrc(struct mrc_params *mrc_params)
> +{
> +       ENTERFN();
> +
> +       DPF(D_INFO, "MRC Version %04x %s %s\n",
> +           MRC_VERSION, __DATE__, __TIME__);

Can you reformat so that more args fit on the first line?

> +
> +       /* Set up the data structures used by mrc_init() */
> +       mrc_adjust_params(mrc_params);
> +
> +       /* Initialize system memory */
> +       mrc_init(mrc_params);
> +
> +       LEAVEFN();
> +}
> diff --git a/arch/x86/include/asm/arch-quark/mrc.h b/arch/x86/include/asm/arch-quark/mrc.h
> new file mode 100644
> index 0000000..690a800
> --- /dev/null
> +++ b/arch/x86/include/asm/arch-quark/mrc.h
> @@ -0,0 +1,189 @@
> +/*
> + * Copyright (C) 2013, Intel Corporation
> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
> + *
> + * Ported from Intel released Quark UEFI BIOS
> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
> + *
> + * SPDX-License-Identifier:    Intel
> + */
> +
> +#ifndef _MRC_H_
> +#define _MRC_H_
> +
> +/* MRC Version */

I think you can drop that comment!

> +#define MRC_VERSION    0x0111
> +
> +/* architectural definitions */
> +#define NUM_CHANNELS   1       /* number of channels */
> +#define NUM_RANKS      2       /* number of ranks per channel */
> +#define NUM_BYTE_LANES 4       /* number of byte lanes per channel */
> +
> +/* software limitations */
> +#define MAX_CHANNELS   1
> +#define MAX_RANKS      2
> +#define MAX_BYTE_LANES 4
> +
> +/* only to mock MrcWrapper */

What does this mean?

> +#define MAX_SOCKETS    1
> +#define MAX_SIDES      1
> +#define MAX_ROWS       (MAX_SIDES * MAX_SOCKETS)
> +
> +/* Specify DRAM of nenory channel width */

memory

Also this doesn't quite make sense - can you please reword it?

> +enum {
> +       X8,     /* DRAM width */
> +       X16,    /* DRAM width & Channel Width */
> +       X32     /* Channel Width */
> +};
> +
> +/* Specify DRAM speed */
> +enum {
> +       DDRFREQ_800,
> +       DDRFREQ_1066
> +};
> +
> +/* Specify DRAM type */
> +enum {
> +       DDR3,
> +       DDR3L
> +};
> +
> +/*
> + * density: 0=512Mb, 1=Gb, 2=2Gb, 3=4Gb

This should either use @density in this header, and likewise for all the
other members, or this comment should move down to sit just above density.
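
A minimal sketch of the first option, with every member documented in the
header comment. The descriptions are copied from the comments in the patch.
The one-line summary after the struct name is an assumption, and "1=1Gb"
assumes that "1=Gb" in the patch is a typo.

```c
#include <stdint.h>

/**
 * struct dram_params - DRAM device parameters
 *
 * Refer to JEDEC spec (or DRAM datasheet) when changing these values.
 *
 * @density: 0=512Mb, 1=1Gb, 2=2Gb, 3=4Gb
 * @cl: CAS latency in clocks
 * @ras: ACT to PRE command period, in picoseconds
 * @wtr: delay from start of internal write transaction to internal read
 *	command, in picoseconds
 * @rrd: ACT to ACT command period, in picoseconds
 *	(JESD79 specific to page size 1K/2K)
 * @faw: four activate window, in picoseconds
 *	(JESD79 specific to page size 1K/2K)
 */
struct dram_params {
	uint8_t density;
	uint8_t cl;
	uint32_t ras;
	uint32_t wtr;
	uint32_t rrd;
	uint32_t faw;
};
```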

> + * cl is DRAM CAS Latency in clocks
> + * All other timings are in picoseconds
> + *
> + * Refer to JEDEC spec (or DRAM datasheet) when changing these values.
> + */
> +struct dram_params {
> +       uint8_t density;
> +       /* CAS latency in clocks */
> +       uint8_t cl;
> +       /* ACT to PRE command period */
> +       uint32_t ras;
> +       /*
> +        * Delay from start of internal write transaction to
> +        * internal read command
> +        */
> +       uint32_t wtr;
> +       /* ACT to ACT command period (JESD79 specific to page size 1K/2K) */
> +       uint32_t rrd;
> +       /* Four activate window (JESD79 specific to page size 1K/2K) */
> +       uint32_t faw;
> +};
> +
> +/*
> + * Delay configuration for individual signals
> + * Vref setting
> + * Scrambler seed

What do the above two lines mean?

> + */
> +struct mrc_timings {
> +       uint32_t rcvn[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
> +       uint32_t rdqs[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
> +       uint32_t wdqs[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
> +       uint32_t wdq[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
> +       uint32_t vref[NUM_CHANNELS][NUM_BYTE_LANES];
> +       uint32_t wctl[NUM_CHANNELS][NUM_RANKS];
> +       uint32_t wcmd[NUM_CHANNELS];
> +       uint32_t scrambler_seed;

Comments for the above?

> +       /* need to save for the case of frequency change */
> +       uint8_t ddr_speed;
> +};
> +
> +/* Boot mode defined as bit mask (1<<n) */
> +enum {
> +       BM_UNKNOWN,
> +       BM_COLD = 1,    /* full training */
> +       BM_FAST = 2,    /* restore timing parameters */
> +       BM_S3   = 4,    /* resume from S3 */
> +       BM_WARM = 8
> +};
> +
> +/* MRC execution status */
> +#define MRC_SUCCESS    0       /* initialization ok */
> +#define MRC_E_MEMTEST  1       /* memtest failed */
> +
> +/* Input/output/context parameters for Memory Reference Code */
> +struct mrc_params {
> +       /* Global Settings */
> +
> +       /* BM_COLD, BM_FAST, BM_WARM, BM_S3 */
> +       uint32_t boot_mode;
> +       uint8_t first_run;
> +
> +       /* DRAM Parameters */
> +

Remove blank line

> +       uint8_t dram_width;             /* x8, x16 */
> +       uint8_t ddr_speed;              /* DDRFREQ_800, DDRFREQ_1066 */
> +       uint8_t ddr_type;               /* DDR3, DDR3L */
> +       uint8_t ecc_enables;            /* 0, 1 (memory size reduced to 7/8) */
> +       uint8_t scrambling_enables;     /* 0, 1 */
> +       /* 1, 3 (1'st rank has to be populated if 2'nd rank present) */
> +       uint32_t rank_enables;
> +       uint32_t channel_enables;       /* 1 only */
> +       uint32_t channel_width;         /* x16 only */
> +       /* 0, 1, 2 (mode 2 forced if ecc enabled) */
> +       uint32_t address_mode;
> +       /* REFRESH_RATE: 1=1.95us, 2=3.9us, 3=7.8us, others=RESERVED */
> +       uint8_t refresh_rate;
> +       /* SR_TEMP_RANGE: 0=normal, 1=extended, others=RESERVED */
> +       uint8_t sr_temp_range;
> +       /*
> +        * RON_VALUE: 0=34ohm, 1=40ohm, others=RESERVED
> +        * (select MRS1.DIC driver impedance control)
> +        */
> +       uint8_t ron_value;
> +       /* RTT_NOM_VALUE: 0=40ohm, 1=60ohm, 2=120ohm, others=RESERVED */
> +       uint8_t rtt_nom_value;
> +       /* RD_ODT_VALUE: 0=off, 1=60ohm, 2=120ohm, 3=180ohm, others=RESERVED */
> +       uint8_t rd_odt_value;
> +       struct dram_params params;
> +
> +       /* Internally Used */

I think I know what this means? It's unfortunate to have
input/output/working data in the same structure but this seems to be
the approach taken, so let's keep it. But can you add a comment above
the struct saying how it is split into multiple parts?

> +
> +       /* internally used for board layout (use x8 or x16 memory) */
> +       uint32_t board_id;
> +       /* when set hte reconfiguration requested */
> +       uint32_t hte_setup:1;
> +       uint32_t menu_after_mrc:1;
> +       uint32_t power_down_disable:1;
> +       uint32_t tune_rcvn:1;

Should these be bool? I'm not sure the :1 helps much - are you trying
to save memory?
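
As a rough sketch of the difference, assuming C99 <stdbool.h> is available:
the two struct names and the helper functions below are hypothetical, and
only the member names come from the patch. A 1-bit unsigned field keeps just
the low bit of whatever is assigned, while a bool turns any non-zero value
into true.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flags as 1-bit fields, as in the patch */
struct flags_bitfield {
	uint32_t hte_setup:1;
	uint32_t tune_rcvn:1;
};

/* The same flags as bool members */
struct flags_bool {
	bool hte_setup;
	bool tune_rcvn;
};

/* Store val in a 1-bit field and read it back */
static unsigned int set_bitfield(unsigned int val)
{
	struct flags_bitfield f = { 0 };

	f.hte_setup = val;	/* only bit 0 of val is kept */

	return f.hte_setup;
}

/* Store val in a bool and read it back */
static unsigned int set_bool(unsigned int val)
{
	struct flags_bool f = { false };

	f.hte_setup = val;	/* any non-zero val becomes true */

	return f.hte_setup;
}
```

So a caller that passes 2 as a "true" value sets the bool but clears the
bit-field.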

> +       uint32_t channel_size[NUM_CHANNELS];
> +       uint32_t column_bits[NUM_CHANNELS];
> +       uint32_t row_bits[NUM_CHANNELS];
> +       /* register content saved during training */
> +       uint32_t mrs1;
> +
> +       /* Output */
> +
> +       /* initialization result (non zero specifies error code) */
> +       uint32_t status;
> +       /* total memory size in bytes (excludes ECC banks) */
> +       uint32_t mem_size;
> +       /* training results (also used on input) */
> +       struct mrc_timings timings;
> +};
> +

This one needs comments:

> +struct mem_init {
> +       uint16_t post_code;
> +       uint16_t boot_path;
> +       void (*init_fn)(struct mrc_params *mrc_params);
> +};
> +
> +/* MRC platform data flags */
> +#define MRC_FLAG_ECC_EN                0x00000001
> +#define MRC_FLAG_SCRAMBLE_EN   0x00000002
> +#define MRC_FLAG_MEMTEST_EN    0x00000004
> +/* 0b DDR "fly-by" topology else 1b DDR "tree" topology */
> +#define MRC_FLAG_TOP_TREE_EN   0x00000008
> +/* If set ODR signal is asserted to DRAM devices on writes */
> +#define MRC_FLAG_WR_ODT_EN     0x00000010
> +
> +/**
> + * mrc - Memory Reference Code entry routine
> + *
> + * @mrc_params: parameters for MRC
> + */
> +void mrc(struct mrc_params *mrc_params);

How about sdram_init() or mrc_init()?

> +
> +#endif /* _MRC_H_ */
> --
> 1.8.2.1
>

Regards,
Simon


* [U-Boot] [RFC PATCH 4/9] x86: quark: Add utility codes needed for MRC
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 4/9] x86: quark: Add utility codes needed for MRC Bin Meng
@ 2015-02-04 16:24   ` Simon Glass
  2015-02-05 14:25     ` Bin Meng
  0 siblings, 1 reply; 29+ messages in thread
From: Simon Glass @ 2015-02-04 16:24 UTC (permalink / raw)
  To: u-boot

Hi Bin,

On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> Add various utility codes needed for Quark MRC.
>
> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>
> ---
> There are 12 checkpatch warnings in this patch, which are:
>
> warning: arch/x86/cpu/quark/mrc_util.c,1446: Too many leading tabs - consider code refactoring
> warning: arch/x86/cpu/quark/mrc_util.c,1450: line over 80 characters
> ...
>
> Fixing 'Too many leading tabs ...' will be very dangerous, as I don't have
> all the details on how Intel's MRC codes are actually written to play with
> the hardware. Trying to refactor them may lead to a non-working MRC codes.
> For the 'line over 80 characters' issue, we have to leave them as is now
> due to the 'Too many leading tabs ...', sigh.

The code looks fine for the most part - I only have nits.

I'm not keen on BIT though. See my comments and what improvements you
can make. It would be great to drop BIT.
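
As a minimal sketch of one way to drop BIT, the read-modify-write in
hte_wait_for_complete() below could use named masks. The HTE_FLAG_* names
and the helper function are hypothetical, since I don't know what these
bits are called in the datasheet. Only the bit positions (9, 12 and 13) come
from the patch.

```c
#include <stdint.h>

/* Hypothetical names; real ones should describe what each bit does */
#define HTE_FLAG_A	(1u << 9)
#define HTE_FLAG_B	(1u << 12)
#define HTE_FLAG_C	(1u << 13)

/*
 * Compute the value written back to register 0x00020011 once the
 * engine is idle: set bit 9 and clear bits 12 and 13.
 */
static uint32_t hte_ctl_after_complete(uint32_t val)
{
	val |= HTE_FLAG_A;
	val &= ~(HTE_FLAG_B | HTE_FLAG_C);

	return val;
}
```

Named masks also give a natural place to document each bit once, instead of
at every use.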

Re the debug macros, I suppose they are OK to keep. U-Boot doesn't
have the concept of debug() for different categories or levels of
verbosity.

>
>  arch/x86/cpu/quark/hte.c      |  398 +++++++++++
>  arch/x86/cpu/quark/hte.h      |   44 ++
>  arch/x86/cpu/quark/mrc_util.c | 1499 +++++++++++++++++++++++++++++++++++++++++
>  arch/x86/cpu/quark/mrc_util.h |  153 +++++
>  4 files changed, 2094 insertions(+)
>  create mode 100644 arch/x86/cpu/quark/hte.c
>  create mode 100644 arch/x86/cpu/quark/hte.h
>  create mode 100644 arch/x86/cpu/quark/mrc_util.c
>  create mode 100644 arch/x86/cpu/quark/mrc_util.h
>
> diff --git a/arch/x86/cpu/quark/hte.c b/arch/x86/cpu/quark/hte.c
> new file mode 100644
> index 0000000..d813c9c
> --- /dev/null
> +++ b/arch/x86/cpu/quark/hte.c
> @@ -0,0 +1,398 @@
> +/*
> + * Copyright (C) 2013, Intel Corporation
> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
> + *
> + * Ported from Intel released Quark UEFI BIOS
> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/

Remove trailing slash?

> + *
> + * SPDX-License-Identifier:    Intel
> + */
> +
> +#include <common.h>
> +#include <asm/arch/mrc.h>
> +#include <asm/arch/msg_port.h>
> +#include "mrc_util.h"
> +#include "hte.h"
> +
> +/**
> + * This function enables HTE to detect all possible errors for

s/This function// globally

I'd suggest present tense, like "Enable HTE to detect all possible errors for...".

> + * the given training parameters (per-bit or full byte lane).
> + */
> +static void hte_enable_all_errors(void)
> +{
> +       msg_port_write(HTE, 0x000200A2, 0xFFFFFFFF);
> +       msg_port_write(HTE, 0x000200A3, 0x000000FF);
> +       msg_port_write(HTE, 0x000200A4, 0x00000000);

Lower case hex again.

> +}
> +
> +/**
> + * This function goes and reads the HTE register in order to find any error
> + *
> + * @return: The errors detected in the HTE status register
> + */
> +static u32 hte_check_errors(void)
> +{
> +       return msg_port_read(HTE, 0x000200A7);
> +}
> +
> +/**
> + * This function waits until HTE finishes
> + */
> +static void hte_wait_for_complete(void)
> +{
> +       u32 tmp;
> +
> +       ENTERFN();
> +
> +       do {} while ((msg_port_read(HTE, 0x00020012) & BIT30) != 0);
> +
> +       tmp = msg_port_read(HTE, 0x00020011);
> +       tmp |= BIT9;
> +       tmp &= ~(BIT12 | BIT13);
> +       msg_port_write(HTE, 0x00020011, tmp);
> +
> +       LEAVEFN();
> +}
> +
> +/**
> + * This function clears registers related with errors in the HTE
> + */
> +static void hte_clear_error_regs(void)
> +{
> +       u32 tmp;
> +
> +       /*
> +        * Clear all HTE errors and enable error checking
> +        * for burst and chunk.
> +        */
> +       tmp = msg_port_read(HTE, 0x000200A1);
> +       tmp |= BIT8;
> +       msg_port_write(HTE, 0x000200A1, tmp);
> +}
> +
> +/**
> + * This function executes basic single cache line memory write/read/verify
> + * test using simple constant pattern, different for READ_RAIN and

READ_TRAIN?

> + * WRITE_TRAIN modes.
> + *
> + * See hte_basic_write_read() which is external visible wrapper.

the external (fix below also)

> + *
> + * @mrc_params: host struture for all MRC global data
> + * @addr: memory adress being tested (must hit specific channel/rank)
> + * @first_run: if set then hte registers are configured, otherwise it is

the hte?

> + *             assumed configuration is done and just re-run the test

assumed configuration is done and we just re-run the test
(fix below also)

> + * @mode: READ_TRAIN or WRITE_TRAIN (the difference is in the pattern)
> + *
> + * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
> + */
> +static u16 hte_basic_data_cmp(struct mrc_params *mrc_params, u32 addr,
> +                             u8 first_run, u8 mode)
> +{
> +       u32 pattern;
> +       u32 offset;
> +
> +       if (first_run) {
> +               msg_port_write(HTE, 0x00020020, 0x01B10021);
> +               msg_port_write(HTE, 0x00020021, 0x06000000);
> +               msg_port_write(HTE, 0x00020022, addr >> 6);
> +               msg_port_write(HTE, 0x00020062, 0x00800015);
> +               msg_port_write(HTE, 0x00020063, 0xAAAAAAAA);
> +               msg_port_write(HTE, 0x00020064, 0xCCCCCCCC);
> +               msg_port_write(HTE, 0x00020065, 0xF0F0F0F0);
> +               msg_port_write(HTE, 0x00020061, 0x00030008);
> +
> +               if (mode == WRITE_TRAIN)
> +                       pattern = 0xC33C0000;
> +               else /* READ_TRAIN */
> +                       pattern = 0xAA5555AA;
> +
> +               for (offset = 0x80; offset <= 0x8F; offset++)
> +                       msg_port_write(HTE, offset, pattern);
> +       }
> +
> +       msg_port_write(HTE, 0x000200A1, 0xFFFF1000);
> +       msg_port_write(HTE, 0x00020011, 0x00011000);
> +       msg_port_write(HTE, 0x00020011, 0x00011100);
> +
> +       hte_wait_for_complete();
> +
> +       /*
> +        * Return bits 15:8 of HTE_CH0_ERR_XSTAT to check for
> +        * any bytelane errors.
> +        */
> +       return (hte_check_errors() >> 8) & 0xFF;
> +}
> +
> +/**
> + * This function examines single cache line memory with write/read/verify
> + * test using multiple data patterns (victim-aggressor algorithm).
> + *
> + * See hte_write_stress_bit_lanes() which is external visible wrapper.
> + *
> + * @mrc_params: host struture for all MRC global data

structure

> + * @addr: memory adress being tested (must hit specific channel/rank)
> + * @loop_cnt: number of test iterations
> + * @seed_victim: victim data pattern seed
> + * @seed_aggressor: aggressor data pattern seed
> + * @victim_bit: should be 0 as auto rotate feature is in use

auto-rotate

> + * @first_run: if set then hte registers are configured, otherwise it is

Actually I wonder if HTE would be better than hte, which looks like a
'the' typo, particularly if you leave out 'the'. Also can you please
add a comment at the top of the file (first function) saying what HTE stands for?

> + *             assumed configuration is done and just re-run the test
> + *
> + * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
> + */
> +static u16 hte_rw_data_cmp(struct mrc_params *mrc_params, u32 addr,
> +                          u8 loop_cnt, u32 seed_victim, u32 seed_aggressor,
> +                          u8 victim_bit, u8 first_run)
> +{
> +       u32 offset;
> +       u32 tmp;
> +
> +       if (first_run) {
> +               msg_port_write(HTE, 0x00020020, 0x00910024);
> +               msg_port_write(HTE, 0x00020023, 0x00810024);
> +               msg_port_write(HTE, 0x00020021, 0x06070000);
> +               msg_port_write(HTE, 0x00020024, 0x06070000);
> +               msg_port_write(HTE, 0x00020022, addr >> 6);
> +               msg_port_write(HTE, 0x00020025, addr >> 6);
> +               msg_port_write(HTE, 0x00020062, 0x0000002A);
> +               msg_port_write(HTE, 0x00020063, seed_victim);
> +               msg_port_write(HTE, 0x00020064, seed_aggressor);
> +               msg_port_write(HTE, 0x00020065, seed_victim);
> +
> +               /*
> +                * Write the pattern buffers to select the victim bit
> +                *
> +                * Start with bit0
> +                */
> +               for (offset = 0x80; offset <= 0x8F; offset++) {
> +                       if ((offset % 8) == victim_bit)
> +                               msg_port_write(HTE, offset, 0x55555555);
> +                       else
> +                               msg_port_write(HTE, offset, 0xCCCCCCCC);
> +               }
> +
> +               msg_port_write(HTE, 0x00020061, 0x00000000);
> +               msg_port_write(HTE, 0x00020066, 0x03440000);
> +               msg_port_write(HTE, 0x000200A1, 0xFFFF1000);
> +       }
> +
> +       tmp = 0x10001000 | (loop_cnt << 16);
> +       msg_port_write(HTE, 0x00020011, tmp);
> +       msg_port_write(HTE, 0x00020011, tmp | BIT8);
> +
> +       hte_wait_for_complete();
> +
> +       /*
> +        * Return bits 15:8 of HTE_CH0_ERR_XSTAT to check for
> +        * any bytelane errors.
> +        */
> +       return (hte_check_errors() >> 8) & 0xFF;
> +}
> +
> +/**
> + * This function uses HW HTE engine to initialize or test all memory attached
> + * to a given DUNIT. If flag is MRC_MEM_INIT, this routine writes 0s to all
> + * memory locations to initialize ECC. If flag is MRC_MEM_TEST, this routine
> + * will send an 5AA55AA5 pattern to all memory locations on the RankMask and
> + * then read it back. Then it sends an A55AA55A pattern to all memory locations
> + * on the RankMask and reads it back.
> + *
> + * @mrc_params: host struture for all MRC global data
> + * @flag: MRC_MEM_INIT or MRC_MEM_TEST
> + *
> + * @return: errors register showing HTE failures. Also prints out which rank
> + *          failed the HTE test if failure occurs. For rank detection to work,
> + *          the address map must be left in its default state. If MRC changes
> + *          the address map, this function must be modified to change it back
> + *          to default at the beginning, then restore it at the end.
> + */
> +u32 hte_mem_init(struct mrc_params *mrc_params, u8 flag)
> +{
> +       u32 offset;
> +       int test_num;
> +       int i;
> +
> +       /*
> +        * Clear out the error registers at the start of each memory
> +        * init or memory test run.
> +        */
> +       hte_clear_error_regs();
> +
> +       msg_port_write(HTE, 0x00020062, 0x00000015);
> +
> +       for (offset = 0x80; offset <= 0x8F; offset++)
> +               msg_port_write(HTE, offset, ((offset & 1) ? 0xA55A : 0x5AA5));
> +
> +       msg_port_write(HTE, 0x00020021, 0x00000000);
> +       msg_port_write(HTE, 0x00020022, (mrc_params->mem_size >> 6) - 1);
> +       msg_port_write(HTE, 0x00020063, 0xAAAAAAAA);
> +       msg_port_write(HTE, 0x00020064, 0xCCCCCCCC);
> +       msg_port_write(HTE, 0x00020065, 0xF0F0F0F0);
> +       msg_port_write(HTE, 0x00020066, 0x03000000);
> +
> +       switch (flag) {
> +       case MRC_MEM_INIT:
> +               /*
> +                * Only 1 write pass through memory is needed
> +                * to initialize ECC
> +                */
> +               test_num = 1;
> +               break;
> +       case MRC_MEM_TEST:
> +               /* Write/read then write/read with inverted pattern */
> +               test_num = 4;
> +               break;
> +       default:
> +               DPF(D_INFO, "Unknown parameter for flag: %d\n", flag);
> +               return 0xFFFFFFFF;
> +       }
> +
> +       DPF(D_INFO, "hte_mem_init");

debug()

> +
> +       for (i = 0; i < test_num; i++) {
> +               DPF(D_INFO, ".");
> +
> +               if (i == 0) {
> +                       msg_port_write(HTE, 0x00020061, 0x00000000);
> +                       msg_port_write(HTE, 0x00020020, 0x00110010);
> +               } else if (i == 1) {
> +                       msg_port_write(HTE, 0x00020061, 0x00000000);
> +                       msg_port_write(HTE, 0x00020020, 0x00010010);
> +               } else if (i == 2) {
> +                       msg_port_write(HTE, 0x00020061, 0x00010100);
> +                       msg_port_write(HTE, 0x00020020, 0x00110010);
> +               } else {
> +                       msg_port_write(HTE, 0x00020061, 0x00010100);
> +                       msg_port_write(HTE, 0x00020020, 0x00010010);
> +               }
> +
> +               msg_port_write(HTE, 0x00020011, 0x00111000);
> +               msg_port_write(HTE, 0x00020011, 0x00111100);
> +
> +               hte_wait_for_complete();
> +
> +               /* If this is a READ pass, check for errors at the end */
> +               if ((i % 2) == 1) {
> +                       /* Return immediately if error */
> +                       if (hte_check_errors())
> +                               break;
> +               }
> +       }
> +
> +       DPF(D_INFO, "done\n");
> +
> +       return hte_check_errors();
> +}
> +
> +/**
> + * This function executes basic single cache line memory write/read/verify

'executes a basic'

> + * test using simple constant pattern, different for READ_RAIN and
> + * WRITE_TRAIN modes.
> + *
> + * @mrc_params: host struture for all MRC global data

structure, please fix globally

> + * @addr: memory adress being tested (must hit specific channel/rank)
> + * @first_run: if set then hte registers are configured, otherwise it is
> + *             assumed configuration is done and just re-run the test
> + * @mode: READ_TRAIN or WRITE_TRAIN (the difference is in the pattern)
> + *
> + * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
> + */
> +u16 hte_basic_write_read(struct mrc_params *mrc_params, u32 addr,
> +                        u8 first_run, u8 mode)

Why do we use u8 for these? Would uint be good enough? Just a suggestion.

> +{
> +       u16 errors;
> +
> +       ENTERFN();
> +
> +       /* Enable all error reporting in preparation for HTE test */
> +       hte_enable_all_errors();
> +       hte_clear_error_regs();
> +
> +       errors = hte_basic_data_cmp(mrc_params, addr, first_run, mode);
> +
> +       LEAVEFN();
> +
> +       return errors;
> +}
> +
> +/**
> + * This function examines single cache line memory with write/read/verify

examines a single-cache-line memory

(at least I think this is what it is saying)

> + * test using multiple data patterns (victim-aggressor algorithm).
> + *
> + * @mrc_params: host struture for all MRC global data
> + * @addr: memory adress being tested (must hit specific channel/rank)
> + * @first_run: if set then hte registers are configured, otherwise it is
> + *             assumed configuration is done and just re-run the test
> + *
> + * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
> + */
> +u16 hte_write_stress_bit_lanes(struct mrc_params *mrc_params,
> +                              u32 addr, u8 first_run)
> +{
> +       u16 errors;
> +       u8 victim_bit = 0;
> +
> +       ENTERFN();
> +
> +       /* Enable all error reporting in preparation for HTE test */
> +       hte_enable_all_errors();
> +       hte_clear_error_regs();
> +
> +       /*
> +        * Loop through each bit in the bytelane.
> +        *
> +        * Each pass creates a victim bit while keeping all other bits the same
> +        * as aggressors. AVN HTE adds an auto-rotate feature which allows us
> +        * to program the entire victim/aggressor sequence in 1 step.

What is AVN?

> +        *
> +        * The victim bit rotates on each pass so no need to have software
> +        * implement a victim bit loop like on VLV.

VLV? I think it is sometimes better to write these out and put the
abbreviation in brackets after it, at least once in the file.

> +        */
> +       errors = hte_rw_data_cmp(mrc_params, addr, HTE_LOOP_CNT,
> +                                HTE_LFSR_VICTIM_SEED, HTE_LFSR_AGRESSOR_SEED,
> +                                victim_bit, first_run);
> +
> +       LEAVEFN();
> +
> +       return errors;
> +}
> +
> +/**
> + * This function execute basic single cache line memory write or read.

as above

> + * This is just for receive enable / fine write levelling purpose.

write-levelling (I think that's what you mean)

> + *
> + * @addr: memory adress being tested (must hit specific channel/rank)
> + * @first_run: if set then hte registers are configured, otherwise it is
> + *             assumed configuration is done and just re-run the test
> + * @is_write: when non-zero memory write operation executed, otherwise read
> + */
> +void hte_mem_op(u32 addr, u8 first_run, u8 is_write)
> +{
> +       u32 offset;
> +       u32 tmp;
> +
> +       hte_enable_all_errors();
> +       hte_clear_error_regs();
> +
> +       if (first_run) {
> +               tmp = is_write ? 0x01110021 : 0x01010021;
> +               msg_port_write(HTE, 0x00020020, tmp);
> +
> +               msg_port_write(HTE, 0x00020021, 0x06000000);
> +               msg_port_write(HTE, 0x00020022, addr >> 6);
> +               msg_port_write(HTE, 0x00020062, 0x00800015);
> +               msg_port_write(HTE, 0x00020063, 0xAAAAAAAA);
> +               msg_port_write(HTE, 0x00020064, 0xCCCCCCCC);
> +               msg_port_write(HTE, 0x00020065, 0xF0F0F0F0);
> +               msg_port_write(HTE, 0x00020061, 0x00030008);
> +
> +               for (offset = 0x80; offset <= 0x8F; offset++)
> +                       msg_port_write(HTE, offset, 0xC33C0000);
> +       }
> +
> +       msg_port_write(HTE, 0x000200A1, 0xFFFF1000);
> +       msg_port_write(HTE, 0x00020011, 0x00011000);
> +       msg_port_write(HTE, 0x00020011, 0x00011100);
> +
> +       hte_wait_for_complete();
> +}
> diff --git a/arch/x86/cpu/quark/hte.h b/arch/x86/cpu/quark/hte.h
> new file mode 100644
> index 0000000..3a173ea
> --- /dev/null
> +++ b/arch/x86/cpu/quark/hte.h
> @@ -0,0 +1,44 @@
> +/*
> + * Copyright (C) 2013, Intel Corporation
> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
> + *
> + * Ported from Intel released Quark UEFI BIOS
> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
> + *
> + * SPDX-License-Identifier:    Intel
> + */
> +
> +#ifndef _HTE_H_
> +#define _HTE_H_
> +
> +enum {
> +       MRC_MEM_INIT,
> +       MRC_MEM_TEST
> +};
> +
> +enum {
> +       READ_TRAIN,
> +       WRITE_TRAIN
> +};
> +
> +/*
> + * EXP_LOOP_CNT field of HTE_CMD_CTL
> + *
> + * This CANNOT be less than 4!
> + */
> +#define HTE_LOOP_CNT           5
> +
> +/* random seed for victim */
> +#define HTE_LFSR_VICTIM_SEED   0xF294BA21
> +
> +/* random seed for aggressor */
> +#define HTE_LFSR_AGRESSOR_SEED 0xEBA7492D
> +
> +u32 hte_mem_init(struct mrc_params *mrc_params, u8 flag);
> +u16 hte_basic_write_read(struct mrc_params *mrc_params, u32 addr,
> +                        u8 first_run, u8 mode);
> +u16 hte_write_stress_bit_lanes(struct mrc_params *mrc_params,
> +                              u32 addr, u8 first_run);
> +void hte_mem_op(u32 addr, u8 first_run, u8 is_write);

Can you move the comments from the .c to the .h for these exported functions?


> +
> +#endif /* _HTE_H_ */
> diff --git a/arch/x86/cpu/quark/mrc_util.c b/arch/x86/cpu/quark/mrc_util.c
> new file mode 100644
> index 0000000..1ae42d6
> --- /dev/null
> +++ b/arch/x86/cpu/quark/mrc_util.c
> @@ -0,0 +1,1499 @@
> +/*
> + * Copyright (C) 2013, Intel Corporation
> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
> + *
> + * Ported from Intel released Quark UEFI BIOS
> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
> + *
> + * SPDX-License-Identifier:    Intel
> + */
> +
> +#include <common.h>
> +#include <asm/arch/device.h>
> +#include <asm/arch/mrc.h>
> +#include <asm/arch/msg_port.h>
> +#include "mrc_util.h"
> +#include "hte.h"
> +#include "smc.h"
> +
> +static const uint8_t vref_codes[64] = {
> +       /* lowest to highest */
> +       0x3F, 0x3E, 0x3D, 0x3C, 0x3B, 0x3A, 0x39, 0x38,
> +       0x37, 0x36, 0x35, 0x34, 0x33, 0x32, 0x31, 0x30,
> +       0x2F, 0x2E, 0x2D, 0x2C, 0x2B, 0x2A, 0x29, 0x28,
> +       0x27, 0x26, 0x25, 0x24, 0x23, 0x22, 0x21, 0x20,
> +       0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
> +       0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
> +       0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
> +       0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F
> +};
> +
> +void mrc_write_mask(u32 unit, u32 addr, u32 data, u32 mask)
> +{
> +       msg_port_write(unit, addr,
> +                      (msg_port_read(unit, addr) & ~(mask)) |
> +                      ((data) & (mask)));
> +}
> +
> +void mrc_alt_write_mask(u32 unit, u32 addr, u32 data, u32 mask)
> +{
> +       msg_port_alt_write(unit, addr,
> +                          (msg_port_alt_read(unit, addr) & ~(mask)) |
> +                          ((data) & (mask)));
> +}
> +
> +void mrc_post_code(uint8_t major, uint8_t minor)
> +{
> +       /* send message to UART */
> +       DPF(D_INFO, "POST: 0x%01x%02x\n", major, minor);
> +
> +       /* error check */
> +       if (major == 0xEE)
> +               hang();
> +}
> +
> +/* Delay number of nanoseconds */
> +void delay_n(uint32_t ns)
> +{
> +       /* 1000 MHz clock has 1ns period --> no conversion required */
> +       uint64_t final_tsc = rdtsc();

Add a blank line here, after the declarations end.

> +       final_tsc += ((get_tbclk_mhz() * ns) / 1000);
> +
> +       while (rdtsc() < final_tsc)
> +               ;
> +}
> +
> +/* Delay number of microseconds */
> +void delay_u(uint32_t ms)
> +{
> +       /* 64-bit math is not an option, just use loops */
> +       while (ms--)
> +               delay_n(1000);
> +}

Some day I suspect these could be pulled out into general x86
functions. Let's see if anything else needs them first.
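
If these do get pulled out, the nanosecond-to-tick conversion could live in its own helper. A minimal sketch, assuming the TSC frequency is passed in as MHz; the name ns_to_tsc_ticks() and its parameters are hypothetical, not existing U-Boot API. It splits the delay into whole microseconds plus a remainder, so only a 32x32 multiply needs 64 bits and no 64-bit division is required (this assumes tsc_mhz stays below roughly 4 million so that the remainder term fits in 32 bits):

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a delay in nanoseconds to TSC ticks.
 * tsc_mhz is ticks per microsecond, so ticks = tsc_mhz * ns / 1000.
 * Writing ns = 1000 * q + r gives ticks = tsc_mhz * q + tsc_mhz * r / 1000
 * exactly, and the only division left is a 32-bit one.
 */
static uint64_t ns_to_tsc_ticks(uint32_t tsc_mhz, uint32_t ns)
{
	uint32_t whole_us = ns / 1000;	/* q: full microseconds */
	uint32_t rem_ns = ns % 1000;	/* r: leftover nanoseconds */

	return (uint64_t)tsc_mhz * whole_us + (tsc_mhz * rem_ns) / 1000;
}
```

The caller would still add the result to rdtsc() and spin as delay_n() does now.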

> +
> +/* Select Memory Manager as the source for PRI interface */
> +void select_mem_mgr(void)
> +{
> +       u32 dco;
> +
> +       ENTERFN();
> +
> +       dco = msg_port_read(MEM_CTLR, DCO);
> +       dco &= ~BIT28;

~(1 << 28)

Ah but I see you are using this everywhere.

U-Boot tries to avoid defining generic BITn macros like this. See some
comments below about this.
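
As a rough sketch of what I mean, a define named after what the bit does reads better than BIT28. The name DCO_PRI_OWNER_HTE is hypothetical, made up from the comments on select_mem_mgr() and select_hte(); the helpers only show the set and clear operations on a plain value:

```c
#include <stdint.h>

/* Hypothetical name: DCO bit 28 selects HTE as the PRI interface source */
#define DCO_PRI_OWNER_HTE	(1u << 28)

/* Clear the bit: the memory manager drives the PRI interface */
static uint32_t dco_select_mem_mgr(uint32_t dco)
{
	return dco & ~DCO_PRI_OWNER_HTE;
}

/* Set the bit: HTE drives the PRI interface */
static uint32_t dco_select_hte(uint32_t dco)
{
	return dco | DCO_PRI_OWNER_HTE;
}
```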

> +       msg_port_write(MEM_CTLR, DCO, dco);
> +
> +       LEAVEFN();
> +}
> +
> +/* Select HTE as the source for PRI interface */
> +void select_hte(void)
> +{
> +       u32 dco;
> +
> +       ENTERFN();
> +
> +       dco = msg_port_read(MEM_CTLR, DCO);
> +       dco |= BIT28;
> +       msg_port_write(MEM_CTLR, DCO, dco);
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * Send DRAM command
> + * data should be formatted using DCMD_Xxxx macro or emrsXCommand structure
> + */
> +void dram_init_command(uint32_t data)
> +{
> +       pci_write_config_dword(QUARK_HOST_BRIDGE, MSG_DATA_REG, data);
> +       pci_write_config_dword(QUARK_HOST_BRIDGE, MSG_CTRL_EXT_REG, 0);
> +       msg_port_setup(MSG_OP_DRAM_INIT, MEM_CTLR, 0);
> +
> +       DPF(D_REGWR, "WR32 %03X %08X %08X\n", MEM_CTLR, 0, data);
> +}
> +
> +/* Send DRAM wake command using special MCU side-band WAKE opcode */
> +void dram_wake_command(void)
> +{
> +       ENTERFN();
> +
> +       msg_port_setup(MSG_OP_DRAM_WAKE, MEM_CTLR, 0);
> +
> +       LEAVEFN();
> +}
> +
> +void training_message(uint8_t channel, uint8_t rank, uint8_t byte_lane)
> +{
> +       /* send message to UART */
> +       DPF(D_INFO, "CH%01X RK%01X BL%01X\n", channel, rank, byte_lane);
> +}
> +
> +/*
> + * This function will program the RCVEN delays
> + *
> + * (currently doesn't comprehend rank)
> + */
> +void set_rcvn(uint8_t channel, uint8_t rank,
> +             uint8_t byte_lane, uint32_t pi_count)

Reformat to 80 columns. Should this, or any other function in this file, be static?

> +{
> +       uint32_t reg;
> +       uint32_t msk;
> +       uint32_t temp;
> +
> +       ENTERFN();
> +
> +       DPF(D_TRN, "Rcvn ch%d rnk%d ln%d : pi=%03X\n",
> +           channel, rank, byte_lane, pi_count);
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * BL0 -> B01PTRCTL0[11:08] (0x0-0xF)
> +        * BL1 -> B01PTRCTL0[23:20] (0x0-0xF)
> +        */
> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET);
> +       msk = (byte_lane & BIT0) ? (BIT23 | BIT22 | BIT21 | BIT20) :
> +               (BIT11 | BIT10 | BIT9 | BIT8);

Would this be better as:

(byte_lane & BIT0) ? (0xf << 20) : (0xf << 8)

It might be more meaningful also.

I really don't think these long strings of | are nice.
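
Going a step further, the shift could be computed once and used for both the mask and the value. A minimal sketch, using the bit positions from the RDPTR comment above; RDPTR_FIELD and the two function names are hypothetical, and the register access itself is left out:

```c
#include <stdint.h>

/* Width mask of the 4-bit RDPTR field; the name is hypothetical */
#define RDPTR_FIELD	0xfu

/* Odd byte lanes use bits [23:20], even byte lanes use bits [11:8] */
static unsigned int rcvn_rdptr_shift(uint8_t byte_lane)
{
	return (byte_lane & 1) ? 20 : 8;
}

/* Return 'old' with the lane's RDPTR field replaced by 'val' */
static uint32_t rcvn_rdptr_update(uint32_t old, uint8_t byte_lane,
				  uint32_t val)
{
	unsigned int shift = rcvn_rdptr_shift(byte_lane);
	uint32_t mask = RDPTR_FIELD << shift;

	return (old & ~mask) | ((val << shift) & mask);
}
```

The same shift could then be reused in get_rcvn(), so the two functions cannot drift apart.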

> +       temp = (byte_lane & BIT0) ? ((pi_count / HALF_CLK) << 20) :
> +               ((pi_count / HALF_CLK) << 8);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* Adjust PI_COUNT */
> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * BL0 -> B0DLLPICODER0[29:24] (0x00-0x3F)
> +        * BL1 -> B1DLLPICODER0[29:24] (0x00-0x3F)

lower case hex again

> +        */
> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET));
> +       msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24);
> +       temp = pi_count << 24;
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /*
> +        * DEADBAND
> +        * BL0/1 -> B01DBCTL1[08/11] (+1 select)
> +        * BL0/1 -> B01DBCTL1[02/05] (enable)
> +        */
> +       reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET);
> +       msk = 0x00;
> +       temp = 0x00;
> +
> +       /* enable */
> +       msk |= (byte_lane & BIT0) ? (BIT5) : (BIT2);

Remove () around BIT5

> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
> +               temp |= msk;
> +
> +       /* select */
> +       msk |= (byte_lane & BIT0) ? (BIT11) : (BIT8);
> +       if (pi_count < EARLY_DB)
> +               temp |= msk;

These uses of BIT seem more useful to me.

Still it would be better to have #defines for the bits which actually
describe their meaning.

Maybe you don't know the meaning though...
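
If the meaning is known, something along these lines might work. This is only a sketch: the macro and function names are hypothetical, the bit positions are taken from the DEADBAND comment in set_rcvn(), and 'early' and 'late' stand in for EARLY_DB and LATE_DB, whose values are not shown here:

```c
#include <stdint.h>

/* Hypothetical names; bit positions are from the comment in set_rcvn() */
#define RCVN_DB_EN(bl)		(1u << (((bl) & 1) ? 5 : 2))
#define RCVN_DB_SEL(bl)		(1u << (((bl) & 1) ? 11 : 8))

/*
 * Compute the deadband bits for one byte lane: enable is set when the
 * PI count falls outside [early, late], and the +1 select is set as
 * well when it falls below 'early'.
 */
static uint32_t rcvn_deadband_bits(uint8_t bl, uint32_t pi_count,
				   uint32_t early, uint32_t late)
{
	uint32_t val = 0;

	if (pi_count < early || pi_count > late)
		val |= RCVN_DB_EN(bl);
	if (pi_count < early)
		val |= RCVN_DB_SEL(bl);

	return val;
}
```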

> +
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* error check */
> +       if (pi_count > 0x3F) {
> +               training_message(channel, rank, byte_lane);
> +               mrc_post_code(0xEE, 0xE0);
> +       }
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will return the current RCVEN delay on the given
> + * channel, rank, byte_lane as an absolute PI count.
> + *
> + * (currently doesn't comprehend rank)
> + */
> +uint32_t get_rcvn(uint8_t channel, uint8_t rank, uint8_t byte_lane)
> +{
> +       uint32_t reg;
> +       uint32_t temp;
> +       uint32_t pi_count;
> +
> +       ENTERFN();
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * BL0 -> B01PTRCTL0[11:08] (0x0-0xF)
> +        * BL1 -> B01PTRCTL0[23:20] (0x0-0xF)
> +        */
> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET);
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= (byte_lane & BIT0) ? (20) : (8);
> +       temp &= 0xF;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count = temp * HALF_CLK;
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * BL0 -> B0DLLPICODER0[29:24] (0x00-0x3F)
> +        * BL1 -> B1DLLPICODER0[29:24] (0x00-0x3F)
> +        */
> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);

Please avoid () around simple constants. Put them in the #define/enum if needed.

> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET));
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= 24;
> +       temp &= 0x3F;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count += temp;
> +
> +       LEAVEFN();
> +
> +       return pi_count;
> +}
> +
> +/*
> + * This function will program the RDQS delays based on an absolute
> + * amount of PIs.
> + *
> + * (currently doesn't comprehend rank)
> + */
> +void set_rdqs(uint8_t channel, uint8_t rank,
> +             uint8_t byte_lane, uint32_t pi_count)
> +{
> +       uint32_t reg;
> +       uint32_t msk;
> +       uint32_t temp;
> +
> +       ENTERFN();
> +       DPF(D_TRN, "Rdqs ch%d rnk%d ln%d : pi=%03X\n",
> +           channel, rank, byte_lane, pi_count);
> +
> +       /*
> +        * PI (1/128 MCLK)
> +        * BL0 -> B0RXDQSPICODE[06:00] (0x00-0x47)
> +        * BL1 -> B1RXDQSPICODE[06:00] (0x00-0x47)
> +        */
> +       reg = (byte_lane & BIT0) ? (B1RXDQSPICODE) : (B0RXDQSPICODE);
> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET));
> +       msk = (BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0);
> +       temp = pi_count << 0;
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* error check (shouldn't go above 0x3F) */
> +       if (pi_count > 0x47) {
> +               training_message(channel, rank, byte_lane);
> +               mrc_post_code(0xEE, 0xE1);
> +       }
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will return the current RDQS delay on the given
> + * channel, rank, byte_lane as an absolute PI count.
> + *
> + * (currently doesn't comprehend rank)
> + */
> +uint32_t get_rdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane)
> +{
> +       uint32_t reg;
> +       uint32_t temp;
> +       uint32_t pi_count;
> +
> +       ENTERFN();
> +
> +       /*
> +        * PI (1/128 MCLK)
> +        * BL0 -> B0RXDQSPICODE[06:00] (0x00-0x47)
> +        * BL1 -> B1RXDQSPICODE[06:00] (0x00-0x47)
> +        */
> +       reg = (byte_lane & BIT0) ? (B1RXDQSPICODE) : (B0RXDQSPICODE);
> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET));
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +
> +       /* Adjust PI_COUNT */
> +       pi_count = temp & 0x7F;
> +
> +       LEAVEFN();
> +
> +       return pi_count;
> +}
> +
> +/*
> + * This function will program the WDQS delays based on an absolute
> + * amount of PIs.
> + *
> + * (currently doesn't comprehend rank)
> + */
> +void set_wdqs(uint8_t channel, uint8_t rank,
> +             uint8_t byte_lane, uint32_t pi_count)
> +{
> +       uint32_t reg;
> +       uint32_t msk;
> +       uint32_t temp;
> +
> +       ENTERFN();
> +
> +       DPF(D_TRN, "Wdqs ch%d rnk%d ln%d : pi=%03X\n",
> +           channel, rank, byte_lane, pi_count);
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * BL0 -> B01PTRCTL0[07:04] (0x0-0xF)
> +        * BL1 -> B01PTRCTL0[19:16] (0x0-0xF)
> +        */
> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET);
> +       msk = (byte_lane & BIT0) ? (BIT19 | BIT18 | BIT17 | BIT16) :
> +               (BIT7 | BIT6 | BIT5 | BIT4);
> +       temp = pi_count / HALF_CLK;
> +       temp <<= (byte_lane & BIT0) ? (16) : (4);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* Adjust PI_COUNT */
> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * BL0 -> B0DLLPICODER0[21:16] (0x00-0x3F)
> +        * BL1 -> B1DLLPICODER0[21:16] (0x00-0x3F)
> +        */
> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET));
> +       msk = (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16);
> +       temp = pi_count << 16;
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /*
> +        * DEADBAND
> +        * BL0/1 -> B01DBCTL1[07/10] (+1 select)
> +        * BL0/1 -> B01DBCTL1[01/04] (enable)
> +        */
> +       reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET);
> +       msk = 0x00;
> +       temp = 0x00;
> +
> +       /* enable */
> +       msk |= (byte_lane & BIT0) ? (BIT4) : (BIT1);
> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
> +               temp |= msk;
> +
> +       /* select */
> +       msk |= (byte_lane & BIT0) ? (BIT10) : (BIT7);
> +       if (pi_count < EARLY_DB)
> +               temp |= msk;
> +
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* error check */
> +       if (pi_count > 0x3F) {
> +               training_message(channel, rank, byte_lane);
> +               mrc_post_code(0xEE, 0xE2);
> +       }
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will return the amount of WDQS delay on the given
> + * channel, rank, byte_lane as an absolute PI count.
> + *
> + * (currently doesn't comprehend rank)
> + */
> +uint32_t get_wdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane)
> +{
> +       uint32_t reg;
> +       uint32_t temp;
> +       uint32_t pi_count;
> +
> +       ENTERFN();
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * BL0 -> B01PTRCTL0[07:04] (0x0-0xF)
> +        * BL1 -> B01PTRCTL0[19:16] (0x0-0xF)
> +        */
> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET);
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= (byte_lane & BIT0) ? (16) : (4);
> +       temp &= 0xF;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count = (temp * HALF_CLK);
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * BL0 -> B0DLLPICODER0[21:16] (0x00-0x3F)
> +        * BL1 -> B1DLLPICODER0[21:16] (0x00-0x3F)
> +        */
> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET));
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= 16;
> +       temp &= 0x3F;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count += temp;
> +
> +       LEAVEFN();
> +
> +       return pi_count;
> +}
> +
> +/*
> + * This function will program the WDQ delays based on an absolute
> + * number of PIs.
> + *
> + * (currently doesn't comprehend rank)
> + */
> +void set_wdq(uint8_t channel, uint8_t rank,
> +            uint8_t byte_lane, uint32_t pi_count)
> +{
> +       uint32_t reg;
> +       uint32_t msk;
> +       uint32_t temp;
> +
> +       ENTERFN();
> +
> +       DPF(D_TRN, "Wdq ch%d rnk%d ln%d : pi=%03X\n",
> +           channel, rank, byte_lane, pi_count);
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * BL0 -> B01PTRCTL0[03:00] (0x0-0xF)
> +        * BL1 -> B01PTRCTL0[15:12] (0x0-0xF)
> +        */
> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET);
> +       msk = (byte_lane & BIT0) ? (BIT15 | BIT14 | BIT13 | BIT12) :
> +               (BIT3 | BIT2 | BIT1 | BIT0);
> +       temp = pi_count / HALF_CLK;
> +       temp <<= (byte_lane & BIT0) ? (12) : (0);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* Adjust PI_COUNT */
> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * BL0 -> B0DLLPICODER0[13:08] (0x00-0x3F)
> +        * BL1 -> B1DLLPICODER0[13:08] (0x00-0x3F)
> +        */
> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET));
> +       msk = (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8);
> +       temp = pi_count << 8;
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /*
> +        * DEADBAND
> +        * BL0/1 -> B01DBCTL1[06/09] (+1 select)
> +        * BL0/1 -> B01DBCTL1[00/03] (enable)
> +        */
> +       reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET);
> +       msk = 0x00;
> +       temp = 0x00;
> +
> +       /* enable */
> +       msk |= (byte_lane & BIT0) ? (BIT3) : (BIT0);
> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
> +               temp |= msk;
> +
> +       /* select */
> +       msk |= (byte_lane & BIT0) ? (BIT9) : (BIT6);
> +       if (pi_count < EARLY_DB)
> +               temp |= msk;
> +
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* error check */
> +       if (pi_count > 0x3F) {
> +               training_message(channel, rank, byte_lane);
> +               mrc_post_code(0xEE, 0xE3);
> +       }
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will return the amount of WDQ delay on the given
> + * channel, rank, byte_lane as an absolute PI count.
> + *
> + * (currently doesn't comprehend rank)
> + */
> +uint32_t get_wdq(uint8_t channel, uint8_t rank, uint8_t byte_lane)
> +{
> +       uint32_t reg;
> +       uint32_t temp;
> +       uint32_t pi_count;
> +
> +       ENTERFN();
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * BL0 -> B01PTRCTL0[03:00] (0x0-0xF)
> +        * BL1 -> B01PTRCTL0[15:12] (0x0-0xF)
> +        */
> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET);
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= (byte_lane & BIT0) ? (12) : (0);
> +       temp &= 0xF;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count = (temp * HALF_CLK);
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * BL0 -> B0DLLPICODER0[13:08] (0x00-0x3F)
> +        * BL1 -> B1DLLPICODER0[13:08] (0x00-0x3F)
> +        */
> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
> +               (channel * DDRIODQ_CH_OFFSET));
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= 8;
> +       temp &= 0x3F;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count += temp;
> +
> +       LEAVEFN();
> +
> +       return pi_count;
> +}
> +
> +/*
> + * This function will program the WCMD delays based on an absolute
> + * number of PIs.
> + */
> +void set_wcmd(uint8_t channel, uint32_t pi_count)
> +{
> +       uint32_t reg;
> +       uint32_t msk;
> +       uint32_t temp;
> +
> +       ENTERFN();
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * CMDPTRREG[11:08] (0x0-0xF)
> +        */
> +       reg = CMDPTRREG + (channel * DDRIOCCC_CH_OFFSET);
> +       msk = (BIT11 | BIT10 | BIT9 | BIT8);
> +       temp = pi_count / HALF_CLK;
> +       temp <<= 8;
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* Adjust PI_COUNT */
> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * CMDDLLPICODER0[29:24] -> CMDSLICE R3 (unused)
> +        * CMDDLLPICODER0[21:16] -> CMDSLICE L3 (unused)
> +        * CMDDLLPICODER0[13:08] -> CMDSLICE R2 (unused)
> +        * CMDDLLPICODER0[05:00] -> CMDSLICE L2 (unused)
> +        * CMDDLLPICODER1[29:24] -> CMDSLICE R1 (unused)
> +        * CMDDLLPICODER1[21:16] -> CMDSLICE L1 (0x00-0x3F)
> +        * CMDDLLPICODER1[13:08] -> CMDSLICE R0 (unused)
> +        * CMDDLLPICODER1[05:00] -> CMDSLICE L0 (unused)
> +        */
> +       reg = CMDDLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET);
> +
> +       msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24 |
> +               BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
> +               BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
> +               BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0);
> +
> +       temp = (pi_count << 24) | (pi_count << 16) |
> +               (pi_count << 8) | (pi_count << 0);
> +
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +       reg = CMDDLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET);  /* PO */
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /*
> +        * DEADBAND
> +        * CMDCFGREG0[17] (+1 select)
> +        * CMDCFGREG0[16] (enable)
> +        */
> +       reg = CMDCFGREG0 + (channel * DDRIOCCC_CH_OFFSET);
> +       msk = 0x00;
> +       temp = 0x00;
> +
> +       /* enable */
> +       msk |= BIT16;
> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
> +               temp |= msk;
> +
> +       /* select */
> +       msk |= BIT17;
> +       if (pi_count < EARLY_DB)
> +               temp |= msk;
> +
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* error check */
> +       if (pi_count > 0x3F)
> +               mrc_post_code(0xEE, 0xE4);
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will return the amount of WCMD delay on the given
> + * channel as an absolute PI count.
> + */
> +uint32_t get_wcmd(uint8_t channel)
> +{
> +       uint32_t reg;
> +       uint32_t temp;
> +       uint32_t pi_count;
> +
> +       ENTERFN();
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * CMDPTRREG[11:08] (0x0-0xF)
> +        */
> +       reg = CMDPTRREG + (channel * DDRIOCCC_CH_OFFSET);
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= 8;
> +       temp &= 0xF;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count = temp * HALF_CLK;
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * CMDDLLPICODER0[29:24] -> CMDSLICE R3 (unused)
> +        * CMDDLLPICODER0[21:16] -> CMDSLICE L3 (unused)
> +        * CMDDLLPICODER0[13:08] -> CMDSLICE R2 (unused)
> +        * CMDDLLPICODER0[05:00] -> CMDSLICE L2 (unused)
> +        * CMDDLLPICODER1[29:24] -> CMDSLICE R1 (unused)
> +        * CMDDLLPICODER1[21:16] -> CMDSLICE L1 (0x00-0x3F)
> +        * CMDDLLPICODER1[13:08] -> CMDSLICE R0 (unused)
> +        * CMDDLLPICODER1[05:00] -> CMDSLICE L0 (unused)
> +        */
> +       reg = CMDDLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET);
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= 16;
> +       temp &= 0x3F;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count += temp;
> +
> +       LEAVEFN();
> +
> +       return pi_count;
> +}
> +
> +/*
> + * This function will program the WCLK delays based on an absolute
> + * number of PIs.
> + */
> +void set_wclk(uint8_t channel, uint8_t rank, uint32_t pi_count)
> +{
> +       uint32_t reg;
> +       uint32_t msk;
> +       uint32_t temp;
> +
> +       ENTERFN();
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * CCPTRREG[15:12] -> CLK1 (0x0-0xF)
> +        * CCPTRREG[11:08] -> CLK0 (0x0-0xF)
> +        */
> +       reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
> +       msk = (BIT15 | BIT14 | BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8);

msk = 0xff00 is much better, isn't it?

> +       temp = ((pi_count / HALF_CLK) << 12) | ((pi_count / HALF_CLK) << 8);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* Adjust PI_COUNT */
> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * ECCB1DLLPICODER0[13:08] -> CLK0 (0x00-0x3F)
> +        * ECCB1DLLPICODER0[21:16] -> CLK1 (0x00-0x3F)
> +        */
> +       reg = (rank) ? (ECCB1DLLPICODER0) : (ECCB1DLLPICODER0);
> +       reg += (channel * DDRIOCCC_CH_OFFSET);
> +       msk = (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
> +               BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8);

Ick!

> +       temp = (pi_count << 16) | (pi_count << 8);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +       reg = (rank) ? (ECCB1DLLPICODER1) : (ECCB1DLLPICODER1);

Remove all (), and below. Please fix globally.

> +       reg += (channel * DDRIOCCC_CH_OFFSET);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +       reg = (rank) ? (ECCB1DLLPICODER2) : (ECCB1DLLPICODER2);
> +       reg += (channel * DDRIOCCC_CH_OFFSET);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +       reg = (rank) ? (ECCB1DLLPICODER3) : (ECCB1DLLPICODER3);
> +       reg += (channel * DDRIOCCC_CH_OFFSET);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /*
> +        * DEADBAND
> +        * CCCFGREG1[11:08] (+1 select)
> +        * CCCFGREG1[03:00] (enable)
> +        */
> +       reg = CCCFGREG1 + (channel * DDRIOCCC_CH_OFFSET);
> +       msk = 0x00;
> +       temp = 0x00;
> +
> +       /* enable */
> +       msk |= (BIT3 | BIT2 | BIT1 | BIT0);
> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
> +               temp |= msk;
> +
> +       /* select */
> +       msk |= (BIT11 | BIT10 | BIT9 | BIT8);
> +       if (pi_count < EARLY_DB)
> +               temp |= msk;
> +
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* error check */
> +       if (pi_count > 0x3F)
> +               mrc_post_code(0xEE, 0xE5);
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will return the amount of WCLK delay on the given
> + * channel, rank as an absolute PI count.
> + */
> +uint32_t get_wclk(uint8_t channel, uint8_t rank)
> +{
> +       uint32_t reg;
> +       uint32_t temp;
> +       uint32_t pi_count;
> +
> +       ENTERFN();
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * CCPTRREG[15:12] -> CLK1 (0x0-0xF)
> +        * CCPTRREG[11:08] -> CLK0 (0x0-0xF)
> +        */
> +       reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= (rank) ? (12) : (8);
> +       temp &= 0xF;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count = temp * HALF_CLK;
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * ECCB1DLLPICODER0[13:08] -> CLK0 (0x00-0x3F)
> +        * ECCB1DLLPICODER0[21:16] -> CLK1 (0x00-0x3F)
> +        */
> +       reg = (rank) ? (ECCB1DLLPICODER0) : (ECCB1DLLPICODER0);
> +       reg += (channel * DDRIOCCC_CH_OFFSET);
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= (rank) ? (16) : (8);
> +       temp &= 0x3F;
> +
> +       pi_count += temp;
> +
> +       LEAVEFN();
> +
> +       return pi_count;
> +}
> +
> +/*
> + * This function will program the WCTL delays based on an absolute
> + * number of PIs.
> + *
> + * (currently doesn't comprehend rank)
> + */
> +void set_wctl(uint8_t channel, uint8_t rank, uint32_t pi_count)
> +{
> +       uint32_t reg;
> +       uint32_t msk;
> +       uint32_t temp;
> +
> +       ENTERFN();
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * CCPTRREG[31:28] (0x0-0xF)
> +        * CCPTRREG[27:24] (0x0-0xF)
> +        */
> +       reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
> +       msk = (BIT31 | BIT30 | BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24);
> +       temp = ((pi_count / HALF_CLK) << 28) | ((pi_count / HALF_CLK) << 24);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* Adjust PI_COUNT */
> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
> +        * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
> +        */
> +       reg = ECCB1DLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET);
> +       msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24);
> +       temp = (pi_count << 24);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +       reg = ECCB1DLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +       reg = ECCB1DLLPICODER2 + (channel * DDRIOCCC_CH_OFFSET);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +       reg = ECCB1DLLPICODER3 + (channel * DDRIOCCC_CH_OFFSET);
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /*
> +        * DEADBAND
> +        * CCCFGREG1[13:12] (+1 select)
> +        * CCCFGREG1[05:04] (enable)
> +        */
> +       reg = CCCFGREG1 + (channel * DDRIOCCC_CH_OFFSET);
> +       msk = 0x00;
> +       temp = 0x00;
> +
> +       /* enable */
> +       msk |= (BIT5 | BIT4);
> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
> +               temp |= msk;
> +
> +       /* select */
> +       msk |= (BIT13 | BIT12);
> +       if (pi_count < EARLY_DB)
> +               temp |= msk;
> +
> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
> +
> +       /* error check */
> +       if (pi_count > 0x3F)
> +               mrc_post_code(0xEE, 0xE6);
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will return the amount of WCTL delay on the given
> + * channel, rank as an absolute PI count.
> + *
> + * (currently doesn't comprehend rank)
> + */
> +uint32_t get_wctl(uint8_t channel, uint8_t rank)
> +{
> +       uint32_t reg;
> +       uint32_t temp;
> +       uint32_t pi_count;
> +
> +       ENTERFN();
> +
> +       /*
> +        * RDPTR (1/2 MCLK, 64 PIs)
> +        * CCPTRREG[31:28] (0x0-0xF)
> +        * CCPTRREG[27:24] (0x0-0xF)
> +        */
> +       reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= 24;
> +       temp &= 0xF;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count = temp * HALF_CLK;
> +
> +       /*
> +        * PI (1/64 MCLK, 1 PIs)
> +        * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
> +        * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
> +        */
> +       reg = ECCB1DLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET);
> +       temp = msg_port_alt_read(DDRPHY, reg);
> +       temp >>= 24;
> +       temp &= 0x3F;
> +
> +       /* Adjust PI_COUNT */
> +       pi_count += temp;
> +
> +       LEAVEFN();
> +
> +       return pi_count;
> +}
> +
> +/*
> + * This function will program the internal Vref setting in a given
> + * byte lane in a given channel.
> + */
> +void set_vref(uint8_t channel, uint8_t byte_lane, uint32_t setting)
> +{
> +       uint32_t reg = (byte_lane & 0x1) ? (B1VREFCTL) : (B0VREFCTL);
> +
> +       ENTERFN();
> +
> +       DPF(D_TRN, "Vref ch%d ln%d : val=%03X\n",
> +           channel, byte_lane, setting);
> +
> +       mrc_alt_write_mask(DDRPHY, (reg + (channel * DDRIODQ_CH_OFFSET) +
> +               ((byte_lane >> 1) * DDRIODQ_BL_OFFSET)),
> +               (vref_codes[setting] << 2),
> +               (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2));
> +
> +       /*
> +        * need to wait ~300ns for Vref to settle
> +        * (check that this is necessary)
> +        */
> +       delay_n(300);
> +
> +       /* ??? may need to clear pointers ??? */
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will return the internal Vref setting for the given
> + * channel, byte_lane.
> + */
> +uint32_t get_vref(uint8_t channel, uint8_t byte_lane)
> +{
> +       uint8_t j;
> +       uint32_t ret_val = sizeof(vref_codes) / 2;
> +       uint32_t reg = (byte_lane & 0x1) ? (B1VREFCTL) : (B0VREFCTL);
> +       uint32_t temp;
> +
> +       ENTERFN();
> +
> +       temp = msg_port_alt_read(DDRPHY, (reg + (channel * DDRIODQ_CH_OFFSET) +
> +               ((byte_lane >> 1) * DDRIODQ_BL_OFFSET)));
> +       temp >>= 2;
> +       temp &= 0x3F;
> +
> +       for (j = 0; j < sizeof(vref_codes); j++) {
> +               if (vref_codes[j] == temp) {
> +                       ret_val = j;
> +                       break;
> +               }
> +       }
> +
> +       LEAVEFN();
> +
> +       return ret_val;
> +}
> +
> +/*
> + * This function will return a 32 bit address in the desired

32-bit

> + * channel and rank.
> + */
> +uint32_t get_addr(uint8_t channel, uint8_t rank)
> +{
> +       uint32_t offset = 0x02000000;   /* 32MB */
> +
> +       /* Begin product specific code */
> +       if (channel > 0) {
> +               DPF(D_ERROR, "ILLEGAL CHANNEL\n");
> +               DEAD_LOOP();
> +       }
> +
> +       if (rank > 1) {
> +               DPF(D_ERROR, "ILLEGAL RANK\n");
> +               DEAD_LOOP();
> +       }
> +
> +       /* use 256MB lowest density as per DRP == 0x0003 */
> +       offset += rank * (256 * 1024 * 1024);
> +
> +       return offset;
> +}
> +
> +/*
> + * This function will sample the DQTRAINSTS registers in the given
> + * channel/rank SAMPLE_SIZE times looking for a valid '0' or '1'.
> + *
> + * It will return an encoded DWORD in which each bit corresponds to

DWORD?

> + * the sampled value on the byte lane.
> + */
> +uint32_t sample_dqs(struct mrc_params *mrc_params, uint8_t channel,
> +                   uint8_t rank, bool rcvn)
> +{
> +       uint8_t j;      /* just a counter */
> +       uint8_t bl;     /* which BL in the module (always 2 per module) */
> +       uint8_t bl_grp; /* which BL module */
> +       /* byte lane divisor */

Maybe rename the variable so you can drop the comment?

> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
> +       uint32_t msk[2];        /* BLx in module */
> +       /* DQTRAINSTS register contents for each sample */
> +       uint32_t sampled_val[SAMPLE_SIZE];
> +       uint32_t num_0s;        /* tracks the number of '0' samples */
> +       uint32_t num_1s;        /* tracks the number of '1' samples */
> +       uint32_t ret_val = 0x00;        /* assume all '0' samples */
> +       uint32_t address = get_addr(channel, rank);
> +
> +       /* initialise msk[] */
> +       msk[0] = (rcvn) ? (BIT1) : (BIT9);      /* BL0 */
> +       msk[1] = (rcvn) ? (BIT0) : (BIT8);      /* BL1 */
> +
> +       /* cycle through each byte lane group */
> +       for (bl_grp = 0; bl_grp < (NUM_BYTE_LANES / bl_divisor) / 2; bl_grp++) {
> +               /* take SAMPLE_SIZE samples */
> +               for (j = 0; j < SAMPLE_SIZE; j++) {
> +                       hte_mem_op(address, mrc_params->first_run,
> +                                  rcvn ? 0 : 1);
> +                       mrc_params->first_run = 0;
> +
> +                       /*
> +                        * record the contents of the proper
> +                        * DQTRAINSTS register
> +                        */
> +                       sampled_val[j] = msg_port_alt_read(DDRPHY,
> +                               (DQTRAINSTS +
> +                               (bl_grp * DDRIODQ_BL_OFFSET) +
> +                               (channel * DDRIODQ_CH_OFFSET)));
> +               }
> +
> +               /*
> +                * look for a majority value (SAMPLE_SIZE / 2) + 1
> +                * on the byte lane and set that value in the corresponding
> +                * ret_val bit
> +                */
> +               for (bl = 0; bl < 2; bl++) {
> +                       num_0s = 0x00;  /* reset '0' tracker for byte lane */
> +                       num_1s = 0x00;  /* reset '1' tracker for byte lane */
> +                       for (j = 0; j < SAMPLE_SIZE; j++) {
> +                               if (sampled_val[j] & msk[bl])
> +                                       num_1s++;
> +                               else
> +                                       num_0s++;
> +                       }
> +                       if (num_1s > num_0s)
> +                               ret_val |= (1 << (bl + (bl_grp * 2)));
> +               }
> +       }
> +
> +       /*
> +        * "ret_val.0" contains the status of BL0
> +        * "ret_val.1" contains the status of BL1
> +        * "ret_val.2" contains the status of BL2
> +        * etc.

This comment should go in @return in the function comment.
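
Something along these lines, perhaps. This is only a sketch: the wording of
the comment is mine, and majority_bit() is a hypothetical helper that is not
in the patch. It is there so the sketch builds on a host and the majority
rule (ties count as '0') is visible in one place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct mrc_params;

/**
 * sample_dqs() - sample DQTRAINSTS on the given channel/rank
 *
 * @mrc_params:	MRC parameters
 * @channel:	channel to sample
 * @rank:	rank to sample
 * @rcvn:	true for RCVN training, false for WDQS
 * @return one bit per byte lane: bit 0 is the majority value sampled on
 *	BL0, bit 1 on BL1, bit 2 on BL2, etc.
 */
uint32_t sample_dqs(struct mrc_params *mrc_params, uint8_t channel,
		    uint8_t rank, bool rcvn);

/* Hypothetical helper (not in the patch): majority vote on one status bit */
static uint32_t majority_bit(const uint32_t *samples, unsigned int count,
			     uint32_t mask)
{
	unsigned int i, ones = 0;

	assert(samples && count > 0);
	for (i = 0; i < count; i++) {
		if (samples[i] & mask)
			ones++;
	}

	/* ties count as '0', matching the num_1s > num_0s test above */
	return ones > count - ones;
}
```

With the encoding described in @return, the trailing comment before the
return statement can be dropped.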

> +        */
> +       return ret_val;
> +}
> +
> +/* This function will find the rising edge transition on RCVN or WDQS */
> +void find_rising_edge(struct mrc_params *mrc_params, uint32_t delay[],
> +                     uint8_t channel, uint8_t rank, bool rcvn)
> +{
> +       bool all_edges_found;   /* determines stop condition */
> +       bool direction[NUM_BYTE_LANES]; /* direction indicator */
> +       uint8_t sample; /* sample counter */
> +       uint8_t bl;     /* byte lane counter */
> +       /* byte lane divisor */
> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
> +       uint32_t sample_result[SAMPLE_CNT];     /* results of sample_dqs() */
> +       uint32_t temp;
> +       uint32_t transition_pattern;
> +
> +       ENTERFN();
> +
> +       /* select hte and request initial configuration */
> +       select_hte();
> +       mrc_params->first_run = 1;
> +
> +       /* Take 3 sample points (T1,T2,T3) to obtain a transition pattern */
> +       for (sample = 0; sample < SAMPLE_CNT; sample++) {
> +               /* program the desired delays for sample */
> +               for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                       /* increase sample delay by 26 PI (0.2 CLK) */
> +                       if (rcvn) {
> +                               set_rcvn(channel, rank, bl,
> +                                        delay[bl] + (sample * SAMPLE_DLY));
> +                       } else {
> +                               set_wdqs(channel, rank, bl,
> +                                        delay[bl] + (sample * SAMPLE_DLY));
> +                       }
> +               }
> +
> +               /* take samples (Tsample_i) */
> +               sample_result[sample] = sample_dqs(mrc_params,
> +                       channel, rank, rcvn);
> +
> +               DPF(D_TRN,
> +                   "Find rising edge %s ch%d rnk%d: #%d dly=%d dqs=%02X\n",
> +                   (rcvn ? "RCVN" : "WDQS"), channel, rank, sample,
> +                   sample * SAMPLE_DLY, sample_result[sample]);
> +       }
> +
> +       /*
> +        * This pattern will help determine where we landed and ultimately
> +        * how to place RCVEN/WDQS.
> +        */
> +       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +               /* build transition_pattern (MSB is 1st sample) */
> +               transition_pattern = 0;
> +               for (sample = 0; sample < SAMPLE_CNT; sample++) {
> +                       transition_pattern |=
> +                               ((sample_result[sample] & (1 << bl)) >> bl) <<
> +                               (SAMPLE_CNT - 1 - sample);
> +               }
> +
> +               DPF(D_TRN, "=== transition pattern %d\n", transition_pattern);
> +
> +               /*
> +                * set up to look for rising edge based on
> +                * transition_pattern
> +                */
> +               switch (transition_pattern) {
> +               case 0: /* sampled 0->0->0 */
> +                       /* move forward from T3 looking for 0->1 */
> +                       delay[bl] += 2 * SAMPLE_DLY;
> +                       direction[bl] = FORWARD;
> +                       break;
> +               case 1: /* sampled 0->0->1 */
> +               case 5: /* sampled 1->0->1 (bad duty cycle) *HSD#237503* */
> +                       /* move forward from T2 looking for 0->1 */
> +                       delay[bl] += 1 * SAMPLE_DLY;
> +                       direction[bl] = FORWARD;
> +                       break;
> +               case 2: /* sampled 0->1->0 (bad duty cycle) *HSD#237503* */
> +               case 3: /* sampled 0->1->1 */
> +                       /* move forward from T1 looking for 0->1 */
> +                       delay[bl] += 0 * SAMPLE_DLY;
> +                       direction[bl] = FORWARD;
> +                       break;
> +               case 4: /* sampled 1->0->0 (assumes BL8, HSD#234975) */
> +                       /* move forward from T3 looking for 0->1 */
> +                       delay[bl] += 2 * SAMPLE_DLY;
> +                       direction[bl] = FORWARD;
> +                       break;
> +               case 6: /* sampled 1->1->0 */
> +               case 7: /* sampled 1->1->1 */
> +                       /* move backward from T1 looking for 1->0 */
> +                       delay[bl] += 0 * SAMPLE_DLY;
> +                       direction[bl] = BACKWARD;
> +                       break;
> +               default:
> +                       mrc_post_code(0xEE, 0xEE);
> +                       break;
> +               }
> +
> +               /* program delays */
> +               if (rcvn)
> +                       set_rcvn(channel, rank, bl, delay[bl]);
> +               else
> +                       set_wdqs(channel, rank, bl, delay[bl]);
> +       }
> +
> +       /*
> +        * Based on the observed transition pattern on the byte lane,
> +        * begin looking for a rising edge with single PI granularity.
> +        */
> +       do {
> +               all_edges_found = true; /* assume all byte lanes passed */
> +               /* take a sample */
> +               temp = sample_dqs(mrc_params, channel, rank, rcvn);
> +               /* check each byte lane for proper edge */
> +               for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                       if (temp & (1 << bl)) {
> +                               /* sampled "1" */
> +                               if (direction[bl] == BACKWARD) {
> +                                       /*
> +                                        * keep looking for edge
> +                                        * on this byte lane
> +                                        */
> +                                       all_edges_found = false;
> +                                       delay[bl] -= 1;
> +                                       if (rcvn) {
> +                                               set_rcvn(channel, rank,
> +                                                        bl, delay[bl]);
> +                                       } else {
> +                                               set_wdqs(channel, rank,
> +                                                        bl, delay[bl]);
> +                                       }
> +                               }
> +                       } else {
> +                               /* sampled "0" */
> +                               if (direction[bl] == FORWARD) {
> +                                       /*
> +                                        * keep looking for edge
> +                                        * on this byte lane
> +                                        */
> +                                       all_edges_found = false;
> +                                       delay[bl] += 1;
> +                                       if (rcvn) {
> +                                               set_rcvn(channel, rank,
> +                                                        bl, delay[bl]);
> +                                       } else {
> +                                               set_wdqs(channel, rank,
> +                                                        bl, delay[bl]);
> +                                       }
> +                               }
> +                       }
> +               }
> +       } while (!all_edges_found);
> +
> +       /* restore DDR idle state */
> +       dram_init_command(DCMD_PREA(rank));
> +
> +       DPF(D_TRN, "Delay %03X %03X %03X %03X\n",
> +           delay[0], delay[1], delay[2], delay[3]);
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will return a 32 bit mask that will be used to
> + * check for byte lane failures.
> + */
> +uint32_t byte_lane_mask(struct mrc_params *mrc_params)
> +{
> +       uint32_t j;
> +       uint32_t ret_val = 0x00;
> +
> +       /*
> +        * set ret_val based on NUM_BYTE_LANES such that you will check
> +        * only BL0 in result
> +        *
> +        * (each bit in result represents a byte lane)
> +        */
> +       for (j = 0; j < MAX_BYTE_LANES; j += NUM_BYTE_LANES)
> +               ret_val |= (1 << ((j / NUM_BYTE_LANES) * NUM_BYTE_LANES));
> +
> +       /*
> +        * HSD#235037
> +        * need to adjust the mask for 16-bit mode
> +        */
> +       if (mrc_params->channel_width == X16)
> +               ret_val |= (ret_val << 2);
> +
> +       return ret_val;
> +}
> +
> +/*
> + * Check memory executing simple write/read/verify at the specified address.
> + *
> + * Bits in the result indicate failure on specific byte lane.
> + */
> +uint32_t check_rw_coarse(struct mrc_params *mrc_params, uint32_t address)
> +{
> +       uint32_t result = 0;
> +       uint8_t first_run = 0;
> +
> +       if (mrc_params->hte_setup) {
> +               mrc_params->hte_setup = 0;
> +               first_run = 1;
> +               select_hte();
> +       }
> +
> +       result = hte_basic_write_read(mrc_params, address,
> +                                     first_run, WRITE_TRAIN);

reformat to 80cols

> +
> +       DPF(D_TRN, "check_rw_coarse result is %x\n", result);
> +
> +       return result;
> +}
> +
> +/*
> + * Check memory executing write/read/verify of many data patterns
> + * at the specified address. Bits in the result indicate failure
> + * on specific byte lane.
> + */
> +uint32_t check_bls_ex(struct mrc_params *mrc_params, uint32_t address)
> +{
> +       uint32_t result;
> +       uint8_t first_run = 0;
> +
> +       if (mrc_params->hte_setup) {
> +               mrc_params->hte_setup = 0;
> +               first_run = 1;
> +               select_hte();
> +       }
> +
> +       result = hte_write_stress_bit_lanes(mrc_params, address, first_run);
> +
> +       DPF(D_TRN, "check_bls_ex result is %x\n", result);
> +
> +       return result;
> +}
> +
> +/*
> + * 32-bit LFSR with characteristic polynomial: X^32 + X^22 + X^2 + X^1
> + *
> + * The function takes pointer to previous 32 bit value and
> + * modifies it to next value.
> + */
> +void lfsr32(uint32_t *lfsr_ptr)
> +{
> +       uint32_t bit;
> +       uint32_t lfsr;
> +       int i;
> +
> +       lfsr = *lfsr_ptr;
> +
> +       for (i = 0; i < 32; i++) {
> +               bit = 1 ^ (lfsr & BIT0);
> +               bit = bit ^ ((lfsr & BIT1) >> 1);
> +               bit = bit ^ ((lfsr & BIT2) >> 2);
> +               bit = bit ^ ((lfsr & BIT22) >> 22);
> +
> +               lfsr = ((lfsr >> 1) | (bit << 31));
> +       }
> +
> +       *lfsr_ptr = lfsr;
> +}
> +
> +/* Clear the pointers in every byte lane of every channel */
> +void clear_pointers(void)
> +{
> +       uint8_t channel;
> +       uint8_t bl;
> +
> +       ENTERFN();
> +
> +       for (channel = 0; channel < NUM_CHANNELS; channel++) {
> +               for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
> +                       mrc_alt_write_mask(DDRPHY,
> +                                          (B01PTRCTL1 +
> +                                          (channel * DDRIODQ_CH_OFFSET) +
> +                                          ((bl >> 1) * DDRIODQ_BL_OFFSET)),
> +                                          ~BIT8, BIT8);
> +
> +                       mrc_alt_write_mask(DDRPHY,
> +                                          (B01PTRCTL1 +
> +                                          (channel * DDRIODQ_CH_OFFSET) +
> +                                          ((bl >> 1) * DDRIODQ_BL_OFFSET)),
> +                                          BIT8, BIT8);
> +               }
> +       }
> +
> +       LEAVEFN();
> +}
> +
> +void print_timings(struct mrc_params *mrc_params)
> +{
> +       uint8_t algo;
> +       uint8_t channel;
> +       uint8_t rank;
> +       uint8_t bl;
> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
> +
> +       DPF(D_INFO, "\n---------------------------");
> +       DPF(D_INFO, "\nALGO[CH:RK] BL0 BL1 BL2 BL3");
> +       DPF(D_INFO, "\n===========================");
> +
> +       for (algo = 0; algo < MAX_ALGOS; algo++) {
> +               for (channel = 0; channel < NUM_CHANNELS; channel++) {
> +                       if (mrc_params->channel_enables & (1 << channel)) {
> +                               for (rank = 0; rank < NUM_RANKS; rank++) {

Can we put this block in its own function to fix the over-indenting?
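
As a rough illustration of what I mean (untested against the real code, and
every name below except the enum values is made up): a name table replaces
the first switch, and a per-row function takes the innermost two levels out
of print_timings(). The sketch formats into a buffer with snprintf() so it
can be built on a host; in U-Boot it would call DPF() and the get_*()
functions directly instead of the placeholder demo_timing().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum { RCVN, WDQS, WDQX, RDQS, VREF, WCMD, WCTL, WCLK, MAX_ALGOS };

/* Row labels, in the same order as the enum in mrc_util.h */
static const char *const algo_name[MAX_ALGOS] = {
	"RCVN", "WDQS", "WDQx", "RDQS", "VREF", "WCMD", "WCTL", "WCLK",
};

/* Hypothetical callback: returns the delay for one algo/channel/rank/lane */
typedef uint32_t (*timing_fn)(uint8_t algo, uint8_t channel, uint8_t rank,
			      uint8_t bl);

/* Placeholder getter for host testing only; real code would call get_*() */
static uint32_t demo_timing(uint8_t algo, uint8_t channel, uint8_t rank,
			    uint8_t bl)
{
	(void)channel;
	(void)rank;
	return bl * 10 + algo;
}

/*
 * Format one "ALGO[CH:RK] BL0 BL1 ..." row into buf.
 * Returns the length reported by snprintf(), or 0 for an unknown algo.
 */
static int format_timing_row(char *buf, size_t size, uint8_t algo,
			     uint8_t channel, uint8_t rank, uint8_t num_bl,
			     timing_fn get)
{
	int len;
	uint8_t bl;

	assert(buf && size > 0 && get);
	buf[0] = '\0';
	if (algo >= MAX_ALGOS)
		return 0;

	len = snprintf(buf, size, "\n%s[%02d:%02d]", algo_name[algo],
		       channel, rank);
	/* stop appending once the buffer is full */
	for (bl = 0; bl < num_bl && len > 0 && (size_t)len < size; bl++)
		len += snprintf(buf + len, size - len, " %03d",
				(int)get(algo, channel, rank, bl));

	return len;
}
```

print_timings() would then only keep the algo/channel/rank loops and the
enable checks, which should bring it back under the tab limit.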

> +                                       if (mrc_params->rank_enables &
> +                                               (1 << rank)) {
> +                                               switch (algo) {
> +                                               case RCVN:
> +                                                       DPF(D_INFO,
> +                                                           "\nRCVN[%02d:%02d]",
> +                                                           channel, rank);
> +                                                       break;
> +                                               case WDQS:
> +                                                       DPF(D_INFO,
> +                                                           "\nWDQS[%02d:%02d]",
> +                                                           channel, rank);
> +                                                       break;
> +                                               case WDQX:
> +                                                       DPF(D_INFO,
> +                                                           "\nWDQx[%02d:%02d]",
> +                                                           channel, rank);
> +                                                       break;
> +                                               case RDQS:
> +                                                       DPF(D_INFO,
> +                                                           "\nRDQS[%02d:%02d]",
> +                                                           channel, rank);
> +                                                       break;
> +                                               case VREF:
> +                                                       DPF(D_INFO,
> +                                                           "\nVREF[%02d:%02d]",
> +                                                           channel, rank);
> +                                                       break;
> +                                               case WCMD:
> +                                                       DPF(D_INFO,
> +                                                           "\nWCMD[%02d:%02d]",
> +                                                           channel, rank);
> +                                                       break;
> +                                               case WCTL:
> +                                                       DPF(D_INFO,
> +                                                           "\nWCTL[%02d:%02d]",
> +                                                           channel, rank);
> +                                                       break;
> +                                               case WCLK:
> +                                                       DPF(D_INFO,
> +                                                           "\nWCLK[%02d:%02d]",
> +                                                           channel, rank);
> +                                                       break;
> +                                               default:
> +                                                       break;
> +                                               }
> +
> +                                               for (bl = 0;
> +                                                    bl < (NUM_BYTE_LANES / bl_divisor);
> +                                                    bl++) {
> +                                                       switch (algo) {
> +                                                       case RCVN:
> +                                                               DPF(D_INFO,
> +                                                                   " %03d",
> +                                                                   get_rcvn(channel, rank, bl));
> +                                                               break;
> +                                                       case WDQS:
> +                                                               DPF(D_INFO,
> +                                                                   " %03d",
> +                                                                   get_wdqs(channel, rank, bl));
> +                                                               break;
> +                                                       case WDQX:
> +                                                               DPF(D_INFO,
> +                                                                   " %03d",
> +                                                                   get_wdq(channel, rank, bl));
> +                                                               break;
> +                                                       case RDQS:
> +                                                               DPF(D_INFO,
> +                                                                   " %03d",
> +                                                                   get_rdqs(channel, rank, bl));
> +                                                               break;
> +                                                       case VREF:
> +                                                               DPF(D_INFO,
> +                                                                   " %03d",
> +                                                                   get_vref(channel, bl));
> +                                                               break;
> +                                                       case WCMD:
> +                                                               DPF(D_INFO,
> +                                                                   " %03d",
> +                                                                   get_wcmd(channel));
> +                                                               break;
> +                                                       case WCTL:
> +                                                               DPF(D_INFO,
> +                                                                   " %03d",
> +                                                                   get_wctl(channel, rank));
> +                                                               break;
> +                                                       case WCLK:
> +                                                               DPF(D_INFO,
> +                                                                   " %03d",
> +                                                                   get_wclk(channel, rank));
> +                                                               break;
> +                                                       default:
> +                                                               break;
> +                                                       }
> +                                               }
> +                                       }
> +                               }
> +                       }
> +               }
> +       }
> +
> +       DPF(D_INFO, "\n---------------------------");
> +       DPF(D_INFO, "\n");
> +}
> diff --git a/arch/x86/cpu/quark/mrc_util.h b/arch/x86/cpu/quark/mrc_util.h
> new file mode 100644
> index 0000000..edbe219
> --- /dev/null
> +++ b/arch/x86/cpu/quark/mrc_util.h
> @@ -0,0 +1,153 @@
> +/*
> + * Copyright (C) 2013, Intel Corporation
> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
> + *
> + * Ported from Intel released Quark UEFI BIOS
> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
> + *
> + * SPDX-License-Identifier:    Intel
> + */
> +
> +#ifndef _MRC_UTIL_H_
> +#define _MRC_UTIL_H_
> +
> +/* Turn on this macro to enable MRC debugging output */
> +#undef  MRC_DEBUG
> +
> +/* MRC Debug Support */
> +#define DPF            debug_cond
> +
> +/* debug print type */
> +
> +#ifdef MRC_DEBUG
> +#define D_ERROR                0x0001
> +#define D_INFO         0x0002
> +#define D_REGRD                0x0004
> +#define D_REGWR                0x0008
> +#define D_FCALL                0x0010
> +#define D_TRN          0x0020
> +#define D_TIME         0x0040
> +#else
> +#define D_ERROR                0
> +#define D_INFO         0
> +#define D_REGRD                0
> +#define D_REGWR                0
> +#define D_FCALL                0
> +#define D_TRN          0
> +#define D_TIME         0
> +#endif
> +
> +#define ENTERFN(...)   debug_cond(D_FCALL, "<%s>\n", __func__)
> +#define LEAVEFN(...)   debug_cond(D_FCALL, "</%s>\n", __func__)
> +#define REPORTFN(...)  debug_cond(D_FCALL, "<%s/>\n", __func__)
> +
> +/* Generic Register Bits */
> +#define BIT0           0x00000001
> +#define BIT1           0x00000002
> +#define BIT2           0x00000004
> +#define BIT3           0x00000008
> +#define BIT4           0x00000010
> +#define BIT5           0x00000020
> +#define BIT6           0x00000040
> +#define BIT7           0x00000080
> +#define BIT8           0x00000100
> +#define BIT9           0x00000200
> +#define BIT10          0x00000400
> +#define BIT11          0x00000800
> +#define BIT12          0x00001000
> +#define BIT13          0x00002000
> +#define BIT14          0x00004000
> +#define BIT15          0x00008000
> +#define BIT16          0x00010000
> +#define BIT17          0x00020000
> +#define BIT18          0x00040000
> +#define BIT19          0x00080000
> +#define BIT20          0x00100000
> +#define BIT21          0x00200000
> +#define BIT22          0x00400000
> +#define BIT23          0x00800000
> +#define BIT24          0x01000000
> +#define BIT25          0x02000000
> +#define BIT26          0x04000000
> +#define BIT27          0x08000000
> +#define BIT28          0x10000000
> +#define BIT29          0x20000000
> +#define BIT30          0x40000000
> +#define BIT31          0x80000000
> +
> +/* Message Bus Port */
> +#define MEM_CTLR       0x01
> +#define HOST_BRIDGE    0x03
> +#define MEM_MGR                0x05
> +#define HTE            0x11
> +#define DDRPHY         0x12
> +
> +/* number of sample points */
> +#define SAMPLE_CNT     3
> +/* number of PIs to increment per sample */
> +#define SAMPLE_DLY     26
> +
> +enum {
> +       /* indicates to decrease delays when looking for edge */
> +       BACKWARD,
> +       /* indicates to increase delays when looking for edge */
> +       FORWARD
> +};
> +
> +enum {
> +       RCVN,
> +       WDQS,
> +       WDQX,
> +       RDQS,
> +       VREF,
> +       WCMD,
> +       WCTL,
> +       WCLK,
> +       MAX_ALGOS,
> +};
> +
> +void mrc_write_mask(u32 unit, u32 addr, u32 data, u32 mask);
> +void mrc_alt_write_mask(u32 unit, u32 addr, u32 data, u32 mask);
> +void mrc_post_code(uint8_t major, uint8_t minor);
> +void delay_n(uint32_t ns);
> +void delay_u(uint32_t ms);
> +void select_mem_mgr(void);
> +void select_hte(void);
> +void dram_init_command(uint32_t data);
> +void dram_wake_command(void);
> +void training_message(uint8_t channel, uint8_t rank, uint8_t byte_lane);
> +
> +void set_rcvn(uint8_t channel, uint8_t rank,
> +             uint8_t byte_lane, uint32_t pi_count);
> +uint32_t get_rcvn(uint8_t channel, uint8_t rank, uint8_t byte_lane);
> +void set_rdqs(uint8_t channel, uint8_t rank,
> +             uint8_t byte_lane, uint32_t pi_count);
> +uint32_t get_rdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane);
> +void set_wdqs(uint8_t channel, uint8_t rank,
> +             uint8_t byte_lane, uint32_t pi_count);
> +uint32_t get_wdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane);
> +void set_wdq(uint8_t channel, uint8_t rank,
> +            uint8_t byte_lane, uint32_t pi_count);
> +uint32_t get_wdq(uint8_t channel, uint8_t rank, uint8_t byte_lane);
> +void set_wcmd(uint8_t channel, uint32_t pi_count);
> +uint32_t get_wcmd(uint8_t channel);
> +void set_wclk(uint8_t channel, uint8_t rank, uint32_t pi_count);
> +uint32_t get_wclk(uint8_t channel, uint8_t rank);
> +void set_wctl(uint8_t channel, uint8_t rank, uint32_t pi_count);
> +uint32_t get_wctl(uint8_t channel, uint8_t rank);
> +void set_vref(uint8_t channel, uint8_t byte_lane, uint32_t setting);
> +uint32_t get_vref(uint8_t channel, uint8_t byte_lane);
> +
> +uint32_t get_addr(uint8_t channel, uint8_t rank);
> +uint32_t sample_dqs(struct mrc_params *mrc_params, uint8_t channel,
> +                   uint8_t rank, bool rcvn);
> +void find_rising_edge(struct mrc_params *mrc_params, uint32_t delay[],
> +                     uint8_t channel, uint8_t rank, bool rcvn);
> +uint32_t byte_lane_mask(struct mrc_params *mrc_params);
> +uint32_t check_rw_coarse(struct mrc_params *mrc_params, uint32_t address);
> +uint32_t check_bls_ex(struct mrc_params *mrc_params, uint32_t address);
> +void lfsr32(uint32_t *lfsr_ptr);
> +void clear_pointers(void);
> +void print_timings(struct mrc_params *mrc_params);

If these are all truly exported, can we please put the function
comments here in the header file?
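
For instance, taking lfsr32() as the example, the header could carry the
comment and the .c file just the body. The comment text below is my own
reading of the code in this patch (taps at bits 0, 1, 2 and 22, feedback
inverted, 32 steps per call), so please correct it if I have that wrong. The
assert() is only there for host builds of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* --- mrc_util.h --- */

/**
 * lfsr32() - advance a 32-bit LFSR by 32 steps
 *
 * Taps are bits 0, 1, 2 and 22, with the feedback bit inverted.
 *
 * @lfsr_ptr:	previous value on entry, updated to the next value
 */
void lfsr32(uint32_t *lfsr_ptr);

/* --- mrc_util.c --- */

void lfsr32(uint32_t *lfsr_ptr)
{
	uint32_t lfsr, bit;
	int i;

	assert(lfsr_ptr);
	lfsr = *lfsr_ptr;

	for (i = 0; i < 32; i++) {
		/* XOR the taps together and invert the result */
		bit = 1 ^ (lfsr & 1);
		bit ^= (lfsr >> 1) & 1;
		bit ^= (lfsr >> 2) & 1;
		bit ^= (lfsr >> 22) & 1;

		/* shift right and feed the new bit in at the top */
		lfsr = (lfsr >> 1) | (bit << 31);
	}

	*lfsr_ptr = lfsr;
}
```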

> +
> +#endif /* _MRC_UTIL_H_ */
> --
> 1.8.2.1
>

Regards,
Simon

* [U-Boot] [RFC PATCH 5/9] x86: quark: Add System Memory Controller support
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 5/9] x86: quark: Add System Memory Controller support Bin Meng
@ 2015-02-04 16:24   ` Simon Glass
  2015-02-05 15:17     ` Bin Meng
  0 siblings, 1 reply; 29+ messages in thread
From: Simon Glass @ 2015-02-04 16:24 UTC
  To: u-boot

Hi Bin,

On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> This code does the actual memory initialization.
>
> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>
> ---
> The ugliest code I've ever seen ...
> There are 252 warnings and 127 checks in this patch, which are:
>
> check: arch/x86/cpu/quark/smc.c,1609: Alignment should match open parenthesis
> warning: arch/x86/cpu/quark/smc.c,1610: line over 80 characters
> warning: arch/x86/cpu/quark/smc.c,1633: Too many leading tabs - consider code refactoring
> ...
>
> Fixing 'Too many leading tabs ...' would be very dangerous, as I don't have
> all the details on how Intel's MRC code is written to work with the
> hardware. Trying to refactor it may lead to non-working MRC code. For the
> 'line over 80 characters' issue, we have to leave the lines as they are for
> now because of the 'Too many leading tabs ...' issue. If I try to fix the
> 'Alignment should match open parenthesis' issue, I may end up adding more
> 'line over 80 characters' issues, so we have to bear with it. Sigh.

Understood. Will try to limit my comments.

>
>  arch/x86/cpu/quark/smc.c | 2764 ++++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/cpu/quark/smc.h |  446 ++++++++
>  2 files changed, 3210 insertions(+)
>  create mode 100644 arch/x86/cpu/quark/smc.c
>  create mode 100644 arch/x86/cpu/quark/smc.h
>
> diff --git a/arch/x86/cpu/quark/smc.c b/arch/x86/cpu/quark/smc.c
> new file mode 100644
> index 0000000..fb389cd
> --- /dev/null
> +++ b/arch/x86/cpu/quark/smc.c
> @@ -0,0 +1,2764 @@
> +/*
> + * Copyright (C) 2013, Intel Corporation
> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
> + *
> + * Ported from Intel released Quark UEFI BIOS
> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
> + *
> + * SPDX-License-Identifier:    Intel
> + */
> +
> +#include <common.h>
> +#include <pci.h>
> +#include <asm/arch/device.h>
> +#include <asm/arch/mrc.h>
> +#include <asm/arch/msg_port.h>
> +#include "mrc_util.h"
> +#include "hte.h"
> +#include "smc.h"
> +
> +/* t_rfc values (in picoseconds) per density */
> +static const uint32_t t_rfc[5] = {
> +       90000,  /* 512Mb */
> +       110000, /* 1Gb */
> +       160000, /* 2Gb */
> +       300000, /* 4Gb */
> +       350000, /* 8Gb */
> +};
> +
> +/* t_ck clock period in picoseconds per speed index 800, 1066, 1333 */
> +static const uint32_t t_ck[3] = {
> +       2500,
> +       1875,
> +       1500
> +};
> +
> +/* Global variables */
> +static const uint16_t ddr_wclk[] = {193, 158};
> +static const uint16_t ddr_wctl[] = {1, 217};
> +static const uint16_t ddr_wcmd[] = {1, 220};
> +
> +#ifdef BACKUP_RCVN
> +static const uint16_t ddr_rcvn[] = {129, 498};
> +#endif
> +
> +#ifdef BACKUP_WDQS
> +static const uint16_t ddr_wdqs[] = {65, 289};
> +#endif
> +
> +#ifdef BACKUP_RDQS
> +static const uint8_t ddr_rdqs[] = {32, 24};
> +#endif
> +
> +#ifdef BACKUP_WDQ
> +static const uint16_t ddr_wdq[] = {32, 257};
> +#endif
> +
> +/* Stop self refresh driven by MCU */
> +void clear_self_refresh(struct mrc_params *mrc_params)
> +{
> +       ENTERFN();
> +
> +       /* clear the PMSTS Channel Self Refresh bits */
> +       mrc_write_mask(MEM_CTLR, PMSTS, BIT0, BIT0);
> +
> +       LEAVEFN();
> +}
> +
> +/* It will initialise timing registers in the MCU (DTR0..DTR4) */
> +void prog_ddr_timing_control(struct mrc_params *mrc_params)
> +{
> +       uint8_t tcl, wl;
> +       uint8_t trp, trcd, tras, twr, twtr, trrd, trtp, tfaw;
> +       uint32_t tck;
> +       u32 dtr0, dtr1, dtr2, dtr3, dtr4;
> +       u32 tmp1, tmp2;
> +
> +       ENTERFN();
> +
> +       /* mcu_init starts */
> +       mrc_post_code(0x02, 0x00);
> +
> +       dtr0 = msg_port_read(MEM_CTLR, DTR0);
> +       dtr1 = msg_port_read(MEM_CTLR, DTR1);
> +       dtr2 = msg_port_read(MEM_CTLR, DTR2);
> +       dtr3 = msg_port_read(MEM_CTLR, DTR3);
> +       dtr4 = msg_port_read(MEM_CTLR, DTR4);
> +
> +       tck = t_ck[mrc_params->ddr_speed];      /* Clock in picoseconds */
> +       tcl = mrc_params->params.cl;            /* CAS latency in clocks */
> +       trp = tcl;      /* Per CAT MRC */
> +       trcd = tcl;     /* Per CAT MRC */
> +       tras = MCEIL(mrc_params->params.ras, tck);
> +
> +       /* Per JEDEC: tWR=15000ps DDR2/3 from 800-1600 */
> +       twr = MCEIL(15000, tck);
> +
> +       twtr = MCEIL(mrc_params->params.wtr, tck);
> +       trrd = MCEIL(mrc_params->params.rrd, tck);
> +       trtp = 4;       /* Valid for 800 and 1066, use 5 for 1333 */
> +       tfaw = MCEIL(mrc_params->params.faw, tck);
> +
> +       wl = 5 + mrc_params->ddr_speed;
> +
> +       dtr0 &= ~(BIT0 | BIT1);
> +       dtr0 |= mrc_params->ddr_speed;
> +       dtr0 &= ~(BIT12 | BIT13 | BIT14);
> +       tmp1 = tcl - 5;
> +       dtr0 |= ((tcl - 5) << 12);
> +       dtr0 &= ~(BIT4 | BIT5 | BIT6 | BIT7);
> +       dtr0 |= ((trp - 5) << 4);       /* 5 bit DRAM Clock */
> +       dtr0 &= ~(BIT8 | BIT9 | BIT10 | BIT11);
> +       dtr0 |= ((trcd - 5) << 8);      /* 5 bit DRAM Clock */
> +
> +       dtr1 &= ~(BIT0 | BIT1 | BIT2);
> +       tmp2 = wl - 3;
> +       dtr1 |= (wl - 3);
> +       dtr1 &= ~(BIT8 | BIT9 | BIT10 | BIT11);
> +       dtr1 |= ((wl + 4 + twr - 14) << 8);     /* Change to tWTP */
> +       dtr1 &= ~(BIT28 | BIT29 | BIT30);
> +       dtr1 |= ((MMAX(trtp, 4) - 3) << 28);    /* 4 bit DRAM Clock */
> +       dtr1 &= ~(BIT24 | BIT25);
> +       dtr1 |= ((trrd - 4) << 24);             /* 4 bit DRAM Clock */
> +       dtr1 &= ~(BIT4 | BIT5);
> +       dtr1 |= (1 << 4);
> +       dtr1 &= ~(BIT20 | BIT21 | BIT22 | BIT23);
> +       dtr1 |= ((tras - 14) << 20);            /* 6 bit DRAM Clock */
> +       dtr1 &= ~(BIT16 | BIT17 | BIT18 | BIT19);
> +       dtr1 |= ((((tfaw + 1) >> 1) - 5) << 16);/* 4 bit DRAM Clock */
> +       /* Set 4 Clock CAS to CAS delay (multi-burst) */
> +       dtr1 &= ~(BIT12 | BIT13);
> +
> +       dtr2 &= ~(BIT0 | BIT1 | BIT2);
> +       dtr2 |= 1;
> +       dtr2 &= ~(BIT8 | BIT9 | BIT10);
> +       dtr2 |= (2 << 8);
> +       dtr2 &= ~(BIT16 | BIT17 | BIT18 | BIT19);
> +       dtr2 |= (2 << 16);
> +
> +       dtr3 &= ~(BIT0 | BIT1 | BIT2);
> +       dtr3 |= 2;
> +       dtr3 &= ~(BIT4 | BIT5 | BIT6);
> +       dtr3 |= (2 << 4);
> +
> +       dtr3 &= ~(BIT8 | BIT9 | BIT10 | BIT11);
> +       if (mrc_params->ddr_speed == DDRFREQ_800) {
> +               /* Extended RW delay (+1) */
> +               dtr3 |= ((tcl - 5 + 1) << 8);
> +       } else if (mrc_params->ddr_speed == DDRFREQ_1066) {
> +               /* Extended RW delay (+1) */
> +               dtr3 |= ((tcl - 5 + 1) << 8);
> +       }
> +
> +       dtr3 &= ~(BIT13 | BIT14 | BIT15 | BIT16);
> +       dtr3 |= ((4 + wl + twtr - 11) << 13);
> +
> +       dtr3 &= ~(BIT22 | BIT23);
> +       if (mrc_params->ddr_speed == DDRFREQ_800)
> +               dtr3 |= ((MMAX(0, 1 - 1)) << 22);
> +       else
> +               dtr3 |= ((MMAX(0, 2 - 1)) << 22);
> +
> +       dtr4 &= ~(BIT0 | BIT1);
> +       dtr4 |= 1;
> +       dtr4 &= ~(BIT4 | BIT5 | BIT6);
> +       dtr4 |= (1 << 4);
> +       dtr4 &= ~(BIT8 | BIT9 | BIT10);
> +       dtr4 |= ((1 + tmp1 - tmp2 + 2) << 8);
> +       dtr4 &= ~(BIT12 | BIT13 | BIT14);
> +       dtr4 |= ((1 + tmp1 - tmp2 + 2) << 12);
> +       dtr4 &= ~(BIT15 | BIT16);
> +
> +       msg_port_write(MEM_CTLR, DTR0, dtr0);
> +       msg_port_write(MEM_CTLR, DTR1, dtr1);
> +       msg_port_write(MEM_CTLR, DTR2, dtr2);
> +       msg_port_write(MEM_CTLR, DTR3, dtr3);
> +       msg_port_write(MEM_CTLR, DTR4, dtr4);

This bit stuff is a mess. It obscures the meaning IMO and we would be
much better off with proper named #defines. What can we do here?
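
As a rough sketch of the direction, and not a definitive layout: the
DTR0 part could use shift/mask pairs plus one small helper. The field
names below (DFREQ, TRP, TRCD, TCL) and the helpers field_set() and
dtr0_pack() are hypothetical; only the bit positions and the "- 5"
offsets are read off the quoted code. Unlike the original, the helper
also masks the value to the field width.

```c
#include <stdint.h>

/* Hypothetical field names; positions taken from the BITn usage above */
#define DTR0_DFREQ_SHIFT	0
#define DTR0_DFREQ_MASK		(0x3u << DTR0_DFREQ_SHIFT)	/* bits 1:0 */
#define DTR0_TRP_SHIFT		4
#define DTR0_TRP_MASK		(0xfu << DTR0_TRP_SHIFT)	/* bits 7:4 */
#define DTR0_TRCD_SHIFT		8
#define DTR0_TRCD_MASK		(0xfu << DTR0_TRCD_SHIFT)	/* bits 11:8 */
#define DTR0_TCL_SHIFT		12
#define DTR0_TCL_MASK		(0x7u << DTR0_TCL_SHIFT)	/* bits 14:12 */

/* Clear one field in reg, then insert val (truncated to the field) */
static uint32_t field_set(uint32_t reg, uint32_t mask, unsigned int shift,
			  uint32_t val)
{
	return (reg & ~mask) | ((val << shift) & mask);
}

/* Same DTR0 updates as the quoted code; all other bits are preserved */
static uint32_t dtr0_pack(uint32_t dtr0, uint32_t speed, uint32_t tcl,
			  uint32_t trp, uint32_t trcd)
{
	dtr0 = field_set(dtr0, DTR0_DFREQ_MASK, DTR0_DFREQ_SHIFT, speed);
	dtr0 = field_set(dtr0, DTR0_TCL_MASK, DTR0_TCL_SHIFT, tcl - 5);
	dtr0 = field_set(dtr0, DTR0_TRP_MASK, DTR0_TRP_SHIFT, trp - 5);
	dtr0 = field_set(dtr0, DTR0_TRCD_MASK, DTR0_TRCD_SHIFT, trcd - 5);

	return dtr0;
}
```

The real names should come from the Quark documentation if it has
them; the point is only that each read-modify-write then says which
field it touches.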

> +
> +       LEAVEFN();
> +}
> +
> +/* Configure MCU before jedec init sequence */
> +void prog_decode_before_jedec(struct mrc_params *mrc_params)
> +{
> +       u32 drp;
> +       u32 drfc;
> +       u32 dcal;
> +       u32 dsch;
> +       u32 dpmc0;
> +
> +       ENTERFN();
> +
> +       /* Disable power saving features */
> +       dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
> +       dpmc0 |= (BIT24 | BIT25);
> +       dpmc0 &= ~(BIT16 | BIT17 | BIT18);
> +       dpmc0 &= ~BIT23;
> +       msg_port_write(MEM_CTLR, DPMC0, dpmc0);
> +
> +       /* Disable out of order transactions */
> +       dsch = msg_port_read(MEM_CTLR, DSCH);
> +       dsch |= (BIT8 | BIT12);
> +       msg_port_write(MEM_CTLR, DSCH, dsch);
> +
> +       /* Disable issuing the REF command */
> +       drfc = msg_port_read(MEM_CTLR, DRFC);
> +       drfc &= ~(BIT12 | BIT13 | BIT14);
> +       msg_port_write(MEM_CTLR, DRFC, drfc);
> +
> +       /* Disable ZQ calibration short */
> +       dcal = msg_port_read(MEM_CTLR, DCAL);
> +       dcal &= ~(BIT8 | BIT9 | BIT10);
> +       dcal &= ~(BIT12 | BIT13);
> +       msg_port_write(MEM_CTLR, DCAL, dcal);
> +
> +       /*
> +        * Training is performed in address mode 0. Rank population has
> +        * limited impact, but the simulator complains if a non-existent
> +        * rank is enabled.
> +        */
> +       drp = 0;
> +       if (mrc_params->rank_enables & 1)
> +               drp |= BIT0;
> +       if (mrc_params->rank_enables & 2)
> +               drp |= BIT1;
> +       msg_port_write(MEM_CTLR, DRP, drp);
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * After Cold Reset, BIOS should set COLDWAKE bit to 1 before
> + * sending the WAKE message to the Dunit.
> + *
> + * For Standby Exit, or any other mode in which the DRAM is in
> + * SR, this bit must be set to 0.
> + */
> +void perform_ddr_reset(struct mrc_params *mrc_params)
> +{
> +       ENTERFN();
> +
> +       /* Set COLDWAKE bit before sending the WAKE message */
> +       mrc_write_mask(MEM_CTLR, DRMC, BIT16, BIT16);
> +
> +       /* Send wake command to DUNIT (MUST be done before JEDEC) */
> +       dram_wake_command();
> +
> +       /* Set default value */
> +       msg_port_write(MEM_CTLR, DRMC,
> +                      (mrc_params->rd_odt_value == 0 ? BIT12 : 0));
> +
> +       LEAVEFN();
> +}
> +
> +
> +/*
> + * This function performs some initialization on the DDRIO unit.
> + * This function is dependent on BOARD_ID, DDR_SPEED, and CHANNEL_ENABLES.
> + */
> +void ddrphy_init(struct mrc_params *mrc_params)
> +{
> +       uint32_t temp;
> +       uint8_t ch;     /* channel counter */
> +       uint8_t rk;     /* rank counter */
> +       uint8_t bl_grp; /*  byte lane group counter (2 BLs per module) */
> +       uint8_t bl_divisor = 1; /* byte lane divisor */
> +       /* For DDR3 --> 0 == 800, 1 == 1066, 2 == 1333 */
> +       uint8_t speed = mrc_params->ddr_speed & (BIT1 | BIT0);
> +       uint8_t cas;
> +       uint8_t cwl;
> +
> +       ENTERFN();
> +
> +       cas = mrc_params->params.cl;
> +       cwl = 5 + mrc_params->ddr_speed;
> +
> +       /* ddrphy_init starts */
> +       mrc_post_code(0x03, 0x00);
> +
> +       /*
> +        * HSD#231531
> +        * Make sure IOBUFACT is deasserted before initializing the DDR PHY
> +        *
> +        * HSD#234845
> +        * Make sure WRPTRENABLE is deasserted before initializing the DDR PHY
> +        */
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       /* Deassert DDRPHY Initialization Complete */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ~BIT20, BIT20); /* SPID_INIT_COMPLETE=0 */
> +                       /* Deassert IOBUFACT */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ~BIT2, BIT2);   /* IOBUFACTRST_N=0 */
> +                       /* Disable WRPTR */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDPTRREG + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ~BIT0, BIT0);   /* WRPTRENABLE=0 */
> +               }
> +       }
> +
> +       /* Put PHY in reset */
> +       mrc_alt_write_mask(DDRPHY, MASTERRSTN, 0, BIT0);
> +
> +       /* Initialize DQ01, DQ23, CMD, CLK-CTL, COMP modules */
> +
> +       /* STEP0 */

Can you put each step in its own static function?

for (ch = 0; ch < NUM_CHANNELS; ch++)
    step0(ch);
for (ch = 0; ch < NUM_CHANNELS; ch++)
    step1(ch);

etc.
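
A slightly fuller sketch of that shape, under stated assumptions:
struct params_stub stands in for struct mrc_params, NUM_CHANNELS is
given a stand-in value, and ddrphy_step0()/ddrphy_step1() and
for_each_enabled_channel() are hypothetical names. The step bodies
only count calls here; the real ones would hold the register writes
for one channel. The channel_enables check moves into the loop helper,
so it is written once.

```c
#include <stdint.h>

#define NUM_CHANNELS	2	/* stand-in value for this sketch only */

/* Stand-in for struct mrc_params, reduced to what the sketch needs */
struct params_stub {
	uint8_t channel_enables;	/* bit n set: channel n is enabled */
	unsigned int step0_runs;	/* call counters for illustration */
	unsigned int step1_runs;
};

typedef void (*ddrphy_step_fn)(struct params_stub *p, uint8_t ch);

/* STEP0 for one channel: DQ01-DQ23, CMD, CLK-CTL, COMP writes go here */
static void ddrphy_step0(struct params_stub *p, uint8_t ch)
{
	(void)ch;
	p->step0_runs++;
}

/* STEP1 for one channel: the next block of writes would go here */
static void ddrphy_step1(struct params_stub *p, uint8_t ch)
{
	(void)ch;
	p->step1_runs++;
}

/* Run one step on every enabled channel, skipping disabled ones */
static void for_each_enabled_channel(struct params_stub *p,
				     ddrphy_step_fn step)
{
	uint8_t ch;

	for (ch = 0; ch < NUM_CHANNELS; ch++) {
		if (p->channel_enables & (1 << ch))
			step(p, ch);
	}
}

/* Top level keeps the step order visible at a glance */
static void ddrphy_init_sketch(struct params_stub *p)
{
	for_each_enabled_channel(p, ddrphy_step0);
	for_each_enabled_channel(p, ddrphy_step1);
}
```

This should also take two or three levels of indentation off each
register write, which may help with the 'Too many leading tabs'
warnings without changing the order of the writes.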

> +       mrc_post_code(0x03, 0x10);
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       /* DQ01-DQ23 */
> +                       for (bl_grp = 0;
> +                            bl_grp < ((NUM_BYTE_LANES / bl_divisor)/2);
> +                            bl_grp++) {
> +                               /* Analog MUX select - IO2xCLKSEL */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (DQOBSCKEBBCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       ((bl_grp) ? (0x00) : (BIT22)), (BIT22));
> +
> +                               /* ODT Strength */
> +                               switch (mrc_params->rd_odt_value) {
> +                               case 1:
> +                                       temp = 0x3;
> +                                       break;  /* 60 ohm */
> +                               case 2:
> +                                       temp = 0x3;
> +                                       break;  /* 120 ohm */
> +                               case 3:
> +                                       temp = 0x3;
> +                                       break;  /* 180 ohm */
> +                               default:
> +                                       temp = 0x3;
> +                                       break;  /* 120 ohm */
> +                               }
> +
> +                               /* ODT strength */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B0RXIOBUFCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (temp << 5), (BIT6 | BIT5));
> +                               /* ODT strength */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B1RXIOBUFCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (temp << 5), (BIT6 | BIT5));
> +
> +                               /* Dynamic ODT/DIFFAMP */
> +                               temp = (((cas) << 24) | ((cas) << 16) |
> +                                       ((cas) << 8) | ((cas) << 0));
> +                               switch (speed) {
> +                               case 0:
> +                                       temp -= 0x01010101;
> +                                       break;  /* 800 */
> +                               case 1:
> +                                       temp -= 0x02020202;
> +                                       break;  /* 1066 */
> +                               case 2:
> +                                       temp -= 0x03030303;
> +                                       break;  /* 1333 */
> +                               case 3:
> +                                       temp -= 0x04040404;
> +                                       break;  /* 1600 */
> +                               }
> +
> +                               /* Launch Time: ODT, DIFFAMP, ODT, DIFFAMP */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B01LATCTL1 +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       temp,
> +                                       (BIT28 | BIT27 | BIT26 | BIT25 | BIT24 |
> +                                       BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
> +                                       BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
> +                                       BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
> +                               switch (speed) {
> +                               /* HSD#234715 */
> +                               case 0:
> +                                       temp = ((0x06 << 16) | (0x07 << 8));
> +                                       break;  /* 800 */
> +                               case 1:
> +                                       temp = ((0x07 << 16) | (0x08 << 8));
> +                                       break;  /* 1066 */
> +                               case 2:
> +                                       temp = ((0x09 << 16) | (0x0A << 8));
> +                                       break;  /* 1333 */
> +                               case 3:
> +                                       temp = ((0x0A << 16) | (0x0B << 8));
> +                                       break;  /* 1600 */
> +                               }
> +
> +                               /* On Duration: ODT, DIFFAMP */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B0ONDURCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       temp,
> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
> +                                       BIT16 | BIT13 | BIT12 | BIT11 | BIT10 |
> +                                       BIT9 | BIT8));
> +                               /* On Duration: ODT, DIFFAMP */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B1ONDURCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       temp,
> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
> +                                       BIT16 | BIT13 | BIT12 | BIT11 | BIT10 |
> +                                       BIT9 | BIT8));
> +
> +                               switch (mrc_params->rd_odt_value) {
> +                               case 0:
> +                                       /* override DIFFAMP=on, ODT=off */
> +                                       temp = ((0x3F << 16) | (0x3f << 10));
> +                                       break;
> +                               default:
> +                                       /* override DIFFAMP=on, ODT=on */
> +                                       temp = ((0x3F << 16) | (0x2A << 10));
> +                                       break;
> +                               }
> +
> +                               /* Override: DIFFAMP, ODT */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B0OVRCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       temp,
> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
> +                                       BIT16 | BIT15 | BIT14 | BIT13 | BIT12 |
> +                                       BIT11 | BIT10));
> +                               /* Override: DIFFAMP, ODT */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B1OVRCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       temp,
> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
> +                                       BIT16 | BIT15 | BIT14 | BIT13 | BIT12 |
> +                                       BIT11 | BIT10));
> +
> +                               /* DLL Setup */
> +
> +                               /* 1xCLK Domain Timings: tEDP,RCVEN,WDQS (PO) */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B0LATCTL0 +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (((cas + 7) << 16) | ((cas - 4) << 8) |
> +                                       ((cwl - 2) << 0)),
> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
> +                                       BIT16 | BIT12 | BIT11 | BIT10 | BIT9 |
> +                                       BIT8 | BIT4 | BIT3 | BIT2 | BIT1 |
> +                                       BIT0));
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B1LATCTL0 +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (((cas + 7) << 16) | ((cas - 4) << 8) |
> +                                       ((cwl - 2) << 0)),
> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
> +                                       BIT16 | BIT12 | BIT11 | BIT10 | BIT9 |
> +                                       BIT8 | BIT4 | BIT3 | BIT2 | BIT1 |
> +                                       BIT0));
> +
> +                               /* RCVEN Bypass (PO) */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B0RXIOBUFCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       ((0x0 << 7) | (0x0 << 0)),
> +                                       (BIT7 | BIT0));
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B1RXIOBUFCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       ((0x0 << 7) | (0x0 << 0)),
> +                                       (BIT7 | BIT0));
> +
> +                               /* TX */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (DQCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (BIT16), (BIT16));
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B01PTRCTL1 +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (BIT8), (BIT8));
> +
> +                               /* RX (PO) */
> +                               /* Internal Vref Code, Enable#, Ext_or_Int (1=Ext) */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B0VREFCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       ((0x03 << 2) | (0x0 << 1) | (0x0 << 0)),
> +                                       (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 |
> +                                       BIT2 | BIT1 | BIT0));
> +                               /* Internal Vref Code, Enable#, Ext_or_Int (1=Ext) */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B1VREFCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       ((0x03 << 2) | (0x0 << 1) | (0x0 << 0)),
> +                                       (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 |
> +                                       BIT2 | BIT1 | BIT0));
> +                               /* Per-Bit De-Skew Enable */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B0RXIOBUFCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (0), (BIT4));
> +                               /* Per-Bit De-Skew Enable */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B1RXIOBUFCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (0), (BIT4));
> +                       }
> +
> +                       /* CLKEBB */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDOBSCKEBBCTL + (ch * DDRIOCCC_CH_OFFSET)),
> +                               0, (BIT23));
> +
> +                       /* Enable tristate control of cmd/address bus */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               0, (BIT1 | BIT0));
> +
> +                       /* ODT RCOMP */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDRCOMPODT + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0x03 << 5) | (0x03 << 0)),
> +                               (BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 |
> +                               BIT3 | BIT2 | BIT1 | BIT0));
> +
> +                       /* CMDPM* registers must be programmed in this order */
> +
> +                       /* Turn On Delays: SFR (regulator), MPLL */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDPMDLYREG4 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0xFFFFU << 16) | (0xFFFF << 0)),
> +                               0xFFFFFFFF);
> +                       /*
> +                        * Delays: ASSERT_IOBUFACT_to_ALLON0_for_PM_MSG_3,
> +                        * VREG (MDLL) Turn On, ALLON0_to_DEASSERT_IOBUFACT
> +                        * for_PM_MSG_gt0, MDLL Turn On
> +                        */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDPMDLYREG3 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0xFU << 28) | (0xFFF << 16) | (0xF << 12) |
> +                               (0x616 << 0)), 0xFFFFFFFF);
> +                       /* MPLL Divider Reset Delays */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDPMDLYREG2 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) |
> +                               (0xFF << 0)), 0xFFFFFFFF);
> +                       /* Turn Off Delays: VREG, Staggered MDLL, MDLL, PI */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDPMDLYREG1 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) |
> +                               (0xFF << 0)), 0xFFFFFFFF);
> +                       /* Turn On Delays: MPLL, Staggered MDLL, PI, IOBUFACT */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDPMDLYREG0 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) |
> +                               (0xFF << 0)), 0xFFFFFFFF);
> +                       /* Allow PUnit signals */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0x6 << 8) | BIT6 | (0x4 << 0)),
> +                               (BIT31 | BIT30 | BIT29 | BIT28 | BIT27 | BIT26 |
> +                               BIT25 | BIT24 | BIT23 | BIT22 | BIT21 | BIT11 |
> +                               BIT10 | BIT9 | BIT8 | BIT6 | BIT3 | BIT2 |
> +                               BIT1 | BIT0));
> +                       /* DLL_VREG Bias Trim, VREF Tuning for DLL_VREG */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0x3 << 4) | (0x7 << 0)),
> +                               (BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 |
> +                               BIT0));
> +
> +                       /* CLK-CTL */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CCOBSCKEBBCTL + (ch * DDRIOCCC_CH_OFFSET)),
> +                               0, BIT24);      /* CLKEBB */
> +                       /* Buffer Enable: CS,CKE,ODT,CLK */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CCCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0x0 << 16) | (0x0 << 12) | (0x0 << 8) |
> +                               (0xF << 4) | BIT0),
> +                               (BIT19 | BIT18 | BIT17 | BIT16 | BIT15 | BIT14 |
> +                               BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
> +                               BIT7 | BIT6 | BIT5 | BIT4 | BIT0));
> +                       /* ODT RCOMP */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CCRCOMPODT + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0x03 << 8) | (0x03 << 0)),
> +                               (BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | BIT4 |
> +                               BIT3 | BIT2 | BIT1 | BIT0));
> +                       /* DLL_VREG Bias Trim, VREF Tuning for DLL_VREG */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0x3 << 4) | (0x7 << 0)),
> +                               (BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 |
> +                               BIT0));
> +
> +                       /*
> +                        * COMP (RON channel specific)
> +                        * - DQ/DQS/DM RON: 32 Ohm
> +                        * - CTRL/CMD RON: 27 Ohm
> +                        * - CLK RON: 26 Ohm
> +                        */
> +                       /* RCOMP Vref PU/PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQVREFCH0 +  (ch * DDRCOMP_CH_OFFSET)),
> +                               ((0x08 << 24) | (0x03 << 16)),
> +                               (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
> +                               BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
> +                               BIT17 | BIT16));
> +                       /* RCOMP Vref PU/PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               ((0x0C << 24) | (0x03 << 16)),
> +                               (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
> +                               BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
> +                               BIT17 | BIT16));
> +                       /* RCOMP Vref PU/PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               ((0x0F << 24) | (0x03 << 16)),
> +                               (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
> +                               BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
> +                               BIT17 | BIT16));
> +                       /* RCOMP Vref PU/PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               ((0x08 << 24) | (0x03 << 16)),
> +                               (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
> +                               BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
> +                               BIT17 | BIT16));
> +                       /* RCOMP Vref PU/PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CTLVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               ((0x0C << 24) | (0x03 << 16)),
> +                               (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
> +                               BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
> +                               BIT17 | BIT16));
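
A side note on the masks used in the calls above: each long BITn list selects two 6-bit fields, [29:24] and [21:16], so the same value can be written as a single hex constant. The sketch below is only an illustration. `field_mask()`, `apply_masked()` and `rcomp_vref_mask()` are hypothetical names that are not part of this patch, and the read-modify-write behaviour shown is an assumption about what mrc_alt_write_mask() does with its data and mask arguments.

```c
#include <stdint.h>

/* Hypothetical helper: contiguous mask for bits hi..lo (0 <= lo <= hi <= 31). */
static inline uint32_t field_mask(unsigned int hi, unsigned int lo)
{
	return (0xffffffffu >> (31 - hi)) & (0xffffffffu << lo);
}

/*
 * Assumed semantics of a masked register write: only the bits set in
 * 'mask' are taken from 'data', all other bits keep their old value.
 */
static inline uint32_t apply_masked(uint32_t old, uint32_t data, uint32_t mask)
{
	return (old & ~mask) | (data & mask);
}

/* Same bits as BIT29..BIT24 | BIT21..BIT16 in the quoted calls. */
static inline uint32_t rcomp_vref_mask(void)
{
	return field_mask(29, 24) | field_mask(21, 16);
}
```

Under that assumption, passing ~BIT5 as data with BIT5 as mask clears bit 5, because the data is ANDed with the mask before it is merged.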
> +
> +                       /* DQS Swapped Input Enable */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (COMPEN1CH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT19 | BIT17),
> +                               (BIT31 | BIT30 | BIT19 | BIT17 |
> +                               BIT15 | BIT14));
> +
> +                       /* ODT VREF = 1.5 x 274/(360+274) = 0.65V (code of ~50) */
> +                       /* ODT Vref PU/PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               ((0x32 << 8) | (0x03 << 0)),
> +                               (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
> +                               BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
> +                       /* ODT Vref PU/PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               ((0x32 << 8) | (0x03 << 0)),
> +                               (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
> +                               BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
> +                       /* ODT Vref PU/PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               ((0x0E << 8) | (0x05 << 0)),
> +                               (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
> +                               BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
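
The ODT VREF comment above is a plain resistive-divider calculation. A minimal sketch of the arithmetic in integer millivolts follows. `divider_mv()` is a hypothetical helper, and which resistor sits on which leg is inferred only from the formula in the comment. How the resulting voltage maps to the register code (0x32) is not derived here.

```c
#include <stdint.h>

/*
 * Output of a resistive divider in millivolts, truncated toward zero.
 * r_out is the resistor the output is taken across, r_other is the
 * remaining leg: Vout = Vsupply * r_out / (r_out + r_other).
 */
static uint32_t divider_mv(uint32_t supply_mv, uint32_t r_out, uint32_t r_other)
{
	return (uint32_t)(((uint64_t)supply_mv * r_out) / (r_out + r_other));
}
```

With a 1500 mV supply and the 274/360 values from the comment, divider_mv(1500, 274, 360) gives 648, which matches the 0.65V quoted there.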
> +
> +                       /*
> +                        * Slew rate settings are frequency specific,
> +                        * numbers below are for 800MHz (speed == 0)
> +                        * - DQ/DQS/DM/CLK SR: 4V/ns,
> +                        * - CTRL/CMD SR: 1.5V/ns
> +                        */
> +                       temp = (0x0E << 16) | (0x0E << 12) | (0x08 << 8) |
> +                               (0x0B << 4) | (0x0B << 0);
> +                       /* DCOMP Delay Select: CTL,CMD,CLK,DQS,DQ */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DLYSELCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               temp,
> +                               (BIT19 | BIT18 | BIT17 | BIT16 | BIT15 |
> +                               BIT14 | BIT13 | BIT12 | BIT11 | BIT10 |
> +                               BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 |
> +                               BIT3 | BIT2 | BIT1 | BIT0));
> +                       /* TCO Vref CLK,DQS,DQ */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (TCOVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               ((0x05 << 16) | (0x05 << 8) | (0x05 << 0)),
> +                               (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
> +                               BIT16 | BIT13 | BIT12 | BIT11 | BIT10 |
> +                               BIT9 | BIT8 | BIT5 | BIT4 | BIT3 | BIT2 |
> +                               BIT1 | BIT0));
> +                       /* ODTCOMP CMD/CTL PU/PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CCBUFODTCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               ((0x03 << 8) | (0x03 << 0)),
> +                               (BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
> +                               BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
> +                       /* COMP */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               0, (BIT31 | BIT30 | BIT8));
> +
> +#ifdef BACKUP_COMPS
> +                       /* DQ COMP Overrides */
> +                       /* RCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0A << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* RCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0A << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* DCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x10 << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* DCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x10 << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* ODTCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0B << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* ODTCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0B << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* TCOCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31), (BIT31));
> +                       /* TCOCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31), (BIT31));
> +
> +                       /* DQS COMP Overrides */
> +                       /* RCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0A << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* RCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0A << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* DCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x10 << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* DCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x10 << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* ODTCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0B << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* ODTCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0B << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* TCOCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31), (BIT31));
> +                       /* TCOCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31), (BIT31));
> +
> +                       /* CLK COMP Overrides */
> +                       /* RCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0C << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* RCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0C << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* DCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x07 << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* DCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x07 << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* ODTCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0B << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* ODTCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0B << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* TCOCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31), (BIT31));
> +                       /* TCOCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31), (BIT31));
> +
> +                       /* CMD COMP Overrides */
> +                       /* RCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0D << 16)),
> +                               (BIT31 | BIT21 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* RCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0D << 16)),
> +                               (BIT31 | BIT21 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* DCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0A << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* DCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0A << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +
> +                       /* CTL COMP Overrides */
> +                       /* RCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CTLDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0D << 16)),
> +                               (BIT31 | BIT21 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* RCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CTLDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0D << 16)),
> +                               (BIT31 | BIT21 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* DCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CTLDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0A << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* DCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CTLDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x0A << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +#else
> +                       /* DQ TCOCOMP Overrides */
> +                       /* TCOCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x1F << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* TCOCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x1F << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +
> +                       /* DQS TCOCOMP Overrides */
> +                       /* TCOCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x1F << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* TCOCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (DQSTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x1F << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +
> +                       /* CLK TCOCOMP Overrides */
> +                       /* TCOCOMP PU */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x1F << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +                       /* TCOCOMP PD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CLKTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               (BIT31 | (0x1F << 16)),
> +                               (BIT31 | BIT20 | BIT19 |
> +                               BIT18 | BIT17 | BIT16));
> +#endif
> +
> +                       /* program STATIC delays */
> +#ifdef BACKUP_WCMD
> +                       set_wcmd(ch, ddr_wcmd[PLATFORM_ID]);
> +#else
> +                       set_wcmd(ch, ddr_wclk[PLATFORM_ID] + HALF_CLK);
> +#endif
> +
> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                               if (mrc_params->rank_enables & (1 << rk)) {
> +                                       set_wclk(ch, rk, ddr_wclk[PLATFORM_ID]);
> +#ifdef BACKUP_WCTL
> +                                       set_wctl(ch, rk, ddr_wctl[PLATFORM_ID]);
> +#else
> +                                       set_wctl(ch, rk, ddr_wclk[PLATFORM_ID] + HALF_CLK);
> +#endif
> +                               }
> +                       }
> +               }
> +       }
> +
> +       /* COMP (non channel specific) */
> +       /* RCOMP: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQANADRVPUCTL), (BIT30), (BIT30));
> +       /* RCOMP: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQANADRVPDCTL), (BIT30), (BIT30));
> +       /* RCOMP: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (CMDANADRVPUCTL), (BIT30), (BIT30));
> +       /* RCOMP: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (CMDANADRVPDCTL), (BIT30), (BIT30));
> +       /* RCOMP: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (CLKANADRVPUCTL), (BIT30), (BIT30));
> +       /* RCOMP: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (CLKANADRVPDCTL), (BIT30), (BIT30));
> +       /* RCOMP: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQSANADRVPUCTL), (BIT30), (BIT30));
> +       /* RCOMP: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQSANADRVPDCTL), (BIT30), (BIT30));
> +       /* RCOMP: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (CTLANADRVPUCTL), (BIT30), (BIT30));
> +       /* RCOMP: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (CTLANADRVPDCTL), (BIT30), (BIT30));
> +       /* ODT: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQANAODTPUCTL), (BIT30), (BIT30));
> +       /* ODT: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQANAODTPDCTL), (BIT30), (BIT30));
> +       /* ODT: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (CLKANAODTPUCTL), (BIT30), (BIT30));
> +       /* ODT: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (CLKANAODTPDCTL), (BIT30), (BIT30));
> +       /* ODT: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQSANAODTPUCTL), (BIT30), (BIT30));
> +       /* ODT: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQSANAODTPDCTL), (BIT30), (BIT30));
> +       /* DCOMP: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQANADLYPUCTL), (BIT30), (BIT30));
> +       /* DCOMP: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQANADLYPDCTL), (BIT30), (BIT30));
> +       /* DCOMP: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (CMDANADLYPUCTL), (BIT30), (BIT30));
> +       /* DCOMP: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (CMDANADLYPDCTL), (BIT30), (BIT30));
> +       /* DCOMP: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (CLKANADLYPUCTL), (BIT30), (BIT30));
> +       /* DCOMP: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (CLKANADLYPDCTL), (BIT30), (BIT30));
> +       /* DCOMP: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQSANADLYPUCTL), (BIT30), (BIT30));
> +       /* DCOMP: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQSANADLYPDCTL), (BIT30), (BIT30));
> +       /* DCOMP: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (CTLANADLYPUCTL), (BIT30), (BIT30));
> +       /* DCOMP: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (CTLANADLYPDCTL), (BIT30), (BIT30));
> +       /* TCO: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQANATCOPUCTL), (BIT30), (BIT30));
> +       /* TCO: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQANATCOPDCTL), (BIT30), (BIT30));
> +       /* TCO: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (CLKANATCOPUCTL), (BIT30), (BIT30));
> +       /* TCO: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (CLKANATCOPDCTL), (BIT30), (BIT30));
> +       /* TCO: Dither PU Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQSANATCOPUCTL), (BIT30), (BIT30));
> +       /* TCO: Dither PD Enable */
> +       mrc_alt_write_mask(DDRPHY, (DQSANATCOPDCTL), (BIT30), (BIT30));
> +       /* TCOCOMP: Pulse Count */
> +       mrc_alt_write_mask(DDRPHY, (TCOCNTCTRL), (0x1 << 0), (BIT1 | BIT0));
> +       /* ODT: CMD/CTL PD/PU */
> +       mrc_alt_write_mask(DDRPHY,
> +               (CHNLBUFSTATIC), ((0x03 << 24) | (0x03 << 16)),
> +               (BIT28 | BIT27 | BIT26 | BIT25 | BIT24 |
> +               BIT20 | BIT19 | BIT18 | BIT17 | BIT16));
> +       /* Set 1us counter */
> +       mrc_alt_write_mask(DDRPHY,
> +               (MSCNTR), (0x64 << 0),
> +               (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
> +       mrc_alt_write_mask(DDRPHY,
> +               (LATCH1CTL), (0x1 << 28),
> +               (BIT30 | BIT29 | BIT28));
> +
> +       /* Release PHY from reset */
> +       mrc_alt_write_mask(DDRPHY, MASTERRSTN, BIT0, BIT0);
> +
> +       /* STEP1 */
> +       mrc_post_code(0x03, 0x11);
> +
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       /* DQ01-DQ23 */
> +                       for (bl_grp = 0;
> +                            bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
> +                            bl_grp++) {
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (DQMDLLCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (BIT13),
> +                                       (BIT13));       /* Enable VREG */
> +                               delay_n(3);
> +                       }
> +
> +                       /* ECC */
> +                       mrc_alt_write_mask(DDRPHY, (ECCMDLLCTL),
> +                               (BIT13), (BIT13));      /* Enable VREG */
> +                       delay_n(3);
> +                       /* CMD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
> +                               (BIT13), (BIT13));      /* Enable VREG */
> +                       delay_n(3);
> +                       /* CLK-CTL */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
> +                               (BIT13), (BIT13));      /* Enable VREG */
> +                       delay_n(3);
> +               }
> +       }
> +
> +       /* STEP2 */
> +       mrc_post_code(0x03, 0x12);
> +       delay_n(200);
> +
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       /* DQ01-DQ23 */
> +                       for (bl_grp = 0;
> +                            bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
> +                            bl_grp++) {
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (DQMDLLCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (BIT17),
> +                                       (BIT17));       /* Enable MCDLL */
> +                               delay_n(50);
> +                       }
> +
> +                       /* ECC */
> +                       mrc_alt_write_mask(DDRPHY, (ECCMDLLCTL),
> +                               (BIT17), (BIT17));      /* Enable MCDLL */
> +                       delay_n(50);
> +                       /* CMD */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
> +                               (BIT18), (BIT18));      /* Enable MCDLL */
> +                       delay_n(50);
> +                       /* CLK-CTL */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
> +                               (BIT18), (BIT18));      /* Enable MCDLL */
> +                       delay_n(50);
> +               }
> +       }
> +
> +       /* STEP3 */
> +       mrc_post_code(0x03, 0x13);
> +       delay_n(100);
> +
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       /* DQ01-DQ23 */
> +                       for (bl_grp = 0;
> +                            bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
> +                            bl_grp++) {
> +#ifdef FORCE_16BIT_DDRIO
> +                               temp = ((bl_grp) &&
> +                                       (mrc_params->channel_width == X16)) ?
> +                                       ((0x1 << 12) | (0x1 << 8) |
> +                                       (0xF << 4) | (0xF << 0)) :
> +                                       ((0xF << 12) | (0xF << 8) |
> +                                       (0xF << 4) | (0xF << 0));
> +#else
> +                               temp = ((0xF << 12) | (0xF << 8) |
> +                                       (0xF << 4) | (0xF << 0));
> +#endif
> +                               /* Enable TXDLL */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (DQDLLTXCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       temp, 0xFFFF);
> +                               delay_n(3);
> +                               /* Enable RXDLL */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (DQDLLRXCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (BIT3 | BIT2 | BIT1 | BIT0),
> +                                       (BIT3 | BIT2 | BIT1 | BIT0));
> +                               delay_n(3);
> +                               /* Enable RXDLL Overrides BL0 */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B0OVRCTL +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (BIT3 | BIT2 | BIT1 | BIT0),
> +                                       (BIT3 | BIT2 | BIT1 | BIT0));
> +                       }
> +
> +                       /* ECC */
> +                       temp = ((0xF << 12) | (0xF << 8) |
> +                               (0xF << 4) | (0xF << 0));
> +                       mrc_alt_write_mask(DDRPHY, (ECCDLLTXCTL),
> +                               temp, 0xFFFF);
> +                       delay_n(3);
> +
> +                       /* CMD (PO) */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDDLLTXCTL + (ch * DDRIOCCC_CH_OFFSET)),
> +                               temp, 0xFFFF);
> +                       delay_n(3);
> +               }
> +       }
> +
> +       /* STEP4 */
> +       mrc_post_code(0x03, 0x14);
> +
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       /* Host To Memory Clock Alignment (HMC) for 800/1066 */
> +                       for (bl_grp = 0;
> +                            bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
> +                            bl_grp++) {
> +                               /* CLK_ALIGN_MOD_ID */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (DQCLKALIGNREG2 +
> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
> +                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                       (bl_grp) ? (0x3) : (0x1),
> +                                       (BIT3 | BIT2 | BIT1 | BIT0));
> +                       }
> +
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (ECCCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)),
> +                               0x2,
> +                               (BIT3 | BIT2 | BIT1 | BIT0));
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)),
> +                               0x0,
> +                               (BIT3 | BIT2 | BIT1 | BIT0));
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CCCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)),
> +                               0x2,
> +                               (BIT3 | BIT2 | BIT1 | BIT0));
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               (0x2 << 4), (BIT5 | BIT4));
> +                       /*
> +                        * NUM_SAMPLES, MAX_SAMPLES,
> +                        * MACRO_PI_STEP, MICRO_PI_STEP
> +                        */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDCLKALIGNREG1 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0x18 << 16) | (0x10 << 8) |
> +                               (0x8 << 2) | (0x1 << 0)),
> +                               (BIT22 | BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
> +                               BIT16 | BIT14 | BIT13 | BIT12 | BIT11 | BIT10 |
> +                               BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 |
> +                               BIT2 | BIT1 | BIT0));
> +                       /* TOTAL_NUM_MODULES, FIRST_U_PARTITION */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDCLKALIGNREG2 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               ((0x10 << 16) | (0x4 << 8) | (0x2 << 4)),
> +                               (BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
> +                               BIT11 | BIT10 | BIT9 | BIT8 | BIT7 | BIT6 |
> +                               BIT5 | BIT4));
> +#ifdef HMC_TEST
> +                       /* START_CLK_ALIGN=1 */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               BIT24, BIT24);
> +                       while (msg_port_alt_read(DDRPHY,
> +                               (CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET))) &
> +                               BIT24)
> +                               ;       /* wait for START_CLK_ALIGN=0 */
> +#endif
> +
> +                       /* Set RD/WR Pointer Separation & COUNTEN & FIFOPTREN */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDPTRREG + (ch * DDRIOCCC_CH_OFFSET)),
> +                               BIT0, BIT0);    /* WRPTRENABLE=1 */
> +
> +                       /* COMP initial */
> +                       /* enable bypass for CLK buffer (PO) */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               BIT5, BIT5);
> +                       /* Initial COMP Enable */
> +                       mrc_alt_write_mask(DDRPHY, (CMPCTRL),
> +                               (BIT0), (BIT0));
> +                       /* wait for Initial COMP Enable = 0 */
> +                       while (msg_port_alt_read(DDRPHY, (CMPCTRL)) & BIT0)
> +                               ;
> +                       /* disable bypass for CLK buffer (PO) */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)),
> +                               ~BIT5, BIT5);
> +
> +                       /* IOBUFACT */
> +
> +                       /* STEP4a */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               BIT2, BIT2);    /* IOBUFACTRST_N=1 */
> +
> +                       /* DDRPHY initialisation complete */
> +                       mrc_alt_write_mask(DDRPHY,
> +                               (CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)),
> +                               BIT20, BIT20);  /* SPID_INIT_COMPLETE=1 */
> +               }
> +       }
> +
> +       LEAVEFN();
> +}
> +
> +/* This function performs JEDEC initialisation on all enabled channels */
> +void perform_jedec_init(struct mrc_params *mrc_params)
> +{
> +       uint8_t twr, wl, rank;
> +       uint32_t tck;
> +       u32 dtr0;
> +       u32 drp;
> +       u32 drmc;
> +       u32 mrs0_cmd = 0;
> +       u32 emrs1_cmd = 0;
> +       u32 emrs2_cmd = 0;
> +       u32 emrs3_cmd = 0;
> +
> +       ENTERFN();
> +
> +       /* jedec_init starts */
> +       mrc_post_code(0x04, 0x00);
> +
> +       /* DDR3_RESET_SET=0, DDR3_RESET_RESET=1 */
> +       mrc_alt_write_mask(DDRPHY, CCDDR3RESETCTL, BIT1, (BIT8 | BIT1));
> +
> +       /* Assert RESET# for 200us */
> +       delay_u(200);
> +
> +       /* DDR3_RESET_SET=1, DDR3_RESET_RESET=0 */
> +       mrc_alt_write_mask(DDRPHY, CCDDR3RESETCTL, BIT8, (BIT8 | BIT1));
> +
> +       dtr0 = msg_port_read(MEM_CTLR, DTR0);
> +
> +       /*
> +        * Set CKEVAL for populated ranks
> +        * then send NOP to each rank (#4550197)
> +        */
> +
> +       drp = msg_port_read(MEM_CTLR, DRP);
> +       drp &= 0x3;
> +
> +       drmc = msg_port_read(MEM_CTLR, DRMC);
> +       drmc &= 0xFFFFFFFC;
> +       drmc |= (BIT4 | drp);
> +
> +       msg_port_write(MEM_CTLR, DRMC, drmc);
> +
> +       for (rank = 0; rank < NUM_RANKS; rank++) {
> +               /* Skip to next populated rank */
> +               if ((mrc_params->rank_enables & (1 << rank)) == 0)
> +                       continue;
> +
> +               dram_init_command(DCMD_NOP(rank));
> +       }
> +
> +       msg_port_write(MEM_CTLR, DRMC,
> +               (mrc_params->rd_odt_value == 0 ? BIT12 : 0));
> +
> +       /*
> +        * setup for emrs 2
> +        * BIT[15:11] --> Always "0"
> +        * BIT[10:09] --> Rtt_WR: want "Dynamic ODT Off" (0)
> +        * BIT[08]    --> Always "0"
> +        * BIT[07]    --> SRT: use sr_temp_range
> +        * BIT[06]    --> ASR: want "Manual SR Reference" (0)
> +        * BIT[05:03] --> CWL: use oem_tCWL
> +        * BIT[02:00] --> PASR: want "Full Array" (0)
> +        */
> +       emrs2_cmd |= (2 << 3);
> +       wl = 5 + mrc_params->ddr_speed;
> +       emrs2_cmd |= ((wl - 5) << 9);
> +       emrs2_cmd |= (mrc_params->sr_temp_range << 13);
> +
> +       /*
> +        * setup for emrs 3
> +        * BIT[15:03] --> Always "0"
> +        * BIT[02]    --> MPR: want "Normal Operation" (0)
> +        * BIT[01:00] --> MPR_Loc: want "Predefined Pattern" (0)
> +        */
> +       emrs3_cmd |= (3 << 3);
> +
> +       /*
> +        * setup for emrs 1
> +        * BIT[15:13]     --> Always "0"
> +        * BIT[12:12]     --> Qoff: want "Output Buffer Enabled" (0)
> +        * BIT[11:11]     --> TDQS: want "Disabled" (0)
> +        * BIT[10:10]     --> Always "0"
> +        * BIT[09,06,02]  --> Rtt_nom: use rtt_nom_value
> +        * BIT[08]        --> Always "0"
> +        * BIT[07]        --> WR_LVL: want "Disabled" (0)
> +        * BIT[05,01]     --> DIC: use ron_value
> +        * BIT[04:03]     --> AL: additive latency want "0" (0)
> +        * BIT[00]        --> DLL: want "Enable" (0)
> +        *
> +        * (BIT5|BIT1) set Ron value
> +        * 00 --> RZQ/6 (40ohm)
> +        * 01 --> RZQ/7 (34ohm)
> +        * 1* --> RESERVED
> +        *
> +        * (BIT9|BIT6|BIT2) set Rtt_nom value
> +        * 000 --> Disabled
> +        * 001 --> RZQ/4 ( 60ohm)
> +        * 010 --> RZQ/2 (120ohm)
> +        * 011 --> RZQ/6 ( 40ohm)
> +        * 1** --> RESERVED
> +        */

Why oh why not just have #defines for these? It seems like the
original author knew they should be created but never took the step of
actually creating them.
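
Something along these lines is what I have in mind. This is only a
sketch: the MR1_* and DCMD_* names and the build_emrs1() helper are made
up for illustration, the bit positions are taken from the comment block
above, and the command layout (select value at bit 3, mode register
payload starting at bit 6) is assumed from the shifts in the code that
follows.

```c
#include <stdint.h>

/* Hypothetical names; bit positions follow the "setup for emrs 1" comment. */
#define MR1_DLL_DISABLE		(1u << 0)	/* BIT[00]: 0 = DLL enabled */
#define MR1_DIC_RZQ_6		0u		/* BIT[05,01] = 00: 40 ohm */
#define MR1_DIC_RZQ_7		(1u << 1)	/* BIT[05,01] = 01: 34 ohm */
#define MR1_RTTNOM_OFF		0u		/* BIT[09,06,02] = 000 */
#define MR1_RTTNOM_RZQ_4	(1u << 2)	/* 001: 60 ohm */
#define MR1_RTTNOM_RZQ_2	(1u << 6)	/* 010: 120 ohm */
#define MR1_RTTNOM_RZQ_6	((1u << 6) | (1u << 2))	/* 011: 40 ohm */
#define MR1_WR_LVL_EN		(1u << 7)	/* BIT[07]: write levelling */

/* Assumed command layout, inferred from the (1 << 3) and (<< 6) shifts. */
#define DCMD_MRS1_SEL		(1u << 3)
#define DCMD_MR_SHIFT		6

/* Build an EMRS1 command word from symbolic DIC and Rtt_nom settings. */
static inline uint32_t build_emrs1(uint32_t dic, uint32_t rtt_nom)
{
	return DCMD_MRS1_SEL | ((dic | rtt_nom) << DCMD_MR_SHIFT);
}
```

With names like these, the bare 0x0082 passed to DCMD_MRS1() in
wr_level() could be written as (MR1_WR_LVL_EN | MR1_DIC_RZQ_7), which
says what it does without needing the comment.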


> +       emrs1_cmd |= (1 << 3);
> +       emrs1_cmd &= ~BIT6;
> +
> +       if (mrc_params->ron_value == 0)
> +               emrs1_cmd |= BIT7;
> +       else
> +               emrs1_cmd &= ~BIT7;
> +
> +       if (mrc_params->rtt_nom_value == 0)
> +               emrs1_cmd |= (DDR3_EMRS1_RTTNOM_40 << 6);
> +       else if (mrc_params->rtt_nom_value == 1)
> +               emrs1_cmd |= (DDR3_EMRS1_RTTNOM_60 << 6);
> +       else if (mrc_params->rtt_nom_value == 2)
> +               emrs1_cmd |= (DDR3_EMRS1_RTTNOM_120 << 6);
> +
> +       /* save MRS1 value (excluding control fields) */
> +       mrc_params->mrs1 = emrs1_cmd >> 6;
> +
> +       /*
> +        * setup for mrs 0
> +        * BIT[15:13]     --> Always "0"
> +        * BIT[12]        --> PPD: for Quark (1)
> +        * BIT[11:09]     --> WR: use oem_tWR
> +        * BIT[08]        --> DLL: want "Reset" (1, self clearing)
> +        * BIT[07]        --> MODE: want "Normal" (0)
> +        * BIT[06:04,02]  --> CL: use oem_tCAS
> +        * BIT[03]        --> RD_BURST_TYPE: want "Interleave" (1)
> +        * BIT[01:00]     --> BL: want "8 Fixed" (0)
> +        * WR:
> +        * 0 --> 16
> +        * 1 --> 5
> +        * 2 --> 6
> +        * 3 --> 7
> +        * 4 --> 8
> +        * 5 --> 10
> +        * 6 --> 12
> +        * 7 --> 14
> +        * CL:
> +        * BIT[02:02] "0" if oem_tCAS <= 11 (1866?)
> +        * BIT[06:04] use oem_tCAS-4
> +        */
> +       mrs0_cmd |= BIT14;
> +       mrs0_cmd |= BIT18;
> +       mrs0_cmd |= ((((dtr0 >> 12) & 7) + 1) << 10);
> +
> +       tck = t_ck[mrc_params->ddr_speed];
> +       /* Per JEDEC: tWR=15000ps DDR2/3 from 800-1600 */
> +       twr = MCEIL(15000, tck);
> +       mrs0_cmd |= ((twr - 4) << 15);
> +
> +       for (rank = 0; rank < NUM_RANKS; rank++) {
> +               /* Skip to next populated rank */
> +               if ((mrc_params->rank_enables & (1 << rank)) == 0)
> +                       continue;
> +
> +               emrs2_cmd |= (rank << 22);
> +               dram_init_command(emrs2_cmd);
> +
> +               emrs3_cmd |= (rank << 22);
> +               dram_init_command(emrs3_cmd);
> +
> +               emrs1_cmd |= (rank << 22);
> +               dram_init_command(emrs1_cmd);
> +
> +               mrs0_cmd |= (rank << 22);
> +               dram_init_command(mrs0_cmd);
> +
> +               dram_init_command(DCMD_ZQCL(rank));
> +       }
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * Dunit Initialisation Complete
> + *
> + * Indicates that initialisation of the Dunit has completed.
> + *
> + * Memory accesses are permitted and maintenance operation begins.
> + * Until this bit is set to a 1, the memory controller will not accept
> + * DRAM requests from the MEMORY_MANAGER or HTE.
> + */
> +void set_ddr_init_complete(struct mrc_params *mrc_params)
> +{
> +       u32 dco;
> +
> +       ENTERFN();
> +
> +       dco = msg_port_read(MEM_CTLR, DCO);
> +       dco &= ~BIT28;
> +       dco |= BIT31;
> +       msg_port_write(MEM_CTLR, DCO, dco);
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will retrieve relevant timing data
> + *
> + * This data will be used on subsequent boots to speed up boot times
> + * and is required for Suspend To RAM capabilities.
> + */
> +void restore_timings(struct mrc_params *mrc_params)
> +{
> +       uint8_t ch, rk, bl;
> +       const struct mrc_timings *mt = &mrc_params->timings;
> +
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               for (rk = 0; rk < NUM_RANKS; rk++) {
> +                       for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
> +                               set_rcvn(ch, rk, bl, mt->rcvn[ch][rk][bl]);
> +                               set_rdqs(ch, rk, bl, mt->rdqs[ch][rk][bl]);
> +                               set_wdqs(ch, rk, bl, mt->wdqs[ch][rk][bl]);
> +                               set_wdq(ch, rk, bl, mt->wdq[ch][rk][bl]);
> +                               if (rk == 0) {
> +                                       /* VREF (RANK0 only) */
> +                                       set_vref(ch, bl, mt->vref[ch][bl]);
> +                               }
> +                       }
> +                       set_wctl(ch, rk, mt->wctl[ch][rk]);
> +               }
> +               set_wcmd(ch, mt->wcmd[ch]);
> +       }
> +}
> +
> +/*
> + * Configure default settings normally set as part of read training
> + *
> + * Some defaults have to be set earlier as they may affect earlier
> + * training steps.
> + */
> +void default_timings(struct mrc_params *mrc_params)
> +{
> +       uint8_t ch, rk, bl;
> +
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               for (rk = 0; rk < NUM_RANKS; rk++) {
> +                       for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
> +                               set_rdqs(ch, rk, bl, 24);
> +                               if (rk == 0) {
> +                                       /* VREF (RANK0 only) */
> +                                       set_vref(ch, bl, 32);
> +                               }
> +                       }
> +               }
> +       }
> +}
> +
> +/*
> + * This function will perform our RCVEN Calibration Algorithm.
> + * We will only use the 2xCLK domain timings to perform RCVEN Calibration.
> + * All byte lanes will be calibrated "simultaneously" per channel per rank.
> + */
> +void rcvn_cal(struct mrc_params *mrc_params)
> +{
> +       uint8_t ch;     /* channel counter */
> +       uint8_t rk;     /* rank counter */
> +       uint8_t bl;     /* byte lane counter */
> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
> +
> +#ifdef R2R_SHARING
> +       /* used to find placement for rank2rank sharing configs */
> +       uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
> +#ifndef BACKUP_RCVN
> +       /* used to find placement for rank2rank sharing configs */
> +       uint32_t num_ranks_enabled = 0;
> +#endif
> +#endif
> +
> +#ifdef BACKUP_RCVN
> +#else
> +       uint32_t temp;
> +       /* absolute PI value to be programmed on the byte lane */
> +       uint32_t delay[NUM_BYTE_LANES];
> +       u32 dtr1, dtr1_save;
> +#endif
> +
> +       ENTERFN();
> +
> +       /* rcvn_cal starts */
> +       mrc_post_code(0x05, 0x00);
> +
> +#ifndef BACKUP_RCVN
> +       /* need separate burst to sample DQS preamble */
> +       dtr1 = msg_port_read(MEM_CTLR, DTR1);
> +       dtr1_save = dtr1;
> +       dtr1 |= BIT12;
> +       msg_port_write(MEM_CTLR, DTR1, dtr1);
> +#endif
> +
> +#ifdef R2R_SHARING
> +       /* need to set "final_delay[][]" elements to "0" */
> +       memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
> +#endif
> +
> +       /* loop through each enabled channel */
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       /* perform RCVEN Calibration on a per rank basis */
> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                               if (mrc_params->rank_enables & (1 << rk)) {
> +                                       /*
> +                                        * POST_CODE here indicates the current
> +                                        * channel and rank being calibrated
> +                                        */
> +                                       mrc_post_code(0x05, (0x10 + ((ch << 4) | rk)));
> +
> +#ifdef BACKUP_RCVN
> +                                       /* set hard-coded timing values */
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++)
> +                                               set_rcvn(ch, rk, bl, ddr_rcvn[PLATFORM_ID]);
> +#else
> +                                       /* enable FIFORST */
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl += 2) {
> +                                               mrc_alt_write_mask(DDRPHY,
> +                                                       (B01PTRCTL1 +
> +                                                       ((bl >> 1) * DDRIODQ_BL_OFFSET) +
> +                                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                                       0, BIT8);
> +                                       }
> +                                       /* initialize the starting delay to 128 PI (cas +1 CLK) */
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               /* 1x CLK domain timing is cas-4 */
> +                                               delay[bl] = (4 + 1) * FULL_CLK;
> +
> +                                               set_rcvn(ch, rk, bl, delay[bl]);
> +                                       }
> +
> +                                       /* now find the rising edge */
> +                                       find_rising_edge(mrc_params, delay, ch, rk, true);
> +
> +                                       /* Now increase delay by 32 PI (1/4 CLK) to place in center of high pulse */
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               delay[bl] += QRTR_CLK;
> +                                               set_rcvn(ch, rk, bl, delay[bl]);
> +                                       }
> +                                       /* Now decrement delay by 128 PI (1 CLK) until we sample a "0" */
> +                                       do {
> +                                               temp = sample_dqs(mrc_params, ch, rk, true);
> +                                               for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                                       if (temp & (1 << bl)) {
> +                                                               if (delay[bl] >= FULL_CLK) {
> +                                                                       delay[bl] -= FULL_CLK;
> +                                                                       set_rcvn(ch, rk, bl, delay[bl]);
> +                                                               } else {
> +                                                                       /* not enough delay */
> +                                                                       training_message(ch, rk, bl);
> +                                                                       mrc_post_code(0xEE, 0x50);
> +                                                               }
> +                                                       }
> +                                               }
> +                                       } while (temp & 0xFF);
> +
> +#ifdef R2R_SHARING
> +                                       /* increment "num_ranks_enabled" */
> +                                       num_ranks_enabled++;
> +                                       /* Finally increment delay by 32 PI (1/4 CLK) to place in center of preamble */
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               delay[bl] += QRTR_CLK;
> +                                               /* add "delay[]" values to "final_delay[][]" for rolling average */
> +                                               final_delay[ch][bl] += delay[bl];
> +                                               /* set timing based on rolling average values */
> +                                               set_rcvn(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled));
> +                                       }
> +#else
> +                                       /* Finally increment delay by 32 PI (1/4 CLK) to place in center of preamble */
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               delay[bl] += QRTR_CLK;
> +                                               set_rcvn(ch, rk, bl, delay[bl]);
> +                                       }
> +#endif
> +
> +                                       /* disable FIFORST */
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl += 2) {
> +                                               mrc_alt_write_mask(DDRPHY,
> +                                                       (B01PTRCTL1 +
> +                                                       ((bl >> 1) * DDRIODQ_BL_OFFSET) +
> +                                                       (ch * DDRIODQ_CH_OFFSET)),
> +                                                       BIT8, BIT8);
> +                                       }
> +#endif
> +                               }
> +                       }
> +               }
> +       }
> +
> +#ifndef BACKUP_RCVN
> +       /* restore original */
> +       msg_port_write(MEM_CTLR, DTR1, dtr1_save);
> +#endif
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * This function will perform the Write Levelling algorithm
> + * (align WCLK and WDQS).
> + *
> + * This algorithm will act on each rank in each channel separately.
> + */
> +void wr_level(struct mrc_params *mrc_params)
> +{
> +       uint8_t ch;     /* channel counter */
> +       uint8_t rk;     /* rank counter */
> +       uint8_t bl;     /* byte lane counter */
> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
> +
> +#ifdef R2R_SHARING
> +       /* used to find placement for rank2rank sharing configs */
> +       uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
> +#ifndef BACKUP_WDQS
> +       /* used to find placement for rank2rank sharing configs */
> +       uint32_t num_ranks_enabled = 0;
> +#endif
> +#endif
> +
> +#ifdef BACKUP_WDQS
> +#else
> +       /* determines stop condition for CRS_WR_LVL */
> +       bool all_edges_found;
> +       /* absolute PI value to be programmed on the byte lane */
> +       uint32_t delay[NUM_BYTE_LANES];
> +       /*
> +        * static makes it so the data is loaded in the heap once by shadow(),
> +        * where non-static copies the data onto the stack every time this
> +        * function is called
> +        */
> +       uint32_t address;       /* address to be checked during COARSE_WR_LVL */
> +       u32 dtr4, dtr4_save;
> +#endif
> +
> +       ENTERFN();
> +
> +       /* wr_level starts */
> +       mrc_post_code(0x06, 0x00);
> +
> +#ifdef R2R_SHARING
> +       /* need to set "final_delay[][]" elements to "0" */
> +       memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
> +#endif
> +
> +       /* loop through each enabled channel */
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       /* perform WRITE LEVELING algorithm on a per rank basis */
> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                               if (mrc_params->rank_enables & (1 << rk)) {
> +                                       /*
> +                                        * POST_CODE here indicates the current
> +                                        * rank and channel being calibrated
> +                                        */
> +                                       mrc_post_code(0x06, (0x10 + ((ch << 4) | rk)));
> +
> +#ifdef BACKUP_WDQS
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               set_wdqs(ch, rk, bl, ddr_wdqs[PLATFORM_ID]);
> +                                               set_wdq(ch, rk, bl, (ddr_wdqs[PLATFORM_ID] - QRTR_CLK));
> +                                       }
> +#else
> +                                       /*
> +                                        * perform a single PRECHARGE_ALL command to
> +                                        * make DRAM state machine go to IDLE state
> +                                        */
> +                                       dram_init_command(DCMD_PREA(rk));
> +
> +                                       /*
> +                                        * enable Write Levelling Mode
> +                                        * (EMRS1 w/ Write Levelling Mode Enable)
> +                                        */
> +                                       dram_init_command(DCMD_MRS1(rk, 0x0082));
> +
> +                                       /*
> +                                        * set ODT DRAM Full Time Termination
> +                                        * disable in MCU
> +                                        */
> +
> +                                       dtr4 = msg_port_read(MEM_CTLR, DTR4);
> +                                       dtr4_save = dtr4;
> +                                       dtr4 |= BIT15;
> +                                       msg_port_write(MEM_CTLR, DTR4, dtr4);
> +
> +                                       for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) {
> +                                               /*
> +                                                * Enable Sandy Bridge Mode (WDQ Tri-State) &
> +                                                * Ensure 5 WDQS pulses during Write Leveling
> +                                                */
> +                                               mrc_alt_write_mask(DDRPHY,
> +                                                       DQCTL + (DDRIODQ_BL_OFFSET * bl) + (DDRIODQ_CH_OFFSET * ch),
> +                                                       (BIT28 | BIT8 | BIT6 | BIT4 | BIT2),
> +                                                       (BIT28 | BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2));
> +                                       }
> +
> +                                       /* Write Leveling Mode enabled in IO */
> +                                       mrc_alt_write_mask(DDRPHY,
> +                                               CCDDR3RESETCTL + (DDRIOCCC_CH_OFFSET * ch),
> +                                               BIT16, BIT16);
> +
> +                                       /* Initialize the starting delay to WCLK */
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               /*
> +                                                * CLK0 --> RK0
> +                                                * CLK1 --> RK1
> +                                                */
> +                                               delay[bl] = get_wclk(ch, rk);
> +
> +                                               set_wdqs(ch, rk, bl, delay[bl]);
> +                                       }
> +
> +                                       /* now find the rising edge */
> +                                       find_rising_edge(mrc_params, delay, ch, rk, false);
> +
> +                                       /* disable Write Levelling Mode */
> +                                       mrc_alt_write_mask(DDRPHY,
> +                                               CCDDR3RESETCTL + (DDRIOCCC_CH_OFFSET * ch),
> +                                               0, BIT16);
> +
> +                                       for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) {
> +                                               /* Disable Sandy Bridge Mode & Ensure 4 WDQS pulses during normal operation */
> +                                               mrc_alt_write_mask(DDRPHY,
> +                                                       DQCTL + (DDRIODQ_BL_OFFSET * bl) + (DDRIODQ_CH_OFFSET * ch),
> +                                                       (BIT8 | BIT6 | BIT4 | BIT2),
> +                                                       (BIT28 | BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2));
> +                                       }
> +
> +                                       /* restore original DTR4 */
> +                                       msg_port_write(MEM_CTLR, DTR4, dtr4_save);
> +
> +                                       /*
> +                                        * restore original value
> +                                        * (Write Levelling Mode Disable)
> +                                        */
> +                                       dram_init_command(DCMD_MRS1(rk, mrc_params->mrs1));
> +
> +                                       /*
> +                                        * perform a single PRECHARGE_ALL command to
> +                                        * make DRAM state machine go to IDLE state
> +                                        */
> +                                       dram_init_command(DCMD_PREA(rk));
> +
> +                                       mrc_post_code(0x06, (0x30 + ((ch << 4) | rk)));
> +
> +                                       /*
> +                                        * COARSE WRITE LEVEL:
> +                                        * check that we're on the correct clock edge
> +                                        */
> +
> +                                       /* hte reconfiguration request */
> +                                       mrc_params->hte_setup = 1;
> +
> +                                       /* start CRS_WR_LVL with WDQS = WDQS + 128 PI */
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               delay[bl] = get_wdqs(ch, rk, bl) + FULL_CLK;
> +                                               set_wdqs(ch, rk, bl, delay[bl]);
> +                                               /*
> +                                                * program WDQ timings based on WDQS
> +                                                * (WDQ = WDQS - 32 PI)
> +                                                */
> +                                               set_wdq(ch, rk, bl, (delay[bl] - QRTR_CLK));
> +                                       }
> +
> +                                       /* get an address in the targeted channel/rank */
> +                                       address = get_addr(ch, rk);
> +                                       do {
> +                                               uint32_t coarse_result = 0x00;
> +                                               uint32_t coarse_result_mask = byte_lane_mask(mrc_params);
> +                                               /* assume pass */
> +                                               all_edges_found = true;
> +
> +                                               mrc_params->hte_setup = 1;
> +                                               coarse_result = check_rw_coarse(mrc_params, address);
> +
> +                                               /* check for failures and margin the byte lane back 128 PI (1 CLK) */
> +                                               for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                                       if (coarse_result & (coarse_result_mask << bl)) {
> +                                                               all_edges_found = false;
> +                                                               delay[bl] -= FULL_CLK;
> +                                                               set_wdqs(ch, rk, bl, delay[bl]);
> +                                                               /* program WDQ timings based on WDQS (WDQ = WDQS - 32 PI) */
> +                                                               set_wdq(ch, rk, bl, (delay[bl] - QRTR_CLK));
> +                                                       }
> +                                               }
> +                                       } while (!all_edges_found);
> +
> +#ifdef R2R_SHARING
> +                                       /* increment "num_ranks_enabled" */
> +                                       num_ranks_enabled++;
> +                                       /* accumulate "final_delay[][]" values from "delay[]" values for rolling average */
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               final_delay[ch][bl] += delay[bl];
> +                                               set_wdqs(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled));
> +                                               /* program WDQ timings based on WDQS (WDQ = WDQS - 32 PI) */
> +                                               set_wdq(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled) - QRTR_CLK);
> +                                       }
> +#endif
> +#endif
> +                               }
> +                       }
> +               }
> +       }
> +
> +       LEAVEFN();
> +}
> +
> +void prog_page_ctrl(struct mrc_params *mrc_params)
> +{
> +       u32 dpmc0;
> +
> +       ENTERFN();
> +
> +       dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
> +       dpmc0 &= ~(BIT16 | BIT17 | BIT18);
> +       dpmc0 |= (4 << 16);
> +       dpmc0 |= BIT21;
> +       msg_port_write(MEM_CTLR, DPMC0, dpmc0);
> +}
> +
> +/*
> + * This function will perform the READ TRAINING Algorithm on all
> + * channels/ranks/byte_lanes simultaneously to minimize execution time.
> + *
> + * The idea here is to train the VREF and RDQS (and eventually RDQ) values
> + * to achieve maximum READ margins. The algorithm will first determine the
> + * X coordinate (RDQS setting). This is done by collapsing the VREF eye
> + * until we find a minimum required RDQS eye for VREF_MIN and VREF_MAX.
> + * Then we take the averages of the RDQS eye at VREF_MIN and VREF_MAX,
> + * then average those; this will be the final X coordinate. The algorithm
> + * will then determine the Y coordinate (VREF setting). This is done by
> + * collapsing the RDQS eye until we find a minimum required VREF eye for
> + * RDQS_MIN and RDQS_MAX. Then we take the averages of the VREF eye at
> + * RDQS_MIN and RDQS_MAX, then average those; this will be the final Y
> + * coordinate.
> + *
> + * NOTE: this algorithm assumes the eye curves have a one-to-one relationship,
> + * meaning for each X the curve has only one Y and vice versa.
> + */
> +void rd_train(struct mrc_params *mrc_params)
> +{
> +       uint8_t ch;     /* channel counter */
> +       uint8_t rk;     /* rank counter */
> +       uint8_t bl;     /* byte lane counter */
> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
> +#ifdef BACKUP_RDQS
> +#else
> +       uint8_t side_x; /* tracks LEFT/RIGHT approach vectors */
> +       uint8_t side_y; /* tracks BOTTOM/TOP approach vectors */
> +       /* X coordinate data (passing RDQS values) for approach vectors */
> +       uint8_t x_coordinate[2][2][NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
> +       /* Y coordinate data (passing VREF values) for approach vectors */
> +       uint8_t y_coordinate[2][2][NUM_CHANNELS][NUM_BYTE_LANES];
> +       /* centered X (RDQS) */
> +       uint8_t x_center[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
> +       /* centered Y (VREF) */
> +       uint8_t y_center[NUM_CHANNELS][NUM_BYTE_LANES];
> +       uint32_t address;       /* target address for check_bls_ex() */
> +       uint32_t result;        /* result of check_bls_ex() */
> +       uint32_t bl_mask;       /* byte lane mask for result checking */
> +#ifdef R2R_SHARING
> +       /* used to find placement for rank2rank sharing configs */
> +       uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
> +       /* used to find placement for rank2rank sharing configs */
> +       uint32_t num_ranks_enabled = 0;
> +#endif
> +#endif
> +
> +       /* rd_train starts */
> +       mrc_post_code(0x07, 0x00);
> +
> +       ENTERFN();
> +
> +#ifdef BACKUP_RDQS
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                               if (mrc_params->rank_enables & (1 << rk)) {
> +                                       for (bl = 0;
> +                                            bl < (NUM_BYTE_LANES / bl_divisor);
> +                                            bl++) {
> +                                               set_rdqs(ch, rk, bl, ddr_rdqs[PLATFORM_ID]);
> +                                       }
> +                               }
> +                       }
> +               }
> +       }
> +#else
> +       /* initialise x/y_coordinate arrays */
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                               if (mrc_params->rank_enables & (1 << rk)) {
> +                                       for (bl = 0;
> +                                            bl < (NUM_BYTE_LANES / bl_divisor);
> +                                            bl++) {
> +                                               /* x_coordinate */
> +                                               x_coordinate[L][B][ch][rk][bl] = RDQS_MIN;
> +                                               x_coordinate[R][B][ch][rk][bl] = RDQS_MAX;
> +                                               x_coordinate[L][T][ch][rk][bl] = RDQS_MIN;
> +                                               x_coordinate[R][T][ch][rk][bl] = RDQS_MAX;
> +                                               /* y_coordinate */
> +                                               y_coordinate[L][B][ch][bl] = VREF_MIN;
> +                                               y_coordinate[R][B][ch][bl] = VREF_MIN;
> +                                               y_coordinate[L][T][ch][bl] = VREF_MAX;
> +                                               y_coordinate[R][T][ch][bl] = VREF_MAX;
> +                                       }
> +                               }
> +                       }
> +               }
> +       }
> +
> +       /* initialize other variables */
> +       bl_mask = byte_lane_mask(mrc_params);
> +       address = get_addr(0, 0);
> +
> +#ifdef R2R_SHARING
> +       /* need to set "final_delay[][]" elements to "0" */
> +       memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
> +#endif
> +
> +       /* look for passing coordinates */
> +       for (side_y = B; side_y <= T; side_y++) {
> +               for (side_x = L; side_x <= R; side_x++) {
> +                       mrc_post_code(0x07, (0x10 + (side_y * 2) + (side_x)));
> +
> +                       /* find passing values */
> +                       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +                               if (mrc_params->channel_enables & (0x1 << ch)) {
> +                                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                                               if (mrc_params->rank_enables &
> +                                                       (0x1 << rk)) {
> +                                                       /* set x/y_coordinate search starting settings */
> +                                                       for (bl = 0;
> +                                                            bl < (NUM_BYTE_LANES / bl_divisor);
> +                                                            bl++) {
> +                                                               set_rdqs(ch, rk, bl,
> +                                                                        x_coordinate[side_x][side_y][ch][rk][bl]);
> +                                                               set_vref(ch, bl,
> +                                                                        y_coordinate[side_x][side_y][ch][bl]);
> +                                                       }
> +
> +                                                       /* get an address in the target channel/rank */
> +                                                       address = get_addr(ch, rk);
> +
> +                                                       /* request HTE reconfiguration */
> +                                                       mrc_params->hte_setup = 1;
> +
> +                                                       /* test the settings */
> +                                                       do {
> +                                                               /* result[07:00] == failing byte lane (MAX 8) */
> +                                                               result = check_bls_ex(mrc_params, address);
> +
> +                                                               /* check for failures */
> +                                                               if (result & 0xFF) {
> +                                                                       /* at least 1 byte lane failed */

I'm pretty sure this block can be moved into its own function.
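
A rough sketch of what I have in mind, as a standalone example rather than a drop-in replacement. The struct rd_eye type, the rd_train_step() name and the result enum are made up for illustration, the limit values are placeholders (the real ones come from the MRC headers), and the state is bundled per byte lane here although the patch keeps VREF per channel/lane and RDQS per channel/rank/lane:

```c
#include <stdint.h>

/* Placeholder limits, assumed for illustration only */
#define RDQS_MIN	0
#define RDQS_MAX	63
#define RDQS_STEP	1
#define MIN_RDQS_EYE	10
#define VREF_MIN	0
#define VREF_MAX	63
#define VREF_STEP	1
#define MIN_VREF_EYE	10

enum { L = 0, R = 1 };	/* left/right approach (side_x) */
enum { B = 0, T = 1 };	/* bottom/top approach (side_y) */

/* Hypothetical per-byte-lane search state, indexed [side_x][side_y] */
struct rd_eye {
	uint8_t x[2][2];	/* RDQS edges */
	uint8_t y[2][2];	/* VREF edges */
};

enum rd_step_result {
	RD_STEP_RDQS,	/* only RDQS moved: caller reprograms RDQS */
	RD_STEP_VREF,	/* VREF moved, RDQS restarted: reprogram both */
	RD_STEP_FAIL,	/* VREF eye collapsed: caller reports the error */
};

/* Advance the search for one failing byte lane by one step */
enum rd_step_result rd_train_step(struct rd_eye *e, int side_x, int side_y)
{
	/* move the failing RDQS edge inwards */
	if (side_x == L)
		e->x[L][side_y] += RDQS_STEP;
	else
		e->x[R][side_y] -= RDQS_STEP;

	/* RDQS eye still wide enough at this VREF? */
	if (e->x[L][side_y] <= RDQS_MAX - MIN_RDQS_EYE &&
	    e->x[R][side_y] >= RDQS_MIN + MIN_RDQS_EYE &&
	    e->x[L][side_y] != e->x[R][side_y])
		return RD_STEP_RDQS;

	/* not enough RDQS margin: move the VREF edge inwards */
	if (side_y == B)
		e->y[side_x][B] += VREF_STEP;
	else
		e->y[side_x][T] -= VREF_STEP;

	/* VREF eye collapsed below the minimum? */
	if (e->y[side_x][B] > VREF_MAX - MIN_VREF_EYE ||
	    e->y[side_x][T] < VREF_MIN + MIN_VREF_EYE ||
	    e->y[side_x][B] == e->y[side_x][T])
		return RD_STEP_FAIL;

	/* restart the RDQS search at the new VREF */
	e->x[side_x][side_y] = (side_x == L) ? RDQS_MIN : RDQS_MAX;
	return RD_STEP_VREF;
}
```

The caller would keep the register writes: set_vref() on RD_STEP_VREF, training_message() plus the 0xEE post code on RD_STEP_FAIL, and set_rdqs() in every case, as the code below does today. That should take several levels of indentation out of rd_train().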

> +                                                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                                                               if (result &
> +                                                                                       (bl_mask << bl)) {
> +                                                                                       /* adjust the RDQS values accordingly */
> +                                                                                       if (side_x == L)
> +                                                                                               x_coordinate[L][side_y][ch][rk][bl] += RDQS_STEP;
> +                                                                                       else
> +                                                                                               x_coordinate[R][side_y][ch][rk][bl] -= RDQS_STEP;
> +
> +                                                                                       /* check that we haven't closed the RDQS_EYE too much */
> +                                                                                       if ((x_coordinate[L][side_y][ch][rk][bl] > (RDQS_MAX - MIN_RDQS_EYE)) ||
> +                                                                                               (x_coordinate[R][side_y][ch][rk][bl] < (RDQS_MIN + MIN_RDQS_EYE)) ||
> +                                                                                               (x_coordinate[L][side_y][ch][rk][bl] ==
> +                                                                                               x_coordinate[R][side_y][ch][rk][bl])) {
> +                                                                                               /*
> +                                                                                                * not enough RDQS margin available at this VREF
> +                                                                                                * update VREF values accordingly
> +                                                                                                */
> +                                                                                               if (side_y == B)
> +                                                                                                       y_coordinate[side_x][B][ch][bl] += VREF_STEP;
> +                                                                                               else
> +                                                                                                       y_coordinate[side_x][T][ch][bl] -= VREF_STEP;
> +
> +                                                                                               /* check that we haven't closed the VREF_EYE too much */
> +                                                                                               if ((y_coordinate[side_x][B][ch][bl] > (VREF_MAX - MIN_VREF_EYE)) ||
> +                                                                                                       (y_coordinate[side_x][T][ch][bl] < (VREF_MIN + MIN_VREF_EYE)) ||
> +                                                                                                       (y_coordinate[side_x][B][ch][bl] == y_coordinate[side_x][T][ch][bl])) {
> +                                                                                                       /* VREF_EYE collapsed below MIN_VREF_EYE */
> +                                                                                                       training_message(ch, rk, bl);
> +                                                                                                       mrc_post_code(0xEE, (0x70 + (side_y * 2) + (side_x)));
> +                                                                                               } else {
> +                                                                                                       /* update the VREF setting */
> +                                                                                                       set_vref(ch, bl, y_coordinate[side_x][side_y][ch][bl]);
> +                                                                                                       /* reset the X coordinate to begin the search at the new VREF */
> +                                                                                                       x_coordinate[side_x][side_y][ch][rk][bl] =
> +                                                                                                               (side_x == L) ? (RDQS_MIN) : (RDQS_MAX);
> +                                                                                               }
> +                                                                                       }
> +
> +                                                                                       /* update the RDQS setting */
> +                                                                                       set_rdqs(ch, rk, bl, x_coordinate[side_x][side_y][ch][rk][bl]);
> +                                                                               }
> +                                                                       }
> +                                                               }
> +                                                       } while (result & 0xFF);
> +                                               }
> +                                       }
> +                               }
> +                       }
> +               }
> +       }
> +
> +       mrc_post_code(0x07, 0x20);
> +
> +       /* find final RDQS (X coordinate) & final VREF (Y coordinate) */
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                               if (mrc_params->rank_enables & (1 << rk)) {
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               uint32_t temp1;
> +                                               uint32_t temp2;
> +
> +                                               /* x_coordinate */
> +                                               DPF(D_INFO,
> +                                                   "RDQS T/B eye rank%d lane%d : %d-%d %d-%d\n",
> +                                                   rk, bl,
> +                                                   x_coordinate[L][T][ch][rk][bl],
> +                                                   x_coordinate[R][T][ch][rk][bl],
> +                                                   x_coordinate[L][B][ch][rk][bl],
> +                                                   x_coordinate[R][B][ch][rk][bl]);
> +
> +                                               /* average the TOP side LEFT & RIGHT values */
> +                                               temp1 = (x_coordinate[R][T][ch][rk][bl] + x_coordinate[L][T][ch][rk][bl]) / 2;
> +                                               /* average the BOTTOM side LEFT & RIGHT values */
> +                                               temp2 = (x_coordinate[R][B][ch][rk][bl] + x_coordinate[L][B][ch][rk][bl]) / 2;
> +                                               /* average the above averages */
> +                                               x_center[ch][rk][bl] = (uint8_t) ((temp1 + temp2) / 2);
> +
> +                                               /* y_coordinate */
> +                                               DPF(D_INFO,
> +                                                   "VREF R/L eye lane%d : %d-%d %d-%d\n",
> +                                                   bl,
> +                                                   y_coordinate[R][B][ch][bl],
> +                                                   y_coordinate[R][T][ch][bl],
> +                                                   y_coordinate[L][B][ch][bl],
> +                                                   y_coordinate[L][T][ch][bl]);
> +
> +                                               /* average the RIGHT side TOP & BOTTOM values */
> +                                               temp1 = (y_coordinate[R][T][ch][bl] + y_coordinate[R][B][ch][bl]) / 2;
> +                                               /* average the LEFT side TOP & BOTTOM values */
> +                                               temp2 = (y_coordinate[L][T][ch][bl] + y_coordinate[L][B][ch][bl]) / 2;
> +                                               /* average the above averages */
> +                                               y_center[ch][bl] = (uint8_t) ((temp1 + temp2) / 2);
> +                                       }
> +                               }
> +                       }
> +               }
> +       }
> +
> +#ifdef RX_EYE_CHECK
> +       /* perform an eye check */
> +       for (side_y = B; side_y <= T; side_y++) {
> +               for (side_x = L; side_x <= R; side_x++) {
> +                       mrc_post_code(0x07, (0x30 + (side_y * 2) + (side_x)));
> +
> +                       /* update the settings for the eye check */
> +                       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +                               if (mrc_params->channel_enables & (1 << ch)) {
> +                                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                                               if (mrc_params->rank_enables & (1 << rk)) {
> +                                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                                               if (side_x == L)
> +                                                                       set_rdqs(ch, rk, bl, (x_center[ch][rk][bl] - (MIN_RDQS_EYE / 2)));
> +                                                               else
> +                                                                       set_rdqs(ch, rk, bl, (x_center[ch][rk][bl] + (MIN_RDQS_EYE / 2)));
> +
> +                                                               if (side_y == B)
> +                                                                       set_vref(ch, bl, (y_center[ch][bl] - (MIN_VREF_EYE / 2)));
> +                                                               else
> +                                                                       set_vref(ch, bl, (y_center[ch][bl] + (MIN_VREF_EYE / 2)));
> +                                                       }
> +                                               }
> +                                       }
> +                               }
> +                       }
> +
> +                       /* request HTE reconfiguration */
> +                       mrc_params->hte_setup = 1;
> +
> +                       /* check the eye */
> +                       if (check_bls_ex(mrc_params, address) & 0xFF) {
> +                               /* one or more byte lanes failed */
> +                               mrc_post_code(0xEE, (0x74 + (side_x * 2) + (side_y)));
> +                       }
> +               }
> +       }
> +#endif
> +
> +       mrc_post_code(0x07, 0x40);
> +
> +       /* set final placements */
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                               if (mrc_params->rank_enables & (1 << rk)) {
> +#ifdef R2R_SHARING
> +                                       /* increment "num_ranks_enabled" */
> +                                       num_ranks_enabled++;
> +#endif
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               /* x_coordinate */
> +#ifdef R2R_SHARING
> +                                               final_delay[ch][bl] += x_center[ch][rk][bl];
> +                                               set_rdqs(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled));
> +#else
> +                                               set_rdqs(ch, rk, bl, x_center[ch][rk][bl]);
> +#endif
> +                                               /* y_coordinate */
> +                                               set_vref(ch, bl, y_center[ch][bl]);
> +                                       }
> +                               }
> +                       }
> +               }
> +       }
> +#endif
> +
> +       LEAVEFN();
> +}
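
For reference, the "average of averages" placement described in the function comment boils down to the following. This is only a standalone sketch; eye_center() is a made-up name and nothing like it exists in the patch:

```c
#include <stdint.h>

/*
 * Center of an eye measured along two approach vectors: take the midpoint
 * of each pair of edges, then the midpoint of those two midpoints.
 * Each integer division truncates, as in rd_train().
 */
uint8_t eye_center(uint8_t lo_a, uint8_t hi_a, uint8_t lo_b, uint8_t hi_b)
{
	uint32_t mid_a = ((uint32_t)lo_a + hi_a) / 2;
	uint32_t mid_b = ((uint32_t)lo_b + hi_b) / 2;

	return (uint8_t)((mid_a + mid_b) / 2);
}
```

For x_center the two pairs are the LEFT/RIGHT RDQS edges found on the TOP and BOTTOM approaches; for y_center they are the BOTTOM/TOP VREF edges found on the RIGHT and LEFT approaches. A helper along these lines would replace both temp1/temp2 blocks.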
> +
> +/*
> + * This function will perform the WRITE TRAINING Algorithm on all
> + * channels/ranks/byte_lanes simultaneously to minimize execution time.
> + *
> + * The idea here is to train the WDQ timings to achieve maximum WRITE margins.
> + * The algorithm will start with WDQ at the current WDQ setting (tracks WDQS
> + * in WR_LVL) +/- 32 PIs (+/- 1/4 CLK) and collapse the eye until all data
> + * patterns pass. This is because WDQS will be aligned to WCLK by the
> + * Write Leveling algorithm and WDQ will only ever have a 1/2 CLK window
> + * of validity.
> + */
> +void wr_train(struct mrc_params *mrc_params)
> +{
> +       uint8_t ch;     /* channel counter */
> +       uint8_t rk;     /* rank counter */
> +       uint8_t bl;     /* byte lane counter */
> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
> +#ifdef BACKUP_WDQ
> +#else
> +       uint8_t side;           /* LEFT/RIGHT side indicator (0=L, 1=R) */
> +       uint32_t temp;          /* temporary DWORD */
> +       /* 2 arrays, for L & R side passing delays */
> +       uint32_t delay[2][NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
> +       uint32_t address;       /* target address for check_bls_ex() */
> +       uint32_t result;        /* result of check_bls_ex() */
> +       uint32_t bl_mask;       /* byte lane mask for result checking */
> +#ifdef R2R_SHARING
> +       /* used to find placement for rank2rank sharing configs */
> +       uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
> +       /* used to find placement for rank2rank sharing configs */
> +       uint32_t num_ranks_enabled = 0;
> +#endif
> +#endif
> +
> +       /* wr_train starts */
> +       mrc_post_code(0x08, 0x00);
> +
> +       ENTERFN();
> +
> +#ifdef BACKUP_WDQ
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                               if (mrc_params->rank_enables & (1 << rk)) {
> +                                       for (bl = 0;
> +                                            bl < (NUM_BYTE_LANES / bl_divisor);
> +                                            bl++) {
> +                                               set_wdq(ch, rk, bl, ddr_wdq[PLATFORM_ID]);
> +                                       }
> +                               }
> +                       }
> +               }
> +       }
> +#else
> +       /* initialise "delay" */
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                               if (mrc_params->rank_enables & (1 << rk)) {
> +                                       for (bl = 0;
> +                                            bl < (NUM_BYTE_LANES / bl_divisor);
> +                                            bl++) {
> +                                               /*
> +                                                * want to start with
> +                                                * WDQ = (WDQS - QRTR_CLK)
> +                                                * +/- QRTR_CLK
> +                                                */
> +                                               temp = get_wdqs(ch, rk, bl) - QRTR_CLK;
> +                                               delay[L][ch][rk][bl] = temp - QRTR_CLK;
> +                                               delay[R][ch][rk][bl] = temp + QRTR_CLK;
> +                                       }
> +                               }
> +                       }
> +               }
> +       }
> +
> +       /* initialise other variables */
> +       bl_mask = byte_lane_mask(mrc_params);
> +       address = get_addr(0, 0);
> +
> +#ifdef R2R_SHARING
> +       /* need to set "final_delay[][]" elements to "0" */
> +       memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
> +#endif
> +
> +       /*
> +        * start algorithm on the LEFT side and train each channel/bl
> +        * until no failures are observed, then repeat for the RIGHT side.
> +        */
> +       for (side = L; side <= R; side++) {
> +               mrc_post_code(0x08, (0x10 + (side)));
> +
> +               /* set starting values */
> +               for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +                       if (mrc_params->channel_enables & (1 << ch)) {
> +                               for (rk = 0; rk < NUM_RANKS; rk++) {
> +                                       if (mrc_params->rank_enables &
> +                                               (1 << rk)) {
> +                                               for (bl = 0;
> +                                                    bl < (NUM_BYTE_LANES / bl_divisor);
> +                                                    bl++) {
> +                                                       set_wdq(ch, rk, bl, delay[side][ch][rk][bl]);
> +                                               }
> +                                       }
> +                               }
> +                       }
> +               }
> +
> +               /* find passing values */
> +               for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +                       if (mrc_params->channel_enables & (1 << ch)) {
> +                               for (rk = 0; rk < NUM_RANKS; rk++) {
> +                                       if (mrc_params->rank_enables &
> +                                               (1 << rk)) {
> +                                               /* get an address in the target channel/rank */
> +                                               address = get_addr(ch, rk);
> +
> +                                               /* request HTE reconfiguration */
> +                                               mrc_params->hte_setup = 1;
> +
> +                                               /* check the settings */
> +                                               do {
> +                                                       /* result[07:00] == failing byte lane (MAX 8) */
> +                                                       result = check_bls_ex(mrc_params, address);
> +                                                       /* check for failures */
> +                                                       if (result & 0xFF) {
> +                                                               /* at least 1 byte lane failed */
> +                                                               for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                                                       if (result &
> +                                                                               (bl_mask << bl)) {
> +                                                                               if (side == L)
> +                                                                                       delay[L][ch][rk][bl] += WDQ_STEP;
> +                                                                               else
> +                                                                                       delay[R][ch][rk][bl] -= WDQ_STEP;
> +
> +                                                                               /* check for algorithm failure */
> +                                                                               if (delay[L][ch][rk][bl] != delay[R][ch][rk][bl]) {
> +                                                                                       /*
> +                                                                                        * margin available
> +                                                                                        * update delay setting
> +                                                                                        */
> +                                                                                       set_wdq(ch, rk, bl,
> +                                                                                               delay[side][ch][rk][bl]);
> +                                                                               } else {
> +                                                                                       /*
> +                                                                                        * no margin available
> +                                                                                        * notify the user and halt
> +                                                                                        */
> +                                                                                       training_message(ch, rk, bl);
> +                                                                                       mrc_post_code(0xEE, (0x80 + side));
> +                                                                               }
> +                                                                       }
> +                                                               }
> +                                                       }
> +                                               /* stop when all byte lanes pass */
> +                                               } while (result & 0xFF);
> +                                       }
> +                               }
> +                       }
> +               }
> +       }
> +
> +       /* program WDQ to the middle of passing window */
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               if (mrc_params->channel_enables & (1 << ch)) {
> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
> +                               if (mrc_params->rank_enables & (1 << rk)) {
> +#ifdef R2R_SHARING
> +                                       /* increment "num_ranks_enabled" */
> +                                       num_ranks_enabled++;
> +#endif
> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
> +                                               DPF(D_INFO,
> +                                                   "WDQ eye rank%d lane%d : %d-%d\n",
> +                                                   rk, bl,
> +                                                   delay[L][ch][rk][bl],
> +                                                   delay[R][ch][rk][bl]);
> +
> +                                               temp = (delay[R][ch][rk][bl] + delay[L][ch][rk][bl]) / 2;
> +
> +#ifdef R2R_SHARING
> +                                               final_delay[ch][bl] += temp;
> +                                               set_wdq(ch, rk, bl,
> +                                                       ((final_delay[ch][bl]) / num_ranks_enabled));
> +#else
> +                                               set_wdq(ch, rk, bl, temp);
> +#endif
> +                                       }
> +                               }
> +                       }
> +               }
> +       }
> +#endif
> +
> +       LEAVEFN();
> +}
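
To make the starting window in wr_train() concrete, here is a small standalone sketch. QRTR_CLK = 32 is taken from the "32 PI" comments above; struct wdq_window and both function names are invented for illustration. The sketch assumes WDQS is at least 2 * QRTR_CLK so the unsigned subtraction does not wrap, which is something the patch itself does not check:

```c
#include <assert.h>
#include <stdint.h>

#define QRTR_CLK	32	/* 1/4 CLK in PI steps, per the comments */

/* Hypothetical container for the LEFT/RIGHT passing delays of one lane */
struct wdq_window {
	uint32_t left;
	uint32_t right;
};

/* Starting window: WDQ = (WDQS - QRTR_CLK) +/- QRTR_CLK */
struct wdq_window wdq_start_window(uint32_t wdqs)
{
	struct wdq_window w;
	uint32_t nominal;

	/* assumption of this sketch: no unsigned wrap-around */
	assert(wdqs >= 2 * QRTR_CLK);

	nominal = wdqs - QRTR_CLK;
	w.left = nominal - QRTR_CLK;
	w.right = nominal + QRTR_CLK;

	return w;
}

/* Final placement: middle of the passing window */
uint32_t wdq_midpoint(struct wdq_window w)
{
	return (w.right + w.left) / 2;
}
```

With WDQS = 128 the search starts at 64..128. If the LEFT edge ends up at 70 and the RIGHT edge at 120 after training, WDQ is programmed to 95.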
> +
> +/*
> + * This function will store relevant timing data
> + *
> + * This data will be used on subsequent boots to speed up boot times
> + * and is required for Suspend To RAM capabilities.
> + */
> +void store_timings(struct mrc_params *mrc_params)
> +{
> +       uint8_t ch, rk, bl;
> +       struct mrc_timings *mt = &mrc_params->timings;
> +
> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
> +               for (rk = 0; rk < NUM_RANKS; rk++) {
> +                       for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
> +                               mt->rcvn[ch][rk][bl] = get_rcvn(ch, rk, bl);
> +                               mt->rdqs[ch][rk][bl] = get_rdqs(ch, rk, bl);
> +                               mt->wdqs[ch][rk][bl] = get_wdqs(ch, rk, bl);
> +                               mt->wdq[ch][rk][bl] = get_wdq(ch, rk, bl);
> +
> +                               if (rk == 0)
> +                                       mt->vref[ch][bl] = get_vref(ch, bl);
> +                       }
> +
> +                       mt->wctl[ch][rk] = get_wctl(ch, rk);
> +               }
> +
> +               mt->wcmd[ch] = get_wcmd(ch);
> +       }
> +
> +       /* need to save for a case of changing frequency after warm reset */
> +       mt->ddr_speed = mrc_params->ddr_speed;
> +}
> +
> +/*
> + * The purpose of this function is to ensure the SEC comes out of reset
> + * and IA initiates the SEC enabling Memory Scrambling.
> + */
> +void enable_scrambling(struct mrc_params *mrc_params)
> +{
> +       uint32_t lfsr = 0;
> +       uint8_t i;
> +
> +       if (mrc_params->scrambling_enables == 0)
> +               return;
> +
> +       ENTERFN();
> +
> +       /* 32 bit seed is always stored in BIOS NVM */
> +       lfsr = mrc_params->timings.scrambler_seed;
> +
> +       if (mrc_params->boot_mode == BM_COLD) {
> +               /*
> +                * factory value is 0 and in first boot,
> +                * a clock based seed is loaded.
> +                */
> +               if (lfsr == 0) {
> +                       /*
> +                        * get seed from system clock
> +                        * and make sure it is not all 1's
> +                        */
> +                       lfsr = rdtsc() & 0x0FFFFFFF;
> +               } else {
> +                       /*
> +                        * Need to replace scrambler
> +                        *
> +                        * get next 32bit LFSR 16 times which is the last
> +                        * part of the previous scrambler vector
> +                        */
> +                       for (i = 0; i < 16; i++)
> +                               lfsr32(&lfsr);
> +               }
> +
> +               /* save new seed */
> +               mrc_params->timings.scrambler_seed = lfsr;
> +       }
> +
> +       /*
> +        * In warm boot or S3 exit, we have the previous seed.
> +        * In cold boot, we have the last 32bit LFSR which is the new seed.
> +        */
> +       lfsr32(&lfsr);  /* shift to next value */
> +       msg_port_write(MEM_CTLR, SCRMSEED, (lfsr & 0x0003FFFF));
> +
> +       for (i = 0; i < 2; i++)
> +               msg_port_write(MEM_CTLR, SCRMLO + i, (lfsr & 0xAAAAAAAA));
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * Configure MCU Power Management Control Register
> + * and Scheduler Control Register
> + */
> +void prog_ddr_control(struct mrc_params *mrc_params)
> +{
> +       u32 dsch;
> +       u32 dpmc0;
> +
> +       ENTERFN();
> +
> +       dsch = msg_port_read(MEM_CTLR, DSCH);
> +       dsch &= ~(BIT8 | BIT9 | BIT12);
> +       msg_port_write(MEM_CTLR, DSCH, dsch);
> +
> +       dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
> +       dpmc0 &= ~BIT25;
> +       dpmc0 |= (mrc_params->power_down_disable << 25);
> +       dpmc0 &= ~BIT24;
> +       dpmc0 &= ~(BIT16 | BIT17 | BIT18);
> +       dpmc0 |= (4 << 16);
> +       dpmc0 |= BIT21;
> +       msg_port_write(MEM_CTLR, DPMC0, dpmc0);
> +
> +       /* CMDTRIST = 2h - CMD/ADDR are tristated when no valid command */
> +       mrc_write_mask(MEM_CTLR, DPMC1, 2 << 4, BIT4 | BIT5);
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * After training complete configure MCU Rank Population Register
> + * specifying: ranks enabled, device width, density, address mode
> + */
> +void prog_dra_drb(struct mrc_params *mrc_params)
> +{
> +       u32 drp;
> +       u32 dco;
> +       u8 density = mrc_params->params.density;
> +
> +       ENTERFN();
> +
> +       dco = msg_port_read(MEM_CTLR, DCO);
> +       dco &= ~BIT31;
> +       msg_port_write(MEM_CTLR, DCO, dco);
> +
> +       drp = 0;
> +       if (mrc_params->rank_enables & 1)
> +               drp |= BIT0;
> +       if (mrc_params->rank_enables & 2)
> +               drp |= BIT1;
> +       if (mrc_params->dram_width == X16) {
> +               drp |= (1 << 4);
> +               drp |= (1 << 9);
> +       }
> +
> +       /*
> +        * Density encoding in struct dram_params: 0=512Mb, 1=Gb, 2=2Gb, 3=4Gb
> +        * has to be mapped RANKDENSx encoding (0=1Gb)
> +        */
> +       if (density == 0)
> +               density = 4;
> +
> +       drp |= ((density - 1) << 6);
> +       drp |= ((density - 1) << 11);
> +
> +       /* Address mode can be overwritten if ECC enabled */
> +       drp |= (mrc_params->address_mode << 14);
> +
> +       msg_port_write(MEM_CTLR, DRP, drp);
> +
> +       dco &= ~BIT28;
> +       dco |= BIT31;
> +       msg_port_write(MEM_CTLR, DCO, dco);
> +
> +       LEAVEFN();
> +}
> +
> +/* Send DRAM wake command */
> +void perform_wake(struct mrc_params *mrc_params)
> +{
> +       ENTERFN();
> +
> +       dram_wake_command();
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * Configure refresh rate and short ZQ calibration interval
> + * Activate dynamic self refresh
> + */
> +void change_refresh_period(struct mrc_params *mrc_params)
> +{
> +       u32 drfc;
> +       u32 dcal;
> +       u32 dpmc0;
> +
> +       ENTERFN();
> +
> +       drfc = msg_port_read(MEM_CTLR, DRFC);
> +       drfc &= ~(BIT12 | BIT13 | BIT14);
> +       drfc |= (mrc_params->refresh_rate << 12);
> +       drfc |= BIT21;
> +       msg_port_write(MEM_CTLR, DRFC, drfc);
> +
> +       dcal = msg_port_read(MEM_CTLR, DCAL);
> +       dcal &= ~(BIT8 | BIT9 | BIT10);
> +       dcal |= (3 << 8);       /* 63ms */
> +       msg_port_write(MEM_CTLR, DCAL, dcal);
> +
> +       dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
> +       dpmc0 |= (BIT23 | BIT29);
> +       msg_port_write(MEM_CTLR, DPMC0, dpmc0);
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * Configure DDRPHY for Auto-Refresh, Periodic Compensations,
> + * Dynamic Diff-Amp, ZQSPERIOD, Auto-Precharge, CKE Power-Down
> + */
> +void set_auto_refresh(struct mrc_params *mrc_params)
> +{
> +       uint32_t channel;
> +       uint32_t rank;
> +       uint32_t bl;
> +       uint32_t bl_divisor = 1;
> +       uint32_t temp;
> +
> +       ENTERFN();
> +
> +       /*
> +        * Enable Auto-Refresh, Periodic Compensations, Dynamic Diff-Amp,
> +        * ZQSPERIOD, Auto-Precharge, CKE Power-Down
> +        */
> +       for (channel = 0; channel < NUM_CHANNELS; channel++) {
> +               if (mrc_params->channel_enables & (1 << channel)) {
> +                       /* Enable Periodic RCOMPS */
> +                       mrc_alt_write_mask(DDRPHY, CMPCTRL, BIT1, BIT1);
> +
> +                       /* Enable Dynamic DiffAmp & Set Read ODT Value */
> +                       switch (mrc_params->rd_odt_value) {
> +                       case 0:
> +                               temp = 0x3F;    /* OFF */
> +                               break;
> +                       default:
> +                               temp = 0x00;    /* Auto */
> +                               break;
> +                       }
> +
> +                       for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) {
> +                               /* Override: DIFFAMP, ODT */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B0OVRCTL + (bl * DDRIODQ_BL_OFFSET) +
> +                                       (channel * DDRIODQ_CH_OFFSET)),
> +                                       (0x00 << 16) | (temp << 10),
> +                                       (BIT21 | BIT20 | BIT19 | BIT18 |
> +                                        BIT17 | BIT16 | BIT15 | BIT14 |
> +                                        BIT13 | BIT12 | BIT11 | BIT10));
> +
> +                               /* Override: DIFFAMP, ODT */
> +                               mrc_alt_write_mask(DDRPHY,
> +                                       (B1OVRCTL + (bl * DDRIODQ_BL_OFFSET) +
> +                                       (channel * DDRIODQ_CH_OFFSET)),
> +                                       (0x00 << 16) | (temp << 10),
> +                                       (BIT21 | BIT20 | BIT19 | BIT18 |
> +                                        BIT17 | BIT16 | BIT15 | BIT14 |
> +                                        BIT13 | BIT12 | BIT11 | BIT10));
> +                       }
> +
> +                       /* Issue ZQCS command */
> +                       for (rank = 0; rank < NUM_RANKS; rank++) {
> +                               if (mrc_params->rank_enables & (1 << rank))
> +                                       dram_init_command(DCMD_ZQCS(rank));
> +                       }
> +               }
> +       }
> +
> +       clear_pointers();
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * Depending on configuration enables ECC support
> + *
> + * Available memory size is decreased, and updated with 0s
> + * in order to clear error status. Address mode 2 forced.
> + */
> +void ecc_enable(struct mrc_params *mrc_params)
> +{
> +       u32 drp;
> +       u32 dsch;
> +       u32 ecc_ctrl;
> +
> +       if (mrc_params->ecc_enables == 0)
> +               return;
> +
> +       ENTERFN();
> +
> +       /* Configuration required in ECC mode */
> +       drp = msg_port_read(MEM_CTLR, DRP);
> +       drp &= ~(BIT14 | BIT15);
> +       drp |= BIT15;
> +       drp |= BIT13;
> +       msg_port_write(MEM_CTLR, DRP, drp);
> +
> +       /* Disable new request bypass */
> +       dsch = msg_port_read(MEM_CTLR, DSCH);
> +       dsch |= BIT12;
> +       msg_port_write(MEM_CTLR, DSCH, dsch);
> +
> +       /* Enable ECC */
> +       ecc_ctrl = (BIT0 | BIT1 | BIT17);
> +       msg_port_write(MEM_CTLR, DECCCTRL, ecc_ctrl);
> +
> +       /* Assume 8 bank memory, one bank is gone for ECC */
> +       mrc_params->mem_size -= mrc_params->mem_size / 8;
> +
> +       /* For S3 resume memory content has to be preserved */
> +       if (mrc_params->boot_mode != BM_S3) {
> +               select_hte();
> +               hte_mem_init(mrc_params, MRC_MEM_INIT);
> +               select_mem_mgr();
> +       }
> +
> +       LEAVEFN();
> +}
> +
> +/*
> + * Execute memory test
> + * if error detected it is indicated in mrc_params->status
> + */
> +void memory_test(struct mrc_params *mrc_params)
> +{
> +       uint32_t result = 0;
> +
> +       ENTERFN();
> +
> +       select_hte();
> +       result = hte_mem_init(mrc_params, MRC_MEM_TEST);
> +       select_mem_mgr();
> +
> +       DPF(D_INFO, "Memory test result %x\n", result);
> +       mrc_params->status = ((result == 0) ? MRC_SUCCESS : MRC_E_MEMTEST);
> +       LEAVEFN();
> +}
> +
> +/* Lock MCU registers at the end of initialization sequence */
> +void lock_registers(struct mrc_params *mrc_params)
> +{
> +       u32 dco;
> +
> +       ENTERFN();
> +
> +       dco = msg_port_read(MEM_CTLR, DCO);
> +       dco &= ~(BIT28 | BIT29);
> +       dco |= (BIT0 | BIT8);
> +       msg_port_write(MEM_CTLR, DCO, dco);
> +
> +       LEAVEFN();
> +}
> diff --git a/arch/x86/cpu/quark/smc.h b/arch/x86/cpu/quark/smc.h
> new file mode 100644
> index 0000000..f774cb3
> --- /dev/null
> +++ b/arch/x86/cpu/quark/smc.h
> @@ -0,0 +1,446 @@
> +/*
> + * Copyright (C) 2013, Intel Corporation
> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
> + *
> + * Ported from Intel released Quark UEFI BIOS
> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
> + *
> + * SPDX-License-Identifier:    Intel
> + */
> +
> +#ifndef _SMC_H_
> +#define _SMC_H_
> +
> +/* System Memory Controller Register Defines */
> +
> +/* Memory Controller Message Bus Registers Offsets */
> +#define DRP                    0x00
> +#define DTR0                   0x01
> +#define DTR1                   0x02
> +#define DTR2                   0x03
> +#define DTR3                   0x04
> +#define DTR4                   0x05
> +#define DPMC0                  0x06
> +#define DPMC1                  0x07
> +#define DRFC                   0x08
> +#define DSCH                   0x09
> +#define DCAL                   0x0A
> +#define DRMC                   0x0B
> +#define PMSTS                  0x0C
> +#define DCO                    0x0F
> +#define DSTAT                  0x20
> +#define SSKPD0                 0x4A
> +#define SSKPD1                 0x4B
> +#define DECCCTRL               0x60
> +#define DECCSTAT               0x61
> +#define DECCSBECNT             0x62
> +#define DECCSBECA              0x68
> +#define DECCSBECS              0x69
> +#define DECCDBECA              0x6A
> +#define DECCDBECS              0x6B
> +#define DFUSESTAT              0x70
> +#define SCRMSEED               0x80
> +#define SCRMLO                 0x81
> +#define SCRMHI                 0x82
> +
> +/* DRAM init command */
> +#define DCMD_MRS1(rnk, dat)    (0 | ((rnk) << 22) | (1 << 3) | ((dat) << 6))
> +#define DCMD_REF(rnk)          (1 | ((rnk) << 22))
> +#define DCMD_PRE(rnk)          (2 | ((rnk) << 22))
> +#define DCMD_PREA(rnk)         (2 | ((rnk) << 22) | (BIT10 << 6))
> +#define DCMD_ACT(rnk, row)     (3 | ((rnk) << 22) | ((row) << 6))
> +#define DCMD_WR(rnk, col)      (4 | ((rnk) << 22) | ((col) << 6))
> +#define DCMD_RD(rnk, col)      (5 | ((rnk) << 22) | ((col) << 6))
> +#define DCMD_ZQCS(rnk)         (6 | ((rnk) << 22))
> +#define DCMD_ZQCL(rnk)         (6 | ((rnk) << 22) | (BIT10 << 6))
> +#define DCMD_NOP(rnk)          (7 | ((rnk) << 22))

We should have a #define for the 22 and a #define for the 6, and
probably an enum for the 0, 1, 2, .. 7.

Then the C code should ideally do:

ENUM_NAME | (rnk << DCMD_XXX_SHIFT) | (col << DCMD_SHIFT)

instead of

DCMD_RD(rnk, col)
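
As a rough sketch of that suggestion (the enum and shift names below are
made up for illustration and are not in the patch; only the values 0..7,
22 and 6 come from the macros quoted above):

```c
#include <stdint.h>

/* Hypothetical opcode names; the values mirror the DCMD_* macros above */
enum dcmd_opcode {
	DCMD_OP_MRS1 = 0,
	DCMD_OP_REF,	/* 1 */
	DCMD_OP_PRE,	/* 2, also used by PREA */
	DCMD_OP_ACT,	/* 3 */
	DCMD_OP_WR,	/* 4 */
	DCMD_OP_RD,	/* 5 */
	DCMD_OP_ZQCS,	/* 6, also used by ZQCL */
	DCMD_OP_NOP,	/* 7 */
};

/* Hypothetical shift names for the rank field and the row/column/data field */
#define DCMD_RANK_SHIFT	22
#define DCMD_DATA_SHIFT	6

/* Wrapper used here only so the open-coded expression can be exercised */
static inline uint32_t dcmd_rd(uint32_t rnk, uint32_t col)
{
	return DCMD_OP_RD | (rnk << DCMD_RANK_SHIFT) | (col << DCMD_DATA_SHIFT);
}
```

In the patch itself the expression would presumably sit directly at the
call site, e.g. as the argument to dram_init_command(), rather than in a
helper like this.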

> +
> +#define DDR3_EMRS1_DIC_40      (0)
> +#define DDR3_EMRS1_DIC_34      (1)
> +
> +#define DDR3_EMRS1_RTTNOM_0    (0)
> +#define DDR3_EMRS1_RTTNOM_60   (BIT2)
> +#define DDR3_EMRS1_RTTNOM_120  (BIT6)
> +#define DDR3_EMRS1_RTTNOM_40   (BIT6 | BIT2)
> +#define DDR3_EMRS1_RTTNOM_20   (BIT9)
> +#define DDR3_EMRS1_RTTNOM_30   (BIT9 | BIT2)

Let's write out the value here

> +
> +#define DDR3_EMRS2_RTTWR_60    (BIT9)

(1 << 9)

> +#define DDR3_EMRS2_RTTWR_120   (BIT10)

(1 << 10)
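
Written out for the whole group, the defines might look like this. This
is only a sketch, and it assumes that BITn in the MRC headers means
(1 << n), which is not shown in this part of the patch:

```c
/* Same names as in the patch, with the BITn macros expanded by hand */
#define DDR3_EMRS1_RTTNOM_0	0
#define DDR3_EMRS1_RTTNOM_60	(1 << 2)
#define DDR3_EMRS1_RTTNOM_120	(1 << 6)
#define DDR3_EMRS1_RTTNOM_40	((1 << 6) | (1 << 2))
#define DDR3_EMRS1_RTTNOM_20	(1 << 9)
#define DDR3_EMRS1_RTTNOM_30	((1 << 9) | (1 << 2))

#define DDR3_EMRS2_RTTWR_60	(1 << 9)
#define DDR3_EMRS2_RTTWR_120	(1 << 10)
```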

> +
> +/* BEGIN DDRIO Registers */
> +
> +/* DDR IOs & COMPs */
> +#define DDRIODQ_BL_OFFSET      0x0800
> +#define DDRIODQ_CH_OFFSET      ((NUM_BYTE_LANES / 2) * DDRIODQ_BL_OFFSET)
> +#define DDRIOCCC_CH_OFFSET     0x0800
> +#define DDRCOMP_CH_OFFSET      0x0100
> +
> +/* CH0-BL01-DQ */
> +#define DQOBSCKEBBCTL          0x0000

Are these accessed through the msg_port? If not, we could use a struct.
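
If they do turn out to be memory-mapped, a struct overlay could look
roughly like the following. The struct and member names are hypothetical;
only the offsets are taken from the first few defines in this block, and
the whole idea does not apply if the registers are only reachable
through the message port:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical overlay for the first CH0-BL01-DQ registers */
struct ddrio_dq_regs {
	uint32_t dqobsckebbctl;	/* 0x0000 */
	uint32_t dqdlltxctl;	/* 0x0004 */
	uint32_t dqdllrxctl;	/* 0x0008 */
	uint32_t dqmdllctl;	/* 0x000C */
	uint32_t b0rxiobufctl;	/* 0x0010 */
	uint32_t b0vrefctl;	/* 0x0014 */
};

/* Compile-time check that the layout matches the #define offsets */
_Static_assert(offsetof(struct ddrio_dq_regs, dqmdllctl) == 0x000C,
	       "dqmdllctl must sit at offset 0x000C");
```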

> +#define DQDLLTXCTL             0x0004
> +#define DQDLLRXCTL             0x0008
> +#define DQMDLLCTL              0x000C
> +#define B0RXIOBUFCTL           0x0010
> +#define B0VREFCTL              0x0014
> +#define B0RXOFFSET1            0x0018
> +#define B0RXOFFSET0            0x001C
> +#define B1RXIOBUFCTL           0x0020
> +#define B1VREFCTL              0x0024
> +#define B1RXOFFSET1            0x0028
> +#define B1RXOFFSET0            0x002C
> +#define DQDFTCTL               0x0030
> +#define DQTRAINSTS             0x0034
> +#define B1DLLPICODER0          0x0038
> +#define B0DLLPICODER0          0x003C
> +#define B1DLLPICODER1          0x0040
> +#define B0DLLPICODER1          0x0044
> +#define B1DLLPICODER2          0x0048
> +#define B0DLLPICODER2          0x004C
> +#define B1DLLPICODER3          0x0050
> +#define B0DLLPICODER3          0x0054
> +#define B1RXDQSPICODE          0x0058
> +#define B0RXDQSPICODE          0x005C
> +#define B1RXDQPICODER32                0x0060
> +#define B1RXDQPICODER10                0x0064
> +#define B0RXDQPICODER32                0x0068
> +#define B0RXDQPICODER10                0x006C
> +#define B01PTRCTL0             0x0070
> +#define B01PTRCTL1             0x0074
> +#define B01DBCTL0              0x0078
> +#define B01DBCTL1              0x007C
> +#define B0LATCTL0              0x0080
> +#define B1LATCTL0              0x0084
> +#define B01LATCTL1             0x0088
> +#define B0ONDURCTL             0x008C
> +#define B1ONDURCTL             0x0090
> +#define B0OVRCTL               0x0094
> +#define B1OVRCTL               0x0098
> +#define DQCTL                  0x009C
> +#define B0RK2RKCHGPTRCTRL      0x00A0
> +#define B1RK2RKCHGPTRCTRL      0x00A4
> +#define DQRK2RKCTL             0x00A8
> +#define DQRK2RKPTRCTL          0x00AC
> +#define B0RK2RKLAT             0x00B0
> +#define B1RK2RKLAT             0x00B4
> +#define DQCLKALIGNREG0         0x00B8
> +#define DQCLKALIGNREG1         0x00BC
> +#define DQCLKALIGNREG2         0x00C0
> +#define DQCLKALIGNSTS0         0x00C4
> +#define DQCLKALIGNSTS1         0x00C8
> +#define DQCLKGATE              0x00CC
> +#define B0COMPSLV1             0x00D0
> +#define B1COMPSLV1             0x00D4
> +#define B0COMPSLV2             0x00D8
> +#define B1COMPSLV2             0x00DC
> +#define B0COMPSLV3             0x00E0
> +#define B1COMPSLV3             0x00E4
> +#define DQVISALANECR0TOP       0x00E8
> +#define DQVISALANECR1TOP       0x00EC
> +#define DQVISACONTROLCRTOP     0x00F0
> +#define DQVISALANECR0BL                0x00F4
> +#define DQVISALANECR1BL                0x00F8
> +#define DQVISACONTROLCRBL      0x00FC
> +#define DQTIMINGCTRL           0x010C
> +
> +/* CH0-ECC */
> +#define ECCDLLTXCTL            0x2004
> +#define ECCDLLRXCTL            0x2008
> +#define ECCMDLLCTL             0x200C
> +#define ECCB1DLLPICODER0       0x2038
> +#define ECCB1DLLPICODER1       0x2040
> +#define ECCB1DLLPICODER2       0x2048
> +#define ECCB1DLLPICODER3       0x2050
> +#define ECCB01DBCTL0           0x2078
> +#define ECCB01DBCTL1           0x207C
> +#define ECCCLKALIGNREG0                0x20B8
> +#define ECCCLKALIGNREG1                0x20BC
> +#define ECCCLKALIGNREG2                0x20C0
> +
> +/* CH0-CMD */
> +#define CMDOBSCKEBBCTL         0x4800
> +#define CMDDLLTXCTL            0x4808
> +#define CMDDLLRXCTL            0x480C
> +#define CMDMDLLCTL             0x4810
> +#define CMDRCOMPODT            0x4814
> +#define CMDDLLPICODER0         0x4820
> +#define CMDDLLPICODER1         0x4824
> +#define CMDCFGREG0             0x4840
> +#define CMDPTRREG              0x4844
> +#define CMDCLKALIGNREG0                0x4850
> +#define CMDCLKALIGNREG1                0x4854
> +#define CMDCLKALIGNREG2                0x4858
> +#define CMDPMCONFIG0           0x485C
> +#define CMDPMDLYREG0           0x4860
> +#define CMDPMDLYREG1           0x4864
> +#define CMDPMDLYREG2           0x4868
> +#define CMDPMDLYREG3           0x486C
> +#define CMDPMDLYREG4           0x4870
> +#define CMDCLKALIGNSTS0                0x4874
> +#define CMDCLKALIGNSTS1                0x4878
> +#define CMDPMSTS0              0x487C
> +#define CMDPMSTS1              0x4880
> +#define CMDCOMPSLV             0x4884
> +#define CMDBONUS0              0x488C
> +#define CMDBONUS1              0x4890
> +#define CMDVISALANECR0         0x4894
> +#define CMDVISALANECR1         0x4898
> +#define CMDVISACONTROLCR       0x489C
> +#define CMDCLKGATE             0x48A0
> +#define CMDTIMINGCTRL          0x48A4
> +
> +/* CH0-CLK-CTL */
> +#define CCOBSCKEBBCTL          0x5800
> +#define CCRCOMPIO              0x5804
> +#define CCDLLTXCTL             0x5808
> +#define CCDLLRXCTL             0x580C
> +#define CCMDLLCTL              0x5810
> +#define CCRCOMPODT             0x5814
> +#define CCDLLPICODER0          0x5820
> +#define CCDLLPICODER1          0x5824
> +#define CCDDR3RESETCTL         0x5830
> +#define CCCFGREG0              0x5838
> +#define CCCFGREG1              0x5840
> +#define CCPTRREG               0x5844
> +#define CCCLKALIGNREG0         0x5850
> +#define CCCLKALIGNREG1         0x5854
> +#define CCCLKALIGNREG2         0x5858
> +#define CCPMCONFIG0            0x585C
> +#define CCPMDLYREG0            0x5860
> +#define CCPMDLYREG1            0x5864
> +#define CCPMDLYREG2            0x5868
> +#define CCPMDLYREG3            0x586C
> +#define CCPMDLYREG4            0x5870
> +#define CCCLKALIGNSTS0         0x5874
> +#define CCCLKALIGNSTS1         0x5878
> +#define CCPMSTS0               0x587C
> +#define CCPMSTS1               0x5880
> +#define CCCOMPSLV1             0x5884
> +#define CCCOMPSLV2             0x5888
> +#define CCCOMPSLV3             0x588C
> +#define CCBONUS0               0x5894
> +#define CCBONUS1               0x5898
> +#define CCVISALANECR0          0x589C
> +#define CCVISALANECR1          0x58A0
> +#define CCVISACONTROLCR                0x58A4
> +#define CCCLKGATE              0x58A8
> +#define CCTIMINGCTL            0x58AC
> +
> +/* COMP */
> +#define CMPCTRL                        0x6800
> +#define SOFTRSTCNTL            0x6804
> +#define MSCNTR                 0x6808
> +#define NMSCNTRL               0x680C
> +#define LATCH1CTL              0x6814
> +#define COMPVISALANECR0                0x681C
> +#define COMPVISALANECR1                0x6820
> +#define COMPVISACONTROLCR      0x6824
> +#define COMPBONUS0             0x6830
> +#define TCOCNTCTRL             0x683C
> +#define DQANAODTPUCTL          0x6840
> +#define DQANAODTPDCTL          0x6844
> +#define DQANADRVPUCTL          0x6848
> +#define DQANADRVPDCTL          0x684C
> +#define DQANADLYPUCTL          0x6850
> +#define DQANADLYPDCTL          0x6854
> +#define DQANATCOPUCTL          0x6858
> +#define DQANATCOPDCTL          0x685C
> +#define CMDANADRVPUCTL         0x6868
> +#define CMDANADRVPDCTL         0x686C
> +#define CMDANADLYPUCTL         0x6870
> +#define CMDANADLYPDCTL         0x6874
> +#define CLKANAODTPUCTL         0x6880
> +#define CLKANAODTPDCTL         0x6884
> +#define CLKANADRVPUCTL         0x6888
> +#define CLKANADRVPDCTL         0x688C
> +#define CLKANADLYPUCTL         0x6890
> +#define CLKANADLYPDCTL         0x6894
> +#define CLKANATCOPUCTL         0x6898
> +#define CLKANATCOPDCTL         0x689C
> +#define DQSANAODTPUCTL         0x68A0
> +#define DQSANAODTPDCTL         0x68A4
> +#define DQSANADRVPUCTL         0x68A8
> +#define DQSANADRVPDCTL         0x68AC
> +#define DQSANADLYPUCTL         0x68B0
> +#define DQSANADLYPDCTL         0x68B4
> +#define DQSANATCOPUCTL         0x68B8
> +#define DQSANATCOPDCTL         0x68BC
> +#define CTLANADRVPUCTL         0x68C8
> +#define CTLANADRVPDCTL         0x68CC
> +#define CTLANADLYPUCTL         0x68D0
> +#define CTLANADLYPDCTL         0x68D4
> +#define CHNLBUFSTATIC          0x68F0
> +#define COMPOBSCNTRL           0x68F4
> +#define COMPBUFFDBG0           0x68F8
> +#define COMPBUFFDBG1           0x68FC
> +#define CFGMISCCH0             0x6900
> +#define COMPEN0CH0             0x6904
> +#define COMPEN1CH0             0x6908
> +#define COMPEN2CH0             0x690C
> +#define STATLEGEN0CH0          0x6910
> +#define STATLEGEN1CH0          0x6914
> +#define DQVREFCH0              0x6918
> +#define CMDVREFCH0             0x691C
> +#define CLKVREFCH0             0x6920
> +#define DQSVREFCH0             0x6924
> +#define CTLVREFCH0             0x6928
> +#define TCOVREFCH0             0x692C
> +#define DLYSELCH0              0x6930
> +#define TCODRAMBUFODTCH0       0x6934
> +#define CCBUFODTCH0            0x6938
> +#define RXOFFSETCH0            0x693C
> +#define DQODTPUCTLCH0          0x6940
> +#define DQODTPDCTLCH0          0x6944
> +#define DQDRVPUCTLCH0          0x6948
> +#define DQDRVPDCTLCH0          0x694C
> +#define DQDLYPUCTLCH0          0x6950
> +#define DQDLYPDCTLCH0          0x6954
> +#define DQTCOPUCTLCH0          0x6958
> +#define DQTCOPDCTLCH0          0x695C
> +#define CMDDRVPUCTLCH0         0x6968
> +#define CMDDRVPDCTLCH0         0x696C
> +#define CMDDLYPUCTLCH0         0x6970
> +#define CMDDLYPDCTLCH0         0x6974
> +#define CLKODTPUCTLCH0         0x6980
> +#define CLKODTPDCTLCH0         0x6984
> +#define CLKDRVPUCTLCH0         0x6988
> +#define CLKDRVPDCTLCH0         0x698C
> +#define CLKDLYPUCTLCH0         0x6990
> +#define CLKDLYPDCTLCH0         0x6994
> +#define CLKTCOPUCTLCH0         0x6998
> +#define CLKTCOPDCTLCH0         0x699C
> +#define DQSODTPUCTLCH0         0x69A0
> +#define DQSODTPDCTLCH0         0x69A4
> +#define DQSDRVPUCTLCH0         0x69A8
> +#define DQSDRVPDCTLCH0         0x69AC
> +#define DQSDLYPUCTLCH0         0x69B0
> +#define DQSDLYPDCTLCH0         0x69B4
> +#define DQSTCOPUCTLCH0         0x69B8
> +#define DQSTCOPDCTLCH0         0x69BC
> +#define CTLDRVPUCTLCH0         0x69C8
> +#define CTLDRVPDCTLCH0         0x69CC
> +#define CTLDLYPUCTLCH0         0x69D0
> +#define CTLDLYPDCTLCH0         0x69D4
> +#define FNLUPDTCTLCH0          0x69F0
> +
> +/* PLL */
> +#define MPLLCTRL0              0x7800
> +#define MPLLCTRL1              0x7808
> +#define MPLLCSR0               0x7810
> +#define MPLLCSR1               0x7814
> +#define MPLLCSR2               0x7820
> +#define MPLLDFT                        0x7828
> +#define MPLLMON0CTL            0x7830
> +#define MPLLMON1CTL            0x7838
> +#define MPLLMON2CTL            0x783C
> +#define SFRTRIM                        0x7850
> +#define MPLLDFTOUT0            0x7858
> +#define MPLLDFTOUT1            0x785C
> +#define MASTERRSTN             0x7880
> +#define PLLLOCKDEL             0x7884
> +#define SFRDEL                 0x7888
> +#define CRUVISALANECR0         0x78F0
> +#define CRUVISALANECR1         0x78F4
> +#define CRUVISACONTROLCR       0x78F8
> +#define IOSFVISALANECR0                0x78FC
> +#define IOSFVISALANECR1                0x7900
> +#define IOSFVISACONTROLCR      0x7904
> +
> +/* END DDRIO Registers */
> +
> +/* DRAM Specific Message Bus OpCodes */
> +#define MSG_OP_DRAM_INIT       0x68
> +#define MSG_OP_DRAM_WAKE       0xCA
> +
> +#define SAMPLE_SIZE            6
> +
> +/* must be less than this number to enable early deadband */
> +#define EARLY_DB               0x12
> +/* must be greater than this number to enable late deadband */
> +#define LATE_DB                        0x34
> +
> +#define CHX_REGS               (11 * 4)
> +#define FULL_CLK               128
> +#define HALF_CLK               64
> +#define QRTR_CLK               32
> +
> +#define MCEIL(num, den)                ((uint8_t)((num + den - 1) / den))
> +#define MMAX(a, b)             ((a) > (b) ? (a) : (b))
> +#define DEAD_LOOP()            for (;;);
> +
> +#define MIN_RDQS_EYE           10      /* in PI Codes */
> +#define MIN_VREF_EYE           10      /* in VREF Codes */
> +/* how many RDQS codes to jump while margining */
> +#define RDQS_STEP              1
> +/* how many VREF codes to jump while margining */
> +#define VREF_STEP              1
> +/* offset into "vref_codes[]" for minimum allowed VREF setting */
> +#define VREF_MIN               0x00
> +/* offset into "vref_codes[]" for maximum allowed VREF setting */
> +#define VREF_MAX               0x3F
> +#define RDQS_MIN               0x00    /* minimum RDQS delay value */
> +#define RDQS_MAX               0x3F    /* maximum RDQS delay value */
> +
> +/* how many WDQ codes to jump while margining */
> +#define WDQ_STEP               1
> +
> +enum {
> +       B,      /* BOTTOM VREF */
> +       T       /* TOP VREF */
> +};
> +
> +enum {
> +       L,      /* LEFT RDQS */
> +       R       /* RIGHT RDQS */
> +};
> +
> +/* Memory Options */
> +
> +/* enable STATIC timing settings for RCVN (BACKUP_MODE) */
> +#undef BACKUP_RCVN
> +/* enable STATIC timing settings for WDQS (BACKUP_MODE) */
> +#undef BACKUP_WDQS
> +/* enable STATIC timing settings for RDQS (BACKUP_MODE) */
> +#undef BACKUP_RDQS
> +/* enable STATIC timing settings for WDQ (BACKUP_MODE) */
> +#undef BACKUP_WDQ
> +/* enable *COMP overrides (BACKUP_MODE) */
> +#undef BACKUP_COMPS
> +/* enable the RD_TRAIN eye check */
> +#undef RX_EYE_CHECK
> +
> +/* enable Host to Memory Clock Alignment */
> +#define HMC_TEST
> +/* enable multi-rank support via rank2rank sharing */
> +#define R2R_SHARING
> +/* disable signals not used in 16bit mode of DDRIO */
> +#define FORCE_16BIT_DDRIO
> +
> +#define PLATFORM_ID            1
> +
> +void clear_self_refresh(struct mrc_params *mrc_params);
> +void prog_ddr_timing_control(struct mrc_params *mrc_params);
> +void prog_decode_before_jedec(struct mrc_params *mrc_params);
> +void perform_ddr_reset(struct mrc_params *mrc_params);
> +void ddrphy_init(struct mrc_params *mrc_params);
> +void perform_jedec_init(struct mrc_params *mrc_params);
> +void set_ddr_init_complete(struct mrc_params *mrc_params);
> +void restore_timings(struct mrc_params *mrc_params);
> +void default_timings(struct mrc_params *mrc_params);
> +void rcvn_cal(struct mrc_params *mrc_params);
> +void wr_level(struct mrc_params *mrc_params);
> +void prog_page_ctrl(struct mrc_params *mrc_params);
> +void rd_train(struct mrc_params *mrc_params);
> +void wr_train(struct mrc_params *mrc_params);
> +void store_timings(struct mrc_params *mrc_params);
> +void enable_scrambling(struct mrc_params *mrc_params);
> +void prog_ddr_control(struct mrc_params *mrc_params);
> +void prog_dra_drb(struct mrc_params *mrc_params);
> +void perform_wake(struct mrc_params *mrc_params);
> +void change_refresh_period(struct mrc_params *mrc_params);
> +void set_auto_refresh(struct mrc_params *mrc_params);
> +void ecc_enable(struct mrc_params *mrc_params);
> +void memory_test(struct mrc_params *mrc_params);
> +void lock_registers(struct mrc_params *mrc_params);

Function comments should go here.

> +
> +#endif /* _SMC_H_ */
> --
> 1.8.2.1
>

Regards,
Simon

* [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build Bin Meng
@ 2015-02-04 16:25   ` Simon Glass
  2015-02-04 22:35     ` Bin Meng
  0 siblings, 1 reply; 29+ messages in thread
From: Simon Glass @ 2015-02-04 16:25 UTC (permalink / raw)
  To: u-boot

Hi Bin,

On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> Turn on the Memory Reference code build in the quark Makefile.
>
> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
> ---
>
>  arch/x86/cpu/quark/Makefile | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/cpu/quark/Makefile b/arch/x86/cpu/quark/Makefile
> index 168c1e6..e87b424 100644
> --- a/arch/x86/cpu/quark/Makefile
> +++ b/arch/x86/cpu/quark/Makefile
> @@ -5,4 +5,5 @@
>  #
>
>  obj-y += car.o dram.o msg_port.o quark.o
> +obj-y += mrc.o mrc_util.o hte.o smc.o
>  obj-$(CONFIG_PCI) += pci.o

I would prefer that you do this as you add each file (i.e. in the patch
that adds the file).

Regards,
Simon

* [U-Boot] [RFC PATCH 7/9] fdtdec: Add compatible id and string for Intel Quark MRC
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 7/9] fdtdec: Add compatible id and string for Intel Quark MRC Bin Meng
@ 2015-02-04 16:25   ` Simon Glass
  0 siblings, 0 replies; 29+ messages in thread
From: Simon Glass @ 2015-02-04 16:25 UTC (permalink / raw)
  To: u-boot

On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> Add COMPAT_INTEL_QRK_MRC and "intel,quark-mrc" so that fdtdec can
> decode Intel Quark MRC node.
>
> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
> ---
>
>  include/fdtdec.h | 1 +
>  lib/fdtdec.c     | 1 +
>  2 files changed, 2 insertions(+)

Acked-by: Simon Glass <sjg@chromium.org>

* [U-Boot] [RFC PATCH 8/9] dt-bindings: Add Intel Quark MRC bindings
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 8/9] dt-bindings: Add Intel Quark MRC bindings Bin Meng
@ 2015-02-04 16:25   ` Simon Glass
  0 siblings, 0 replies; 29+ messages in thread
From: Simon Glass @ 2015-02-04 16:25 UTC (permalink / raw)
  To: u-boot

On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> Add standard dt-bindings macros to be used by Intel Quark MRC node.
>
> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
> ---
>
>  include/dt-bindings/mrc/quark.h | 83 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 83 insertions(+)
>  create mode 100644 include/dt-bindings/mrc/quark.h

Acked-by: Simon Glass <sjg@chromium.org>

* [U-Boot] [RFC PATCH 9/9] x86: quark: Call MRC in dram_init()
  2015-02-03 11:45 ` [U-Boot] [RFC PATCH 9/9] x86: quark: Call MRC in dram_init() Bin Meng
@ 2015-02-04 16:25   ` Simon Glass
  2015-02-04 22:54     ` Bin Meng
  0 siblings, 1 reply; 29+ messages in thread
From: Simon Glass @ 2015-02-04 16:25 UTC (permalink / raw)
  To: u-boot

Hi Bin,

On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> Now that we have added Quark MRC codes, call MRC in dram_init() so
> that DRAM can be initialized on a Quark based board.
>
> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>
> ---
>
>  arch/x86/cpu/quark/dram.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++-
>  arch/x86/dts/galileo.dts  | 25 ++++++++++++
>  2 files changed, 120 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/cpu/quark/dram.c b/arch/x86/cpu/quark/dram.c
> index fbdc3cd..3ed1d20 100644
> --- a/arch/x86/cpu/quark/dram.c
> +++ b/arch/x86/cpu/quark/dram.c
> @@ -5,15 +5,108 @@
>   */
>
>  #include <common.h>
> +#include <errno.h>
> +#include <fdtdec.h>
>  #include <asm/post.h>
> +#include <asm/arch/mrc.h>
>  #include <asm/arch/quark.h>
>
>  DECLARE_GLOBAL_DATA_PTR;
>
> +static int mrc_configure_params(struct mrc_params *mrc_params)
> +{
> +       const void *blob = gd->fdt_blob;
> +       int node;
> +       int mrc_flags;
> +
> +       node = fdtdec_next_compatible(blob, 0, COMPAT_INTEL_QRK_MRC);
> +       if (node < 0) {
> +               debug("%s: Cannot find MRC node\n", __func__);
> +               return -EINVAL;
> +       }
> +
> +       /*
> +        * TODO:
> +        *
> +        * We need support fast boot (MRC cache) in the future.
> +        *
> +        * Set boot mode to cold boot for now
> +        */
> +       mrc_params->boot_mode = BM_COLD;
> +
> +       /*
> +        * TODO:
> +        *
> +        * We need determine ECC by pin strap state
> +        *
> +        * Disable ECC by default for now
> +        */
> +       mrc_params->ecc_enables = 0;
> +
> +       mrc_flags = fdtdec_get_int(blob, node, "flags", 0);
> +       if (mrc_flags & MRC_FLAG_SCRAMBLE_EN)
> +               mrc_params->scrambling_enables = 1;
> +       else
> +               mrc_params->scrambling_enables = 0;
> +
> +       mrc_params->dram_width = fdtdec_get_int(blob, node, "dram-width", 0);
> +       mrc_params->ddr_speed = fdtdec_get_int(blob, node, "dram-speed", 0);
> +       mrc_params->ddr_type = fdtdec_get_int(blob, node, "dram-type", 0);
> +
> +       mrc_params->rank_enables = fdtdec_get_int(blob, node, "rank-mask", 0);
> +       mrc_params->channel_enables = fdtdec_get_int(blob, node,
> +               "chan-mask", 0);
> +       mrc_params->channel_width = fdtdec_get_int(blob, node,
> +               "chan-width", 0);
> +       mrc_params->address_mode = fdtdec_get_int(blob, node, "addr-mode", 0);
> +
> +       mrc_params->refresh_rate = fdtdec_get_int(blob, node,
> +               "refresh-rate", 0);
> +       mrc_params->sr_temp_range = fdtdec_get_int(blob, node,
> +               "sr-temp-range", 0);
> +       mrc_params->ron_value = fdtdec_get_int(blob, node,
> +               "ron-value", 0);
> +       mrc_params->rtt_nom_value = fdtdec_get_int(blob, node,
> +               "rtt-nom-value", 0);
> +       mrc_params->rd_odt_value = fdtdec_get_int(blob, node,
> +               "rd-odt-value", 0);
> +
> +       mrc_params->params.density = fdtdec_get_int(blob, node,
> +               "dram-density", 0);
> +       mrc_params->params.cl = fdtdec_get_int(blob, node, "dram-cl", 0);
> +       mrc_params->params.ras = fdtdec_get_int(blob, node, "dram-ras", 0);
> +       mrc_params->params.wtr = fdtdec_get_int(blob, node, "dram-wtr", 0);
> +       mrc_params->params.rrd = fdtdec_get_int(blob, node, "dram-rrd", 0);
> +       mrc_params->params.faw = fdtdec_get_int(blob, node, "dram-faw", 0);
> +
> +       debug("MRC dram_width %d\n", mrc_params->dram_width);
> +       debug("MRC rank_enables %d\n", mrc_params->rank_enables);
> +       debug("MRC ddr_speed %d\n", mrc_params->ddr_speed);
> +       debug("MRC flags: %s\n",
> +             (mrc_params->scrambling_enables) ? "SCRAMBLE_EN" : "");
> +
> +       debug("MRC density=%d tCL=%d tRAS=%d tWTR=%d tRRD=%d tFAW=%d\n",
> +             mrc_params->params.density, mrc_params->params.cl,
> +             mrc_params->params.ras, mrc_params->params.wtr,
> +             mrc_params->params.rrd, mrc_params->params.faw);
> +
> +       return 0;
> +}
> +
>  int dram_init(void)
>  {
> -       /* hardcode the DRAM size for now */
> -       gd->ram_size = DRAM_MAX_SIZE;
> +       struct mrc_params mrc_params;
> +       int ret;
> +
> +       memset(&mrc_params, 0, sizeof(struct mrc_params));
> +       ret = mrc_configure_params(&mrc_params);
> +       if (ret)
> +               return ret;
> +
> +       /* Call MRC */

How about something like:

/* Set up the SDRAM by calling the memory reference code */

> +       mrc(&mrc_params);

Can this fail?

> +
> +       gd->ram_size = mrc_params.mem_size;
>         post_code(POST_DRAM);
>
>         return 0;
> diff --git a/arch/x86/dts/galileo.dts b/arch/x86/dts/galileo.dts
> index 14a19c3..d462221 100644
> --- a/arch/x86/dts/galileo.dts
> +++ b/arch/x86/dts/galileo.dts
> @@ -6,6 +6,8 @@
>
>  /dts-v1/;
>
> +#include <dt-bindings/mrc/quark.h>
> +
>  /include/ "skeleton.dtsi"
>
>  / {
> @@ -20,6 +22,29 @@
>                 stdout-path = &pciuart0;
>         };
>
> +       mrc {
> +               compatible = "intel,quark-mrc";
> +               flags = <MRC_FLAG_SCRAMBLE_EN>;
> +               dram-width = <DRAM_WIDTH_X8>;
> +               dram-speed = <DRAM_FREQ_800>;
> +               dram-type = <DRAM_TYPE_DDR3>;
> +               rank-mask = <DRAM_RANK(0)>;
> +               chan-mask = <DRAM_CHANNEL(0)>;
> +               chan-width = <DRAM_CHANNEL_WIDTH_X16>;
> +               addr-mode = <DRAM_ADDR_MODE0>;
> +               refresh-rate = <DRAM_REFRESH_RATE_785US>;
> +               sr-temp-range = <DRAM_SRT_RANGE_NORMAL>;
> +               ron-value = <DRAM_RON_34OHM>;
> +               rtt-nom-value = <DRAM_RTT_NOM_120OHM>;
> +               rd-odt-value = <DRAM_RD_ODT_OFF>;
> +               dram-density = <DRAM_DENSITY_1G>;
> +               dram-cl = <6>;
> +               dram-ras = <0x0000927c>;
> +               dram-wtr = <0x00002710>;
> +               dram-rrd = <0x00002710>;
> +               dram-faw = <0x00009c40>;
> +       };
> +
>         pci {
>                 #address-cells = <3>;
>                 #size-cells = <2>;
> --
> 1.8.2.1
>

Regards,
Simon

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build
  2015-02-04 16:25   ` Simon Glass
@ 2015-02-04 22:35     ` Bin Meng
  2015-02-05  6:58       ` Bin Meng
  0 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-04 22:35 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Thu, Feb 5, 2015 at 12:25 AM, Simon Glass <sjg@chromium.org> wrote:
> Hi Bin,
>
> On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
>> Turn on the Memory Reference code build in the quark Makefile.
>>
>> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>> ---
>>
>>  arch/x86/cpu/quark/Makefile | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/x86/cpu/quark/Makefile b/arch/x86/cpu/quark/Makefile
>> index 168c1e6..e87b424 100644
>> --- a/arch/x86/cpu/quark/Makefile
>> +++ b/arch/x86/cpu/quark/Makefile
>> @@ -5,4 +5,5 @@
>>  #
>>
>>  obj-y += car.o dram.o msg_port.o quark.o
>> +obj-y += mrc.o mrc_util.o hte.o smc.o
>>  obj-$(CONFIG_PCI) += pci.o
>
> Would prefer that you do this as you add each file (i.e. in the patch
> that adds the file).
>

OK, will squash this one to previous commits.

Regards,
Bin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 9/9] x86: quark: Call MRC in dram_init()
  2015-02-04 16:25   ` Simon Glass
@ 2015-02-04 22:54     ` Bin Meng
  0 siblings, 0 replies; 29+ messages in thread
From: Bin Meng @ 2015-02-04 22:54 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Thu, Feb 5, 2015 at 12:25 AM, Simon Glass <sjg@chromium.org> wrote:
> Hi Bin,
>
> On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
>> Now that we have added Quark MRC codes, call MRC in dram_init() so
>> that DRAM can be initialized on a Quark based board.
>>
>> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>>
>> ---
>>
>>  arch/x86/cpu/quark/dram.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++-
>>  arch/x86/dts/galileo.dts  | 25 ++++++++++++
>>  2 files changed, 120 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/cpu/quark/dram.c b/arch/x86/cpu/quark/dram.c
>> index fbdc3cd..3ed1d20 100644
>> --- a/arch/x86/cpu/quark/dram.c
>> +++ b/arch/x86/cpu/quark/dram.c
>> @@ -5,15 +5,108 @@
>>   */
>>
>>  #include <common.h>
>> +#include <errno.h>
>> +#include <fdtdec.h>
>>  #include <asm/post.h>
>> +#include <asm/arch/mrc.h>
>>  #include <asm/arch/quark.h>
>>
>>  DECLARE_GLOBAL_DATA_PTR;
>>
>> +static int mrc_configure_params(struct mrc_params *mrc_params)
>> +{
>> +       const void *blob = gd->fdt_blob;
>> +       int node;
>> +       int mrc_flags;
>> +
>> +       node = fdtdec_next_compatible(blob, 0, COMPAT_INTEL_QRK_MRC);
>> +       if (node < 0) {
>> +               debug("%s: Cannot find MRC node\n", __func__);
>> +               return -EINVAL;
>> +       }
>> +
>> +       /*
>> +        * TODO:
>> +        *
>> +        * We need support fast boot (MRC cache) in the future.
>> +        *
>> +        * Set boot mode to cold boot for now
>> +        */
>> +       mrc_params->boot_mode = BM_COLD;
>> +
>> +       /*
>> +        * TODO:
>> +        *
>> +        * We need determine ECC by pin strap state
>> +        *
>> +        * Disable ECC by default for now
>> +        */
>> +       mrc_params->ecc_enables = 0;
>> +
>> +       mrc_flags = fdtdec_get_int(blob, node, "flags", 0);
>> +       if (mrc_flags & MRC_FLAG_SCRAMBLE_EN)
>> +               mrc_params->scrambling_enables = 1;
>> +       else
>> +               mrc_params->scrambling_enables = 0;
>> +
>> +       mrc_params->dram_width = fdtdec_get_int(blob, node, "dram-width", 0);
>> +       mrc_params->ddr_speed = fdtdec_get_int(blob, node, "dram-speed", 0);
>> +       mrc_params->ddr_type = fdtdec_get_int(blob, node, "dram-type", 0);
>> +
>> +       mrc_params->rank_enables = fdtdec_get_int(blob, node, "rank-mask", 0);
>> +       mrc_params->channel_enables = fdtdec_get_int(blob, node,
>> +               "chan-mask", 0);
>> +       mrc_params->channel_width = fdtdec_get_int(blob, node,
>> +               "chan-width", 0);
>> +       mrc_params->address_mode = fdtdec_get_int(blob, node, "addr-mode", 0);
>> +
>> +       mrc_params->refresh_rate = fdtdec_get_int(blob, node,
>> +               "refresh-rate", 0);
>> +       mrc_params->sr_temp_range = fdtdec_get_int(blob, node,
>> +               "sr-temp-range", 0);
>> +       mrc_params->ron_value = fdtdec_get_int(blob, node,
>> +               "ron-value", 0);
>> +       mrc_params->rtt_nom_value = fdtdec_get_int(blob, node,
>> +               "rtt-nom-value", 0);
>> +       mrc_params->rd_odt_value = fdtdec_get_int(blob, node,
>> +               "rd-odt-value", 0);
>> +
>> +       mrc_params->params.density = fdtdec_get_int(blob, node,
>> +               "dram-density", 0);
>> +       mrc_params->params.cl = fdtdec_get_int(blob, node, "dram-cl", 0);
>> +       mrc_params->params.ras = fdtdec_get_int(blob, node, "dram-ras", 0);
>> +       mrc_params->params.wtr = fdtdec_get_int(blob, node, "dram-wtr", 0);
>> +       mrc_params->params.rrd = fdtdec_get_int(blob, node, "dram-rrd", 0);
>> +       mrc_params->params.faw = fdtdec_get_int(blob, node, "dram-faw", 0);
>> +
>> +       debug("MRC dram_width %d\n", mrc_params->dram_width);
>> +       debug("MRC rank_enables %d\n", mrc_params->rank_enables);
>> +       debug("MRC ddr_speed %d\n", mrc_params->ddr_speed);
>> +       debug("MRC flags: %s\n",
>> +             (mrc_params->scrambling_enables) ? "SCRAMBLE_EN" : "");
>> +
>> +       debug("MRC density=%d tCL=%d tRAS=%d tWTR=%d tRRD=%d tFAW=%d\n",
>> +             mrc_params->params.density, mrc_params->params.cl,
>> +             mrc_params->params.ras, mrc_params->params.wtr,
>> +             mrc_params->params.rrd, mrc_params->params.faw);
>> +
>> +       return 0;
>> +}
>> +
>>  int dram_init(void)
>>  {
>> -       /* hardcode the DRAM size for now */
>> -       gd->ram_size = DRAM_MAX_SIZE;
>> +       struct mrc_params mrc_params;
>> +       int ret;
>> +
>> +       memset(&mrc_params, 0, sizeof(struct mrc_params));
>> +       ret = mrc_configure_params(&mrc_params);
>> +       if (ret)
>> +               return ret;
>> +
>> +       /* Call MRC */
>
> How about something like:
>
> /* Set up the SDRAM by calling the memory reference code */
>

OK.

>> +       mrc(&mrc_params);
>
> Can this fail?
>

Probably. mrc() itself returns void, but 'struct mrc_params' has a
'status' member which is assigned after memory_test() and is supposed
to indicate whether memory was initialized correctly. However, during
my debugging, when memory was not initialized correctly (due to some
mistakes in the port) and U-Boot hung while relocating the fdt, the
output showed that the status was 0, which I don't understand. The
original Intel code does not check this status after calling MRC, but
I think I can add a check here.
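
A minimal sketch of such a check, assuming the 'status' field and the
MRC_SUCCESS/MRC_E_MEMTEST values from mrc.h in patch 3. The trimmed
struct, the mrc_stub() stand-in and the choice of -EIO are illustrative
assumptions, not the v2 code:

```c
#include <errno.h>
#include <stdint.h>

/* Values as defined in mrc.h from patch 3/9 */
#define MRC_SUCCESS	0	/* initialization ok */
#define MRC_E_MEMTEST	1	/* memtest failed */

/* Trimmed stand-in for struct mrc_params: only the output fields */
struct mrc_params {
	uint32_t status;	/* non-zero specifies error code */
	uint32_t mem_size;	/* total memory size in bytes */
};

/* Hypothetical stand-in for mrc(): reports the result via ->status */
static void mrc_stub(struct mrc_params *p, int memtest_ok)
{
	p->status = memtest_ok ? MRC_SUCCESS : MRC_E_MEMTEST;
	p->mem_size = memtest_ok ? 256u << 20 : 0;
}

/* Sketch of dram_init(): fail instead of trusting mem_size blindly */
static int dram_init_sketch(int memtest_ok, uint64_t *ram_size)
{
	struct mrc_params mrc_params = { 0 };

	mrc_stub(&mrc_params, memtest_ok);
	if (mrc_params.status != MRC_SUCCESS)
		return -EIO;	/* assumption: any errno would do here */

	*ram_size = mrc_params.mem_size;
	return 0;
}
```

Given that the status read 0 in at least one failing case, a check like
this would only catch what memory_test() actually reports.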

[snip]

Regards,
Bin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build
  2015-02-04 22:35     ` Bin Meng
@ 2015-02-05  6:58       ` Bin Meng
  2015-02-06  0:18         ` Albert ARIBAUD
  0 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-05  6:58 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Thu, Feb 5, 2015 at 6:35 AM, Bin Meng <bmeng.cn@gmail.com> wrote:
> Hi Simon,
>
> On Thu, Feb 5, 2015 at 12:25 AM, Simon Glass <sjg@chromium.org> wrote:
>> Hi Bin,
>>
>> On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
>>> Turn on the Memory Reference code build in the quark Makefile.
>>>
>>> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>>> ---
>>>
>>>  arch/x86/cpu/quark/Makefile | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/arch/x86/cpu/quark/Makefile b/arch/x86/cpu/quark/Makefile
>>> index 168c1e6..e87b424 100644
>>> --- a/arch/x86/cpu/quark/Makefile
>>> +++ b/arch/x86/cpu/quark/Makefile
>>> @@ -5,4 +5,5 @@
>>>  #
>>>
>>>  obj-y += car.o dram.o msg_port.o quark.o
>>> +obj-y += mrc.o mrc_util.o hte.o smc.o
>>>  obj-$(CONFIG_PCI) += pci.o
>>
>> Would prefer that you do this as you add each file (i.e. in the patch
>> that adds the file).
>>
>
> OK, will squash this one to previous commits.

Sorry, I replied too fast. It looks like I cannot add each file to the
Makefile in the patch that introduces it, because it will not build
until the 3rd patch is in place to provide all the header files needed.

Regards,
Bin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [U-Boot] [RFC PATCH 3/9] x86: quark: Add Memory Reference Code (MRC) main routines
  2015-02-04 16:24   ` Simon Glass
@ 2015-02-05  8:45     ` Bin Meng
  0 siblings, 0 replies; 29+ messages in thread
From: Bin Meng @ 2015-02-05  8:45 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Thu, Feb 5, 2015 at 12:24 AM, Simon Glass <sjg@chromium.org> wrote:
> Hi Bin,
>
> On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
>> Add the main routines for Quark Memory Reference Code (MRC).
>>
>> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>>
>> ---
>> The are 24 checkpatch warnings in this patch, which is:
>>
>> warning: arch/x86/cpu/quark/mrc.c,43: line over 80 characters
>> ...
>>
>> I intentionally leave it as is now, as fixing these warnings
>> make the mrc initialization table a little bit harder to read.
>>
>>  arch/x86/cpu/quark/mrc.c              | 206 ++++++++++++++++++++++++++++++++++
>>  arch/x86/include/asm/arch-quark/mrc.h | 189 +++++++++++++++++++++++++++++++
>>  2 files changed, 395 insertions(+)
>>  create mode 100644 arch/x86/cpu/quark/mrc.c
>>  create mode 100644 arch/x86/include/asm/arch-quark/mrc.h
>>
>> diff --git a/arch/x86/cpu/quark/mrc.c b/arch/x86/cpu/quark/mrc.c
>> new file mode 100644
>> index 0000000..6a82519
>> --- /dev/null
>> +++ b/arch/x86/cpu/quark/mrc.c
>> @@ -0,0 +1,206 @@
>> +/*
>> + * Copyright (C) 2013, Intel Corporation
>> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
>> + *
>> + * Ported from Intel released Quark UEFI BIOS
>> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
>> + *
>> + * SPDX-License-Identifier:    Intel
>> + */
>> +
>> +/*
>> + * This is the main Quark Memory Reference Code (MRC)
>> + *
>> + * These functions are generic and should work for any Quark based board.
>
> Quark-based

Fixed

>> + *
>> + * MRC requires two data structures to be passed in which are initialized by
>> + * mrc_adjust_params().
>> + *
>> + * The basic flow is as follows:
>> + * 01) Check for supported DDR speed configuration
>> + * 02) Set up Memory Manager buffer as pass-through (POR)
>> + * 03) Set Channel Interleaving Mode and Channel Stride to the most aggressive
>> + *     setting possible
>> + * 04) Set up the Memory Controller logic
>> + * 05) Set up the DDR_PHY logic
>> + * 06) Initialise the DRAMs (JEDEC)
>> + * 07) Perform the Receive Enable Calibration algorithm
>> + * 08) Perform the Write Leveling algorithm
>> + * 09) Perform the Read Training algorithm (includes internal Vref)
>> + * 10) Perform the Write Training algorithm
>> + * 11) Set Channel Interleaving Mode and Channel Stride to the desired settings
>> + *
>> + * Dunit configuration based on Valleyview MRC.
>
> What is Dunit?

Fixed. DRAM unit.

>> + */
>> +
>> +#include <common.h>
>> +#include <asm/arch/mrc.h>
>> +#include <asm/arch/msg_port.h>
>> +#include "mrc_util.h"
>> +#include "smc.h"
>> +
>> +static const struct mem_init init[] = {
>> +       { 0x0101, BM_COLD | BM_FAST | BM_WARM | BM_S3, clear_self_refresh       },
>> +       { 0x0200, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_ddr_timing_control  },
>> +       { 0x0103, BM_COLD | BM_FAST                  , prog_decode_before_jedec },
>> +       { 0x0104, BM_COLD | BM_FAST                  , perform_ddr_reset        },
>> +       { 0x0300, BM_COLD | BM_FAST           | BM_S3, ddrphy_init              },
>> +       { 0x0400, BM_COLD | BM_FAST                  , perform_jedec_init       },
>> +       { 0x0105, BM_COLD | BM_FAST                  , set_ddr_init_complete    },
>> +       { 0x0106,           BM_FAST | BM_WARM | BM_S3, restore_timings          },
>> +       { 0x0106, BM_COLD                            , default_timings          },
>> +       { 0x0500, BM_COLD                            , rcvn_cal                 },
>> +       { 0x0600, BM_COLD                            , wr_level                 },
>> +       { 0x0120, BM_COLD                            , prog_page_ctrl           },
>> +       { 0x0700, BM_COLD                            , rd_train                 },
>> +       { 0x0800, BM_COLD                            , wr_train                 },
>> +       { 0x010B, BM_COLD                            , store_timings            },
>> +       { 0x010C, BM_COLD | BM_FAST | BM_WARM | BM_S3, enable_scrambling        },
>> +       { 0x010D, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_ddr_control         },
>> +       { 0x010E, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_dra_drb             },
>> +       { 0x010F,                     BM_WARM | BM_S3, perform_wake             },
>> +       { 0x0110, BM_COLD | BM_FAST | BM_WARM | BM_S3, change_refresh_period    },
>> +       { 0x0111, BM_COLD | BM_FAST | BM_WARM | BM_S3, set_auto_refresh         },
>> +       { 0x0112, BM_COLD | BM_FAST | BM_WARM | BM_S3, ecc_enable               },
>> +       { 0x0113, BM_COLD | BM_FAST                  , memory_test              },
>> +       { 0x0114, BM_COLD | BM_FAST | BM_WARM | BM_S3, lock_registers           }
>
> What are the hex codes at the start? Ah I see they are post codes (we
> don't particularly need them, I'm just asking). Should there be
> #defines for these? Also how come they use all 16 bits?
>

It looks like Intel chose random numbers for these post codes. Changing
them to #defines does not seem workable, as I don't know how to name
them, so I have kept them unchanged in v2.
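
On the 16 bits: the loop in mrc_init() consumes each code as two bytes,
passing the high byte as the major code and the low byte as the minor
code to mrc_post_code(). A small sketch of that split (the helper names
are hypothetical, not from the patch):

```c
#include <stdint.h>

/* High byte of a 16-bit MRC post code, as computed in mrc_init() */
static uint8_t post_major(uint16_t post_code)
{
	return (post_code >> 8) & 0xff;
}

/* Low byte of a 16-bit MRC post code */
static uint8_t post_minor(uint16_t post_code)
{
	return post_code & 0xff;
}
```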

>> +};
>> +
>> +/* Adjust configuration parameters before initialization sequence */
>> +static void mrc_adjust_params(struct mrc_params *mrc_params)
>> +{
>> +       const struct dram_params *dram_params;
>> +       uint8_t dram_width;
>> +       uint32_t rank_enables;
>> +       uint32_t channel_width;
>> +
>> +       ENTERFN();
>
> What is this?

Debug output for tracking function call.

>> +
>> +       /* initially expect success */
>> +       mrc_params->status = MRC_SUCCESS;
>> +
>> +       dram_width = mrc_params->dram_width;
>> +       rank_enables = mrc_params->rank_enables;
>> +       channel_width = mrc_params->channel_width;
>> +
>> +       /*
>> +        * Setup board layout (must be reviewed as is selecting static timings)
>> +        * 0 == R0 (DDR3 x16), 1 == R1 (DDR3 x16),
>> +        * 2 == DV (DDR3 x8), 3 == SV (DDR3 x8).
>> +        */
>> +       if (dram_width == X8)
>> +               mrc_params->board_id = 2;       /* select x8 layout */
>> +       else
>> +               mrc_params->board_id = 0;       /* select x16 layout */
>> +
>> +       /* initially no memory */
>> +       mrc_params->mem_size = 0;
>> +
>> +       /* begin of channel settings */
>> +       dram_params = &mrc_params->params;
>> +
>> +       /*
>> +        * Determine Column Bits:
>> +        *
>> +        * Column: 11 for 8Gbx8, else 10
>> +        */
>> +       mrc_params->column_bits[0] =
>> +               ((dram_params[0].density == 4) &&
>> +               (dram_width == X8)) ? (11) : (10);
>> +
>> +       /*
>> +        * Determine Row Bits:
>
> Can we capitalise only the first word in these comments?

Fixed.

>> +        *
>> +        * 512Mbx16=12 512Mbx8=13
>> +        * 1Gbx16=13   1Gbx8=14
>> +        * 2Gbx16=14   2Gbx8=15
>> +        * 4Gbx16=15   4Gbx8=16
>> +        * 8Gbx16=16   8Gbx8=16
>> +        */
>> +       mrc_params->row_bits[0] = 12 + (dram_params[0].density) +
>> +               (((dram_params[0].density < 4) &&
>> +               (dram_width == X8)) ? (1) : (0));
>> +
>> +       /*
>> +        * Determine Per Channel Memory Size:
>
> per-channel

Fixed.

>> +        *
>> +        * (For 2 RANKs, multiply by 2)
>> +        * (For 16 bit data bus, divide by 2)
>> +        *
>> +        * DENSITY WIDTH MEM_AVAILABLE
>> +        * 512Mb   x16   0x008000000 ( 128MB)
>> +        * 512Mb   x8    0x010000000 ( 256MB)
>> +        * 1Gb     x16   0x010000000 ( 256MB)
>> +        * 1Gb     x8    0x020000000 ( 512MB)
>> +        * 2Gb     x16   0x020000000 ( 512MB)
>> +        * 2Gb     x8    0x040000000 (1024MB)
>> +        * 4Gb     x16   0x040000000 (1024MB)
>> +        * 4Gb     x8    0x080000000 (2048MB)
>> +        */
>> +       mrc_params->channel_size[0] = (1 << dram_params[0].density);
>> +       mrc_params->channel_size[0] *= (dram_width == X8) ? (2) : (1);
>> +       mrc_params->channel_size[0] *= (rank_enables == 0x3) ? (2) : (1);
>> +       mrc_params->channel_size[0] *= (channel_width == X16) ? (1) : (2);
>
> Remove () around 2 and 1.

Fixed
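
As a sanity check of the size arithmetic above, here is a standalone
sketch of the same computation. It assumes the width encoding from the
mrc.h enum (X8 = 0, X16 = 1) and the density encoding from the
dram_params comment (1 = 1Gb); the function name is hypothetical:

```c
#include <stdint.h>

/* Width encoding as in the mrc.h enum */
enum { X8, X16, X32 };

/*
 * Mirror of the channel size arithmetic in mrc_adjust_params().
 * Returns the memory size in bytes; the intermediate value counts
 * 64MB (512Mb) units.
 */
static uint32_t mrc_mem_size(uint8_t density, uint8_t dram_width,
			     uint32_t rank_enables, uint32_t channel_width)
{
	uint32_t units = 1 << density;

	units *= (dram_width == X8) ? 2 : 1;
	units *= (rank_enables == 0x3) ? 2 : 1;
	units *= (channel_width == X16) ? 1 : 2;

	return units << 26;
}
```

With 1Gb x8 devices, a single rank and an x16 channel, this gives 4
units, i.e. 0x10000000 bytes.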

>> +
>> +       /* Determine memory size (convert number of 64MB/512Mb units) */
>> +       mrc_params->mem_size += mrc_params->channel_size[0] << 26;
>> +
>> +       LEAVEFN();
>
> ?

Debug output for tracking function return.
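
Roughly, such a pair of macros could look like the sketch below. This
shows an assumed shape only: the real definitions live in mrc_util.h
and print through DPF(), whereas here a hypothetical trace() helper
appends to a buffer so the effect is easy to inspect:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical trace sink; the real macros print via DPF() instead */
static char trace_log[128];

static void trace(const char *tag, const char *func)
{
	size_t used = strlen(trace_log);

	snprintf(trace_log + used, sizeof(trace_log) - used,
		 "%s%s>", tag, func);
}

/* Assumed shape only: log function entry and exit by name */
#define ENTERFN()	trace("<", __func__)
#define LEAVEFN()	trace("</", __func__)

static void sample_step(void)
{
	ENTERFN();
	/* ... work would happen here ... */
	LEAVEFN();
}
```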

>> +}
>> +
>> +static void mrc_init(struct mrc_params *mrc_params)
>> +{
>> +       int i;
>> +
>> +       ENTERFN();
>> +
>> +       DPF(D_INFO, "mrc_init build %s %s\n", __DATE__, __TIME__);
>
> debug() I think, and below.

DPF is the MRC debug routine, and I want to keep using it in the MRC
code. In v2, I removed this line completely.

>> +
>> +       /* MRC started */
>> +       mrc_post_code(0x01, 0x00);
>> +
>> +       if (mrc_params->boot_mode != BM_COLD) {
>> +               if (mrc_params->ddr_speed != mrc_params->timings.ddr_speed) {
>> +                       /* full training required as frequency changed */
>> +                       mrc_params->boot_mode = BM_COLD;
>> +               }
>> +       }
>> +
>> +       for (i = 0; i < ARRAY_SIZE(init); i++) {
>> +               uint64_t my_tsc;
>> +
>> +               if (mrc_params->boot_mode & init[i].boot_path) {
>> +                       uint8_t major = init[i].post_code >> 8 & 0xFF;
>> +                       uint8_t minor = init[i].post_code >> 0 & 0xFF;
>
> Can we stick with lower case hex, and below?

Fixed.

>> +                       mrc_post_code(major, minor);
>> +
>> +                       my_tsc = rdtsc();
>> +                       init[i].init_fn(mrc_params);
>> +                       DPF(D_TIME, "Execution time %llx", rdtsc() - my_tsc);
>> +               }
>> +       }
>> +
>> +       /* display the timings */
>> +       print_timings(mrc_params);
>> +
>> +       /* MRC complete */
>> +       mrc_post_code(0x01, 0xFF);
>> +

Fixed, using lower case hex.
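
For anyone skimming the loop above: each table entry carries a bit mask
of boot modes, and a step runs only when the current boot mode is in
that mask. A reduced sketch of the dispatch, using the BM_* values from
mrc.h (the three-entry table, the step struct and the counting callback
are made up for illustration):

```c
#include <stddef.h>
#include <stdint.h>

/* Boot modes as bit masks, values from mrc.h */
enum { BM_COLD = 1, BM_FAST = 2, BM_S3 = 4, BM_WARM = 8 };

struct step {
	uint16_t boot_path;	/* boot modes in which the step runs */
	void (*fn)(int *count);
};

static void bump(int *count)
{
	(*count)++;
}

/* Reduced table: one all-modes step, one cold-only, one warm/S3-only */
static const struct step steps[] = {
	{ BM_COLD | BM_FAST | BM_WARM | BM_S3, bump },
	{ BM_COLD, bump },
	{ BM_WARM | BM_S3, bump },
};

/* Run every step whose boot_path mask includes boot_mode */
static int run_steps(uint32_t boot_mode)
{
	int count = 0;
	size_t i;

	for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++) {
		if (boot_mode & steps[i].boot_path)
			steps[i].fn(&count);
	}

	return count;
}
```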

>> +       LEAVEFN();
>> +}
>> +
>> +void mrc(struct mrc_params *mrc_params)
>> +{
>> +       ENTERFN();
>> +
>> +       DPF(D_INFO, "MRC Version %04x %s %s\n",
>> +           MRC_VERSION, __DATE__, __TIME__);
>
> Can you reformat so more args on first line?

Fixed.

>> +
>> +       /* Set up the data structures used by mrc_init() */
>> +       mrc_adjust_params(mrc_params);
>> +
>> +       /* Initialize system memory */
>> +       mrc_init(mrc_params);
>> +
>> +       LEAVEFN();
>> +}
>> diff --git a/arch/x86/include/asm/arch-quark/mrc.h b/arch/x86/include/asm/arch-quark/mrc.h
>> new file mode 100644
>> index 0000000..690a800
>> --- /dev/null
>> +++ b/arch/x86/include/asm/arch-quark/mrc.h
>> @@ -0,0 +1,189 @@
>> +/*
>> + * Copyright (C) 2013, Intel Corporation
>> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
>> + *
>> + * Ported from Intel released Quark UEFI BIOS
>> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
>> + *
>> + * SPDX-License-Identifier:    Intel
>> + */
>> +
>> +#ifndef _MRC_H_
>> +#define _MRC_H_
>> +
>> +/* MRC Version */
>
> I think you can drop that comment!

Fixed.

>> +#define MRC_VERSION    0x0111
>> +
>> +/* architectural definitions */
>> +#define NUM_CHANNELS   1       /* number of channels */
>> +#define NUM_RANKS      2       /* number of ranks per channel */
>> +#define NUM_BYTE_LANES 4       /* number of byte lanes per channel */
>> +
>> +/* software limitations */
>> +#define MAX_CHANNELS   1
>> +#define MAX_RANKS      2
>> +#define MAX_BYTE_LANES 4
>> +
>> +/* only to mock MrcWrapper */
>
> What does this mean?

I don't know, just removed this line in v2.

>> +#define MAX_SOCKETS    1
>> +#define MAX_SIDES      1
>> +#define MAX_ROWS       (MAX_SIDES * MAX_SOCKETS)
>> +
>> +/* Specify DRAM of nenory channel width */
>
> memory
>
> Also this doesn't quite make sense - can you please reword it?

Changed the comment to: /* Specify DRAM and channel width */ in v2.

>> +enum {
>> +       X8,     /* DRAM width */
>> +       X16,    /* DRAM width & Channel Width */
>> +       X32     /* Channel Width */
>> +};
>> +
>> +/* Specify DRAM speed */
>> +enum {
>> +       DDRFREQ_800,
>> +       DDRFREQ_1066
>> +};
>> +
>> +/* Specify DRAM type */
>> +enum {
>> +       DDR3,
>> +       DDR3L
>> +};
>> +
>> +/*
>> + * density: 0=512Mb, 1=Gb, 2=2Gb, 3=4Gb
>
> should either have @density in this header and all the others here
> too. Or move this comment below above density.

Fixed.

>> + * cl is DRAM CAS Latency in clocks
>> + * All other timings are in picoseconds
>> + *
>> + * Refer to JEDEC spec (or DRAM datasheet) when changing these values.
>> + */
>> +struct dram_params {
>> +       uint8_t density;
>> +       /* CAS latency in clocks */
>> +       uint8_t cl;
>> +       /* ACT to PRE command period */
>> +       uint32_t ras;
>> +       /*
>> +        * Delay from start of internal write transaction to
>> +        * internal read command
>> +        */
>> +       uint32_t wtr;
>> +       /* ACT to ACT command period (JESD79 specific to page size 1K/2K) */
>> +       uint32_t rrd;
>> +       /* Four activate window (JESD79 specific to page size 1K/2K) */
>> +       uint32_t faw;
>> +};
>> +
>> +/*
>> + * Delay configuration for individual signals
>> + * Vref setting
>> + * Scrambler seed
>
> What do the above two lines mean?

I think these are DDR technology terms. I did not change this in v2.

>> + */
>> +struct mrc_timings {
>> +       uint32_t rcvn[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
>> +       uint32_t rdqs[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
>> +       uint32_t wdqs[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
>> +       uint32_t wdq[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
>> +       uint32_t vref[NUM_CHANNELS][NUM_BYTE_LANES];
>> +       uint32_t wctl[NUM_CHANNELS][NUM_RANKS];
>> +       uint32_t wcmd[NUM_CHANNELS];
>> +       uint32_t scrambler_seed;
>
> Comments for the above?

Again, these terms are too DDR-specific. I did not add any comments here.

>> +       /* need to save for the case of frequency change */
>> +       uint8_t ddr_speed;
>> +};
>> +
>> +/* Boot mode defined as bit mask (1<<n) */
>> +enum {
>> +       BM_UNKNOWN,
>> +       BM_COLD = 1,    /* full training */
>> +       BM_FAST = 2,    /* restore timing parameters */
>> +       BM_S3   = 4,    /* resume from S3 */
>> +       BM_WARM = 8
>> +};
>> +
>> +/* MRC execution status */
>> +#define MRC_SUCCESS    0       /* initialization ok */
>> +#define MRC_E_MEMTEST  1       /* memtest failed */
>> +
>> +/* Input/output/context parameters for Memory Reference Code */
>> +struct mrc_params {
>> +       /* Global Settings */
>> +
>> +       /* BM_COLD, BM_FAST, BM_WARM, BM_S3 */
>> +       uint32_t boot_mode;
>> +       uint8_t first_run;
>> +
>> +       /* DRAM Parameters */
>> +
>
> Remove blank line

Fixed.

>> +       uint8_t dram_width;             /* x8, x16 */
>> +       uint8_t ddr_speed;              /* DDRFREQ_800, DDRFREQ_1066 */
>> +       uint8_t ddr_type;               /* DDR3, DDR3L */
>> +       uint8_t ecc_enables;            /* 0, 1 (memory size reduced to 7/8) */
>> +       uint8_t scrambling_enables;     /* 0, 1 */
>> +       /* 1, 3 (1'st rank has to be populated if 2'nd rank present) */
>> +       uint32_t rank_enables;
>> +       uint32_t channel_enables;       /* 1 only */
>> +       uint32_t channel_width;         /* x16 only */
>> +       /* 0, 1, 2 (mode 2 forced if ecc enabled) */
>> +       uint32_t address_mode;
>> +       /* REFRESH_RATE: 1=1.95us, 2=3.9us, 3=7.8us, others=RESERVED */
>> +       uint8_t refresh_rate;
>> +       /* SR_TEMP_RANGE: 0=normal, 1=extended, others=RESERVED */
>> +       uint8_t sr_temp_range;
>> +       /*
>> +        * RON_VALUE: 0=34ohm, 1=40ohm, others=RESERVED
>> +        * (select MRS1.DIC driver impedance control)
>> +        */
>> +       uint8_t ron_value;
>> +       /* RTT_NOM_VALUE: 0=40ohm, 1=60ohm, 2=120ohm, others=RESERVED */
>> +       uint8_t rtt_nom_value;
>> +       /* RD_ODT_VALUE: 0=off, 1=60ohm, 2=120ohm, 3=180ohm, others=RESERVED */
>> +       uint8_t rd_odt_value;
>> +       struct dram_params params;
>> +
>> +       /* Internally Used */
>
> I think I know what this means? It's unfortunate to have
> input/output/working data in the same structure but this seems to be
> the approach taken, so let's keep it. But can you add a comment above
> the struct saying how it is split into multiple parts?

Fixed.

>> +
>> +       /* internally used for board layout (use x8 or x16 memory) */
>> +       uint32_t board_id;
>> +       /* when set hte reconfiguration requested */
>> +       uint32_t hte_setup:1;
>> +       uint32_t menu_after_mrc:1;
>> +       uint32_t power_down_disable:1;
>> +       uint32_t tune_rcvn:1;
>
> Should these be bool? I'm not sure the :1 helps much - are you trying
> to save memory?

I just removed the :1 in v2.

>> +       uint32_t channel_size[NUM_CHANNELS];
>> +       uint32_t column_bits[NUM_CHANNELS];
>> +       uint32_t row_bits[NUM_CHANNELS];
>> +       /* register content saved during training */
>> +       uint32_t mrs1;
>> +
>> +       /* Output */
>> +
>> +       /* initialization result (non zero specifies error code) */
>> +       uint32_t status;
>> +       /* total memory size in bytes (excludes ECC banks) */
>> +       uint32_t mem_size;
>> +       /* training results (also used on input) */
>> +       struct mrc_timings timings;
>> +};
>> +
>
> This one needs comments:

Fixed.

>> +struct mem_init {
>> +       uint16_t post_code;
>> +       uint16_t boot_path;
>> +       void (*init_fn)(struct mrc_params *mrc_params);
>> +};
>> +
>> +/* MRC platform data flags */
>> +#define MRC_FLAG_ECC_EN                0x00000001
>> +#define MRC_FLAG_SCRAMBLE_EN   0x00000002
>> +#define MRC_FLAG_MEMTEST_EN    0x00000004
>> +/* 0b DDR "fly-by" topology else 1b DDR "tree" topology */
>> +#define MRC_FLAG_TOP_TREE_EN   0x00000008
>> +/* If set ODR signal is asserted to DRAM devices on writes */
>> +#define MRC_FLAG_WR_ODT_EN     0x00000010
>> +
>> +/**
>> + * mrc - Memory Reference Code entry routine
>> + *
>> + * @mrc_params: parameters for MRC
>> + */
>> +void mrc(struct mrc_params *mrc_params);
>
> How about sdram_init() or mrc_init()?

Changed it to mrc_init() in v2.
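
To make the role of struct mem_init above a bit clearer, here is a rough
sketch of how a table of such entries could be walked by the entry
routine. This is an illustration only: the boot-path bit values, the
boot_mode member, the steps_run counter and the two step functions are
all assumptions of mine and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical boot-path bits; the real values are not shown in this thread */
#define BOOT_PATH_COLD	0x01
#define BOOT_PATH_FAST	0x02

/* Cut-down stand-in for the real structure */
struct mrc_params {
	uint32_t boot_mode;	/* assumed: the boot path being taken */
	uint32_t status;
	uint32_t steps_run;	/* illustration only: records executed steps */
};

struct mem_init {
	uint16_t post_code;
	uint16_t boot_path;
	void (*init_fn)(struct mrc_params *mrc_params);
};

static void step_all_paths(struct mrc_params *p)
{
	p->steps_run += 0x10;
}

static void step_cold_only(struct mrc_params *p)
{
	p->steps_run += 0x01;
}

static const struct mem_init init_table[] = {
	{ 0x0101, BOOT_PATH_COLD | BOOT_PATH_FAST, step_all_paths },
	{ 0x0200, BOOT_PATH_COLD, step_cold_only },
};

/* Run every step whose boot_path mask includes the current boot mode */
void run_init_steps(struct mrc_params *p)
{
	size_t i;

	assert(p);
	for (i = 0; i < sizeof(init_table) / sizeof(init_table[0]); i++) {
		if (p->boot_mode & init_table[i].boot_path)
			init_table[i].init_fn(p);
	}
}
```

The point of the table form is that a future fast boot path only needs
different boot_path masks, not a second copy of the sequence.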

>> +
>> +#endif /* _MRC_H_ */
>> --
>> 1.8.2.1
>>
>
> Regards,
> Simon

Regards,
Bin


* [U-Boot] [RFC PATCH 4/9] x86: quark: Add utility codes needed for MRC
  2015-02-04 16:24   ` Simon Glass
@ 2015-02-05 14:25     ` Bin Meng
  0 siblings, 0 replies; 29+ messages in thread
From: Bin Meng @ 2015-02-05 14:25 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Thu, Feb 5, 2015 at 12:24 AM, Simon Glass <sjg@chromium.org> wrote:
> Hi Bin,
>
> On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
>> Add various utility codes needed for Quark MRC.
>>
>> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>>
>> ---
>> There are 12 checkpatch warnings in this patch, which are:
>>
>> warning: arch/x86/cpu/quark/mrc_util.c,1446: Too many leading tabs - consider code refactoring
>> warning: arch/x86/cpu/quark/mrc_util.c,1450: line over 80 characters
>> ...
>>
>> Fixing 'Too many leading tabs ...' will be very dangerous, as I don't have
>> all the details on how Intel's MRC codes are actually written to play with
>> the hardware. Trying to refactor them may lead to a non-working MRC codes.
>> For the 'line over 80 characters' issue, we have to leave them as is now
>> due to the 'Too many leading tabs ...', sigh.
>
> The code looks fine for the most part - I only have nits.
>
> I'm not keen on BIT though. See my comments and what improvements you
> can make. It would be great to drop BIT.

I found it hard to replace BIT with something meaningful, as lots of
registers are undocumented.

> Re the debug macros, I suppose they are OK to keep. U-Boot doesn't
> have the concept of debug() for different categories or levels of
> verbosity.

Yep, maybe we can enhance U-Boot's debug() in the future.
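
As a strawman for what a category-aware debug print could look like,
here is a minimal sketch written against hosted C. The D_* names are the
ones DPF() is called with in this patch, but the bit values, dpf(),
dpf_set_mask() and the mask variable are assumptions for illustration
and not an existing U-Boot interface.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical category bits, named after the DPF() levels in the patch */
#define D_INFO	0x01
#define D_REGWR	0x02
#define D_TRN	0x04

/* Categories currently enabled; only D_INFO by default in this sketch */
static unsigned int dpf_mask = D_INFO;

void dpf_set_mask(unsigned int mask)
{
	dpf_mask = mask;
}

/*
 * Print only when the category is enabled.
 * Returns the number of characters written, or 0 if filtered out.
 */
int dpf(unsigned int category, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(fmt);
	if (!(category & dpf_mask))
		return 0;

	va_start(ap, fmt);
	n = vprintf(fmt, ap);
	va_end(ap);

	return n;
}
```

With something like this, the register-write trace (D_REGWR) could stay
compiled in but silent unless explicitly enabled.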

>>
>>  arch/x86/cpu/quark/hte.c      |  398 +++++++++++
>>  arch/x86/cpu/quark/hte.h      |   44 ++
>>  arch/x86/cpu/quark/mrc_util.c | 1499 +++++++++++++++++++++++++++++++++++++++++
>>  arch/x86/cpu/quark/mrc_util.h |  153 +++++
>>  4 files changed, 2094 insertions(+)
>>  create mode 100644 arch/x86/cpu/quark/hte.c
>>  create mode 100644 arch/x86/cpu/quark/hte.h
>>  create mode 100644 arch/x86/cpu/quark/mrc_util.c
>>  create mode 100644 arch/x86/cpu/quark/mrc_util.h
>>
>> diff --git a/arch/x86/cpu/quark/hte.c b/arch/x86/cpu/quark/hte.c
>> new file mode 100644
>> index 0000000..d813c9c
>> --- /dev/null
>> +++ b/arch/x86/cpu/quark/hte.c
>> @@ -0,0 +1,398 @@
>> +/*
>> + * Copyright (C) 2013, Intel Corporation
>> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
>> + *
>> + * Ported from Intel released Quark UEFI BIOS
>> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
>
> Remove trailing slash?

Fixed globally.

>> + *
>> + * SPDX-License-Identifier:    Intel
>> + */
>> +
>> +#include <common.h>
>> +#include <asm/arch/mrc.h>
>> +#include <asm/arch/msg_port.h>
>> +#include "mrc_util.h"
>> +#include "hte.h"
>> +
>> +/**
>> + * This function enables HTE to detect all possible errors for
>
> s/This function// globally
>
> I'd suggest present tense, like "enable HTE to detect all possible errors for...

Fixed globally.

>> + * the given training parameters (per-bit or full byte lane).
>> + */
>> +static void hte_enable_all_errors(void)
>> +{
>> +       msg_port_write(HTE, 0x000200A2, 0xFFFFFFFF);
>> +       msg_port_write(HTE, 0x000200A3, 0x000000FF);
>> +       msg_port_write(HTE, 0x000200A4, 0x00000000);
>
> Lower case hex again.

All of Intel's MRC code uses upper-case hex. To keep it consistent (and
because I did not want to convert every occurrence to lower case), I
chose not to change these and left them as is for now.

>> +}
>> +
>> +/**
>> + * This function goes and reads the HTE register in order to find any error
>> + *
>> + * @return: The errors detected in the HTE status register
>> + */
>> +static u32 hte_check_errors(void)
>> +{
>> +       return msg_port_read(HTE, 0x000200A7);
>> +}
>> +
>> +/**
>> + * This function waits until HTE finishes
>> + */
>> +static void hte_wait_for_complete(void)
>> +{
>> +       u32 tmp;
>> +
>> +       ENTERFN();
>> +
>> +       do {} while ((msg_port_read(HTE, 0x00020012) & BIT30) != 0);
>> +
>> +       tmp = msg_port_read(HTE, 0x00020011);
>> +       tmp |= BIT9;
>> +       tmp &= ~(BIT12 | BIT13);
>> +       msg_port_write(HTE, 0x00020011, tmp);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/**
>> + * This function clears registers related with errors in the HTE
>> + */
>> +static void hte_clear_error_regs(void)
>> +{
>> +       u32 tmp;
>> +
>> +       /*
>> +        * Clear all HTE errors and enable error checking
>> +        * for burst and chunk.
>> +        */
>> +       tmp = msg_port_read(HTE, 0x000200A1);
>> +       tmp |= BIT8;
>> +       msg_port_write(HTE, 0x000200A1, tmp);
>> +}
>> +
>> +/**
>> + * This function executes basic single cache line memory write/read/verify
>> + * test using simple constant pattern, different for READ_RAIN and
>
> READ_TRAIN?

Fixed globally.

>> + * WRITE_TRAIN modes.
>> + *
>> + * See hte_basic_write_read() which is external visible wrapper.
>
> the external (fix below also)
>
>> + *
>> + * @mrc_params: host struture for all MRC global data
>> + * @addr: memory adress being tested (must hit specific channel/rank)
>> + * @first_run: if set then hte registers are configured, otherwise it is
>
> the hte?

Changed to 'the HTE'

>> + *             assumed configuration is done and just re-run the test
>
> assumed configuration is done, then we just re-run the test
> (fix below also)

Fixed.

>> + * @mode: READ_TRAIN or WRITE_TRAIN (the difference is in the pattern)
>> + *
>> + * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
>> + */
>> +static u16 hte_basic_data_cmp(struct mrc_params *mrc_params, u32 addr,
>> +                             u8 first_run, u8 mode)
>> +{
>> +       u32 pattern;
>> +       u32 offset;
>> +
>> +       if (first_run) {
>> +               msg_port_write(HTE, 0x00020020, 0x01B10021);
>> +               msg_port_write(HTE, 0x00020021, 0x06000000);
>> +               msg_port_write(HTE, 0x00020022, addr >> 6);
>> +               msg_port_write(HTE, 0x00020062, 0x00800015);
>> +               msg_port_write(HTE, 0x00020063, 0xAAAAAAAA);
>> +               msg_port_write(HTE, 0x00020064, 0xCCCCCCCC);
>> +               msg_port_write(HTE, 0x00020065, 0xF0F0F0F0);
>> +               msg_port_write(HTE, 0x00020061, 0x00030008);
>> +
>> +               if (mode == WRITE_TRAIN)
>> +                       pattern = 0xC33C0000;
>> +               else /* READ_TRAIN */
>> +                       pattern = 0xAA5555AA;
>> +
>> +               for (offset = 0x80; offset <= 0x8F; offset++)
>> +                       msg_port_write(HTE, offset, pattern);
>> +       }
>> +
>> +       msg_port_write(HTE, 0x000200A1, 0xFFFF1000);
>> +       msg_port_write(HTE, 0x00020011, 0x00011000);
>> +       msg_port_write(HTE, 0x00020011, 0x00011100);
>> +
>> +       hte_wait_for_complete();
>> +
>> +       /*
>> +        * Return bits 15:8 of HTE_CH0_ERR_XSTAT to check for
>> +        * any bytelane errors.
>> +        */
>> +       return (hte_check_errors() >> 8) & 0xFF;
>> +}
>> +
>> +/**
>> + * This function examines single cache line memory with write/read/verify
>> + * test using multiple data patterns (victim-aggressor algorithm).
>> + *
>> + * See hte_write_stress_bit_lanes() which is external visible wrapper.
>> + *
>> + * @mrc_params: host struture for all MRC global data
>
> structure

Fixed globally.

>> + * @addr: memory adress being tested (must hit specific channel/rank)
>> + * @loop_cnt: number of test iterations
>> + * @seed_victim: victim data pattern seed
>> + * @seed_aggressor: aggressor data pattern seed
>> + * @victim_bit: should be 0 as auto rotate feature is in use
>
> auto-rotate

Fixed.

>> + * @first_run: if set then hte registers are configured, otherwise it is
>
> Actually I wonder if HTE would be better than hte, which looks like a
> 'the' typo, particularly if you leave out 'the'. Also can you please
> comment at the top of the file (first function) what HTE stands for?

Changed to 'the HTE'. Intel's documentation only mentions MTE (Memory
Training Engine), and its registers are undocumented. I am not sure
whether this HTE is the MTE in Intel's documentation, so I have not
added a comment explaining HTE.

>> + *             assumed configuration is done and just re-run the test
>> + *
>> + * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
>> + */
>> +static u16 hte_rw_data_cmp(struct mrc_params *mrc_params, u32 addr,
>> +                          u8 loop_cnt, u32 seed_victim, u32 seed_aggressor,
>> +                          u8 victim_bit, u8 first_run)
>> +{
>> +       u32 offset;
>> +       u32 tmp;
>> +
>> +       if (first_run) {
>> +               msg_port_write(HTE, 0x00020020, 0x00910024);
>> +               msg_port_write(HTE, 0x00020023, 0x00810024);
>> +               msg_port_write(HTE, 0x00020021, 0x06070000);
>> +               msg_port_write(HTE, 0x00020024, 0x06070000);
>> +               msg_port_write(HTE, 0x00020022, addr >> 6);
>> +               msg_port_write(HTE, 0x00020025, addr >> 6);
>> +               msg_port_write(HTE, 0x00020062, 0x0000002A);
>> +               msg_port_write(HTE, 0x00020063, seed_victim);
>> +               msg_port_write(HTE, 0x00020064, seed_aggressor);
>> +               msg_port_write(HTE, 0x00020065, seed_victim);
>> +
>> +               /*
>> +                * Write the pattern buffers to select the victim bit
>> +                *
>> +                * Start with bit0
>> +                */
>> +               for (offset = 0x80; offset <= 0x8F; offset++) {
>> +                       if ((offset % 8) == victim_bit)
>> +                               msg_port_write(HTE, offset, 0x55555555);
>> +                       else
>> +                               msg_port_write(HTE, offset, 0xCCCCCCCC);
>> +               }
>> +
>> +               msg_port_write(HTE, 0x00020061, 0x00000000);
>> +               msg_port_write(HTE, 0x00020066, 0x03440000);
>> +               msg_port_write(HTE, 0x000200A1, 0xFFFF1000);
>> +       }
>> +
>> +       tmp = 0x10001000 | (loop_cnt << 16);
>> +       msg_port_write(HTE, 0x00020011, tmp);
>> +       msg_port_write(HTE, 0x00020011, tmp | BIT8);
>> +
>> +       hte_wait_for_complete();
>> +
>> +       /*
>> +        * Return bits 15:8 of HTE_CH0_ERR_XSTAT to check for
>> +        * any bytelane errors.
>> +        */
>> +       return (hte_check_errors() >> 8) & 0xFF;
>> +}
>> +
>> +/**
>> + * This function uses HW HTE engine to initialize or test all memory attached
>> + * to a given DUNIT. If flag is MRC_MEM_INIT, this routine writes 0s to all
>> + * memory locations to initialize ECC. If flag is MRC_MEM_TEST, this routine
>> + * will send an 5AA55AA5 pattern to all memory locations on the RankMask and
>> + * then read it back. Then it sends an A55AA55A pattern to all memory locations
>> + * on the RankMask and reads it back.
>> + *
>> + * @mrc_params: host struture for all MRC global data
>> + * @flag: MRC_MEM_INIT or MRC_MEM_TEST
>> + *
>> + * @return: errors register showing HTE failures. Also prints out which rank
>> + *          failed the HTE test if failure occurs. For rank detection to work,
>> + *          the address map must be left in its default state. If MRC changes
>> + *          the address map, this function must be modified to change it back
>> + *          to default at the beginning, then restore it at the end.
>> + */
>> +u32 hte_mem_init(struct mrc_params *mrc_params, u8 flag)
>> +{
>> +       u32 offset;
>> +       int test_num;
>> +       int i;
>> +
>> +       /*
>> +        * Clear out the error registers at the start of each memory
>> +        * init or memory test run.
>> +        */
>> +       hte_clear_error_regs();
>> +
>> +       msg_port_write(HTE, 0x00020062, 0x00000015);
>> +
>> +       for (offset = 0x80; offset <= 0x8F; offset++)
>> +               msg_port_write(HTE, offset, ((offset & 1) ? 0xA55A : 0x5AA5));
>> +
>> +       msg_port_write(HTE, 0x00020021, 0x00000000);
>> +       msg_port_write(HTE, 0x00020022, (mrc_params->mem_size >> 6) - 1);
>> +       msg_port_write(HTE, 0x00020063, 0xAAAAAAAA);
>> +       msg_port_write(HTE, 0x00020064, 0xCCCCCCCC);
>> +       msg_port_write(HTE, 0x00020065, 0xF0F0F0F0);
>> +       msg_port_write(HTE, 0x00020066, 0x03000000);
>> +
>> +       switch (flag) {
>> +       case MRC_MEM_INIT:
>> +               /*
>> +                * Only 1 write pass through memory is needed
>> +                * to initialize ECC
>> +                */
>> +               test_num = 1;
>> +               break;
>> +       case MRC_MEM_TEST:
>> +               /* Write/read then write/read with inverted pattern */
>> +               test_num = 4;
>> +               break;
>> +       default:
>> +               DPF(D_INFO, "Unknown parameter for flag: %d\n", flag);
>> +               return 0xFFFFFFFF;
>> +       }
>> +
>> +       DPF(D_INFO, "hte_mem_init");
>
> debug()

I kept DPF() here.

>> +
>> +       for (i = 0; i < test_num; i++) {
>> +               DPF(D_INFO, ".");
>> +
>> +               if (i == 0) {
>> +                       msg_port_write(HTE, 0x00020061, 0x00000000);
>> +                       msg_port_write(HTE, 0x00020020, 0x00110010);
>> +               } else if (i == 1) {
>> +                       msg_port_write(HTE, 0x00020061, 0x00000000);
>> +                       msg_port_write(HTE, 0x00020020, 0x00010010);
>> +               } else if (i == 2) {
>> +                       msg_port_write(HTE, 0x00020061, 0x00010100);
>> +                       msg_port_write(HTE, 0x00020020, 0x00110010);
>> +               } else {
>> +                       msg_port_write(HTE, 0x00020061, 0x00010100);
>> +                       msg_port_write(HTE, 0x00020020, 0x00010010);
>> +               }
>> +
>> +               msg_port_write(HTE, 0x00020011, 0x00111000);
>> +               msg_port_write(HTE, 0x00020011, 0x00111100);
>> +
>> +               hte_wait_for_complete();
>> +
>> +               /* If this is a READ pass, check for errors at the end */
>> +               if ((i % 2) == 1) {
>> +                       /* Return immediately if error */
>> +                       if (hte_check_errors())
>> +                               break;
>> +               }
>> +       }
>> +
>> +       DPF(D_INFO, "done\n");
>> +
>> +       return hte_check_errors();
>> +}
>> +
>> +/**
>> + * This function executes basic single cache line memory write/read/verify
>
> 'executes a basic'

Fixed.
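
On hte_mem_init() above: the MRC_MEM_TEST case runs four passes (write,
read, write inverted, read inverted) and only looks at the error
register after the read passes. The HTE does this in hardware; the
sketch below is only a software analogue of the same sequencing over an
ordinary buffer, using the two patterns named in the function comment.
mem_test_passes() and its return convention are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PATTERN		0x5AA55AA5u
#define PATTERN_INV	0xA55AA55Au

/*
 * Passes 0 and 2 write, passes 1 and 3 read back and verify.
 * Passes 0-1 use PATTERN, passes 2-3 use the inverted pattern.
 * Returns 0 on success, or the 1-based number of the failing pass.
 */
int mem_test_passes(volatile uint32_t *buf, size_t words)
{
	int pass;
	size_t i;

	assert(buf);
	for (pass = 0; pass < 4; pass++) {
		uint32_t pat = (pass < 2) ? PATTERN : PATTERN_INV;

		if ((pass % 2) == 0) {
			/* write pass */
			for (i = 0; i < words; i++)
				buf[i] = pat;
		} else {
			/* read pass: stop at the first mismatch */
			for (i = 0; i < words; i++) {
				if (buf[i] != pat)
					return pass + 1;
			}
		}
	}

	return 0;
}
```

MRC_MEM_INIT corresponds to stopping after the first write pass, which
is all that is needed to give ECC a defined starting state.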

>> + * test using simple constant pattern, different for READ_RAIN and
>> + * WRITE_TRAIN modes.
>> + *
>> + * @mrc_params: host struture for all MRC global data
>
> structure, please fix globally

Fixed globally.

>> + * @addr: memory adress being tested (must hit specific channel/rank)
>> + * @first_run: if set then hte registers are configured, otherwise it is
>> + *             assumed configuration is done and just re-run the test
>> + * @mode: READ_TRAIN or WRITE_TRAIN (the difference is in the pattern)
>> + *
>> + * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
>> + */
>> +u16 hte_basic_write_read(struct mrc_params *mrc_params, u32 addr,
>> +                        u8 first_run, u8 mode)
>
> Why do we use u8 for these? Would uint be good enough? Just a suggestion.
>
>> +{
>> +       u16 errors;
>> +
>> +       ENTERFN();
>> +
>> +       /* Enable all error reporting in preparation for HTE test */
>> +       hte_enable_all_errors();
>> +       hte_clear_error_regs();
>> +
>> +       errors = hte_basic_data_cmp(mrc_params, addr, first_run, mode);
>> +
>> +       LEAVEFN();
>> +
>> +       return errors;
>> +}
>> +
>> +/**
>> + * This function examines single cache line memory with write/read/verify
>
> examines a single-cache-line memory
>
> (at least I think this is what it is saying)

Fixed globally.

>> + * test using multiple data patterns (victim-aggressor algorithm).
>> + *
>> + * @mrc_params: host struture for all MRC global data
>> + * @addr: memory adress being tested (must hit specific channel/rank)
>> + * @first_run: if set then hte registers are configured, otherwise it is
>> + *             assumed configuration is done and just re-run the test
>> + *
>> + * @return: byte lane failure on each bit (for Quark only bit0 and bit1)
>> + */
>> +u16 hte_write_stress_bit_lanes(struct mrc_params *mrc_params,
>> +                              u32 addr, u8 first_run)
>> +{
>> +       u16 errors;
>> +       u8 victim_bit = 0;
>> +
>> +       ENTERFN();
>> +
>> +       /* Enable all error reporting in preparation for HTE test */
>> +       hte_enable_all_errors();
>> +       hte_clear_error_regs();
>> +
>> +       /*
>> +        * Loop through each bit in the bytelane.
>> +        *
>> +        * Each pass creates a victim bit while keeping all other bits the same
>> +        * as aggressors. AVN HTE adds an auto-rotate feature which allows us
>> +        * to program the entire victim/aggressor sequence in 1 step.
>
> What is AVN?

I don't know. I guess it might be Intel's Avoton core in some Atom
processors, so I did not change the comment.

>> +        *
>> +        * The victim bit rotates on each pass so no need to have software
>> +        * implement a victim bit loop like on VLV.
>
> VLV? I think it is sometimes better to write these out and put the
> abbreviation in brackets after it, at least once in the file.

As above, I guess it is Valleyview (VLV), but I don't know whether that
is correct, so I kept it unchanged.

>
>> +        */
>> +       errors = hte_rw_data_cmp(mrc_params, addr, HTE_LOOP_CNT,
>> +                                HTE_LFSR_VICTIM_SEED, HTE_LFSR_AGRESSOR_SEED,
>> +                                victim_bit, first_run);
>> +
>> +       LEAVEFN();
>> +
>> +       return errors;
>> +}
>> +
>> +/**
>> + * This function execute basic single cache line memory write or read.
>
> as above
>
>> + * This is just for receive enable / fine write levelling purpose.
>
> write-levelling (I think that's what you mean)

Fixed

>> + *
>> + * @addr: memory adress being tested (must hit specific channel/rank)
>> + * @first_run: if set then hte registers are configured, otherwise it is
>> + *             assumed configuration is done and just re-run the test
>> + * @is_write: when non-zero memory write operation executed, otherwise read
>> + */
>> +void hte_mem_op(u32 addr, u8 first_run, u8 is_write)
>> +{
>> +       u32 offset;
>> +       u32 tmp;
>> +
>> +       hte_enable_all_errors();
>> +       hte_clear_error_regs();
>> +
>> +       if (first_run) {
>> +               tmp = is_write ? 0x01110021 : 0x01010021;
>> +               msg_port_write(HTE, 0x00020020, tmp);
>> +
>> +               msg_port_write(HTE, 0x00020021, 0x06000000);
>> +               msg_port_write(HTE, 0x00020022, addr >> 6);
>> +               msg_port_write(HTE, 0x00020062, 0x00800015);
>> +               msg_port_write(HTE, 0x00020063, 0xAAAAAAAA);
>> +               msg_port_write(HTE, 0x00020064, 0xCCCCCCCC);
>> +               msg_port_write(HTE, 0x00020065, 0xF0F0F0F0);
>> +               msg_port_write(HTE, 0x00020061, 0x00030008);
>> +
>> +               for (offset = 0x80; offset <= 0x8F; offset++)
>> +                       msg_port_write(HTE, offset, 0xC33C0000);
>> +       }
>> +
>> +       msg_port_write(HTE, 0x000200A1, 0xFFFF1000);
>> +       msg_port_write(HTE, 0x00020011, 0x00011000);
>> +       msg_port_write(HTE, 0x00020011, 0x00011100);
>> +
>> +       hte_wait_for_complete();
>> +}
>> diff --git a/arch/x86/cpu/quark/hte.h b/arch/x86/cpu/quark/hte.h
>> new file mode 100644
>> index 0000000..3a173ea
>> --- /dev/null
>> +++ b/arch/x86/cpu/quark/hte.h
>> @@ -0,0 +1,44 @@
>> +/*
>> + * Copyright (C) 2013, Intel Corporation
>> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
>> + *
>> + * Ported from Intel released Quark UEFI BIOS
>> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
>> + *
>> + * SPDX-License-Identifier:    Intel
>> + */
>> +
>> +#ifndef _HTE_H_
>> +#define _HTE_H_
>> +
>> +enum {
>> +       MRC_MEM_INIT,
>> +       MRC_MEM_TEST
>> +};
>> +
>> +enum {
>> +       READ_TRAIN,
>> +       WRITE_TRAIN
>> +};
>> +
>> +/*
>> + * EXP_LOOP_CNT field of HTE_CMD_CTL
>> + *
>> + * This CANNOT be less than 4!
>> + */
>> +#define HTE_LOOP_CNT           5
>> +
>> +/* random seed for victim */
>> +#define HTE_LFSR_VICTIM_SEED   0xF294BA21
>> +
>> +/* random seed for aggressor */
>> +#define HTE_LFSR_AGRESSOR_SEED 0xEBA7492D
>> +
>> +u32 hte_mem_init(struct mrc_params *mrc_params, u8 flag);
>> +u16 hte_basic_write_read(struct mrc_params *mrc_params, u32 addr,
>> +                        u8 first_run, u8 mode);
>> +u16 hte_write_stress_bit_lanes(struct mrc_params *mrc_params,
>> +                              u32 addr, u8 first_run);
>> +void hte_mem_op(u32 addr, u8 first_run, u8 is_write);
>
> Can you move the comments from the .c to the .h for these exported functions?
>

These routines are only used by MRC internally and are not public APIs,
so I did not move the comments to the header file.

>> +
>> +#endif /* _HTE_H_ */
>> diff --git a/arch/x86/cpu/quark/mrc_util.c b/arch/x86/cpu/quark/mrc_util.c
>> new file mode 100644
>> index 0000000..1ae42d6
>> --- /dev/null
>> +++ b/arch/x86/cpu/quark/mrc_util.c
>> @@ -0,0 +1,1499 @@
>> +/*
>> + * Copyright (C) 2013, Intel Corporation
>> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
>> + *
>> + * Ported from Intel released Quark UEFI BIOS
>> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
>> + *
>> + * SPDX-License-Identifier:    Intel
>> + */
>> +
>> +#include <common.h>
>> +#include <asm/arch/device.h>
>> +#include <asm/arch/mrc.h>
>> +#include <asm/arch/msg_port.h>
>> +#include "mrc_util.h"
>> +#include "hte.h"
>> +#include "smc.h"
>> +
>> +static const uint8_t vref_codes[64] = {
>> +       /* lowest to highest */
>> +       0x3F, 0x3E, 0x3D, 0x3C, 0x3B, 0x3A, 0x39, 0x38,
>> +       0x37, 0x36, 0x35, 0x34, 0x33, 0x32, 0x31, 0x30,
>> +       0x2F, 0x2E, 0x2D, 0x2C, 0x2B, 0x2A, 0x29, 0x28,
>> +       0x27, 0x26, 0x25, 0x24, 0x23, 0x22, 0x21, 0x20,
>> +       0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
>> +       0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
>> +       0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
>> +       0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F
>> +};
>> +
>> +void mrc_write_mask(u32 unit, u32 addr, u32 data, u32 mask)
>> +{
>> +       msg_port_write(unit, addr,
>> +                      (msg_port_read(unit, addr) & ~(mask)) |
>> +                      ((data) & (mask)));
>> +}
>> +
>> +void mrc_alt_write_mask(u32 unit, u32 addr, u32 data, u32 mask)
>> +{
>> +       msg_port_alt_write(unit, addr,
>> +                          (msg_port_alt_read(unit, addr) & ~(mask)) |
>> +                          ((data) & (mask)));
>> +}
>> +
>> +void mrc_post_code(uint8_t major, uint8_t minor)
>> +{
>> +       /* send message to UART */
>> +       DPF(D_INFO, "POST: 0x%01x%02x\n", major, minor);
>> +
>> +       /* error check */
>> +       if (major == 0xEE)
>> +               hang();
>> +}
>> +
>> +/* Delay number of nanoseconds */
>> +void delay_n(uint32_t ns)
>> +{
>> +       /* 1000 MHz clock has 1ns period --> no conversion required */
>> +       uint64_t final_tsc = rdtsc();
>
> blank line here after declarations end
>
Fixed.

>> +       final_tsc += ((get_tbclk_mhz() * ns) / 1000);
>> +
>> +       while (rdtsc() < final_tsc)
>> +               ;
>> +}
>> +
>> +/* Delay number of microseconds */
>> +void delay_u(uint32_t ms)
>> +{
>> +       /* 64-bit math is not an option, just use loops */
>> +       while (ms--)
>> +               delay_n(1000);
>> +}
>
> Some day I suspect these could be pulled out into general x86
> functions. Let's see if anything else needs them first.

Agreed.
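
If they do get pulled out, the tick conversion in delay_n() is the part
that deserves care. A minimal sketch of that step follows, assuming a
hosted compiler. ns_to_tsc_ticks() is a hypothetical name, and the
64-bit intermediate is my addition: if the product is evaluated in 32
bits it can wrap for long delays. Inside U-Boot on 32-bit x86 the
division would probably need a 64-bit divide helper rather than the
plain / operator.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a delay in nanoseconds to TSC ticks for a counter running at
 * tsc_mhz MHz: ticks = MHz * ns / 1000.
 * The product is widened to 64 bits before dividing so it cannot wrap.
 */
uint64_t ns_to_tsc_ticks(uint32_t tsc_mhz, uint32_t ns)
{
	assert(tsc_mhz > 0);

	return ((uint64_t)tsc_mhz * ns) / 1000;
}
```

For example, at an assumed 400 MHz a 1000 ns delay is 400 ticks, and
only at exactly 1000 MHz does the "no conversion required" remark in the
comment hold.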

>> +
>> +/* Select Memory Manager as the source for PRI interface */
>> +void select_mem_mgr(void)
>> +{
>> +       u32 dco;
>> +
>> +       ENTERFN();
>> +
>> +       dco = msg_port_read(MEM_CTLR, DCO);
>> +       dco &= ~BIT28;
>
> ~(1 << 28)
>
> Ah but I see you are using this everywhere.
>
> U-Boot tries to avoid defining this sort of thing. See some comments
> below about this.

Did not fix the BIT stuff.

>> +       msg_port_write(MEM_CTLR, DCO, dco);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/* Select HTE as the source for PRI interface */
>> +void select_hte(void)
>> +{
>> +       u32 dco;
>> +
>> +       ENTERFN();
>> +
>> +       dco = msg_port_read(MEM_CTLR, DCO);
>> +       dco |= BIT28;
>> +       msg_port_write(MEM_CTLR, DCO, dco);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * Send DRAM command
>> + * data should be formated using DCMD_Xxxx macro or emrsXCommand structure
>> + */
>> +void dram_init_command(uint32_t data)
>> +{
>> +       pci_write_config_dword(QUARK_HOST_BRIDGE, MSG_DATA_REG, data);
>> +       pci_write_config_dword(QUARK_HOST_BRIDGE, MSG_CTRL_EXT_REG, 0);
>> +       msg_port_setup(MSG_OP_DRAM_INIT, MEM_CTLR, 0);
>> +
>> +       DPF(D_REGWR, "WR32 %03X %08X %08X\n", MEM_CTLR, 0, data);
>> +}
>> +
>> +/* Send DRAM wake command using special MCU side-band WAKE opcode */
>> +void dram_wake_command(void)
>> +{
>> +       ENTERFN();
>> +
>> +       msg_port_setup(MSG_OP_DRAM_WAKE, MEM_CTLR, 0);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +void training_message(uint8_t channel, uint8_t rank, uint8_t byte_lane)
>> +{
>> +       /* send message to UART */
>> +       DPF(D_INFO, "CH%01X RK%01X BL%01X\n", channel, rank, byte_lane);
>> +}
>> +
>> +/*
>> + * This function will program the RCVEN delays
>> + *
>> + * (currently doesn't comprehend rank)
>> + */
>> +void set_rcvn(uint8_t channel, uint8_t rank,
>> +             uint8_t byte_lane, uint32_t pi_count)
>
> reformat to 80cols. Should this or any other function in this file be static?

These functions are used by MRC internally. I also did not act on the
'reformat to 80cols' comment, as I think the current formatting is fine.

>> +{
>> +       uint32_t reg;
>> +       uint32_t msk;
>> +       uint32_t temp;
>> +
>> +       ENTERFN();
>> +
>> +       DPF(D_TRN, "Rcvn ch%d rnk%d ln%d : pi=%03X\n",
>> +           channel, rank, byte_lane, pi_count);
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * BL0 -> B01PTRCTL0[11:08] (0x0-0xF)
>> +        * BL1 -> B01PTRCTL0[23:20] (0x0-0xF)
>> +        */
>> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET);
>> +       msk = (byte_lane & BIT0) ? (BIT23 | BIT22 | BIT21 | BIT20) :
>> +               (BIT11 | BIT10 | BIT9 | BIT8);
>
> Would this be better as:
>
> (0xf << 20) | (0xf << 8)
>
> It might be more meaningful also.
>
> I really don't think these long strings of | are nice.

I agree, but I did not fix this. Changing those globally is really
error-prone and may break the whole MRC (I could easily make a mistake
converting these bits to hex numbers).
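
If this is ever revisited, one way to reduce the risk would be to check
each converted mask against the original BIT string mechanically. Below
is a small sketch of that idea for the RDPTR mask in set_rcvn(), plus
the merge step that mrc_write_mask() performs. BIT(), field_mask(),
merge_masked() and rcvn_rdptr_mask() are hypothetical helpers written
for this example and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)	(1u << (n))	/* hypothetical local helper */

/* Contiguous mask covering bits hi..lo inclusive */
uint32_t field_mask(unsigned int hi, unsigned int lo)
{
	assert(hi < 32 && lo <= hi);

	return (0xFFFFFFFFu >> (31 - hi)) & (0xFFFFFFFFu << lo);
}

/* Merge step of a masked write: keep old bits outside mask, take data inside */
uint32_t merge_masked(uint32_t old, uint32_t data, uint32_t mask)
{
	return (old & ~mask) | (data & mask);
}

/* RDPTR mask selection from set_rcvn(), written with shifted nibbles */
uint32_t rcvn_rdptr_mask(uint8_t byte_lane)
{
	return (byte_lane & 1) ? (0xFu << 20) : (0xFu << 8);
}
```

Comparing rcvn_rdptr_mask(1) with BIT(23) | BIT(22) | BIT(21) | BIT(20)
in a host-side test would catch exactly the slip I am worried about.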

>> +       temp = (byte_lane & BIT0) ? ((pi_count / HALF_CLK) << 20) :
>> +               ((pi_count / HALF_CLK) << 8);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * BL0 -> B0DLLPICODER0[29:24] (0x00-0x3F)
>> +        * BL1 -> B1DLLPICODER0[29:24] (0x00-0x3F)
>
> lower case hex again

Did not fix these for consistency.

>> +        */
>> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
>> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET));
>> +       msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24);
>> +       temp = pi_count << 24;
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /*
>> +        * DEADBAND
>> +        * BL0/1 -> B01DBCTL1[08/11] (+1 select)
>> +        * BL0/1 -> B01DBCTL1[02/05] (enable)
>> +        */
>> +       reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET);
>> +       msk = 0x00;
>> +       temp = 0x00;
>> +
>> +       /* enable */
>> +       msk |= (byte_lane & BIT0) ? (BIT5) : (BIT2);
>
> Remove () around BIT5

Fixed globally.

>> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
>> +               temp |= msk;
>> +
>> +       /* select */
>> +       msk |= (byte_lane & BIT0) ? (BIT11) : (BIT8);
>> +       if (pi_count < EARLY_DB)
>> +               temp |= msk;
>
> These uses of BIT seem more useful to me.
>
> Still it would be better to have #defines for the bits which actually
> describe their meaning.
>
> Maybe you don't know the meaning though...

Yes!!!

>> +
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* error check */
>> +       if (pi_count > 0x3F) {
>> +               training_message(channel, rank, byte_lane);
>> +               mrc_post_code(0xEE, 0xE0);
>> +       }
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will return the current RCVEN delay on the given
>> + * channel, rank, byte_lane as an absolute PI count.
>> + *
>> + * (currently doesn't comprehend rank)
>> + */
>> +uint32_t get_rcvn(uint8_t channel, uint8_t rank, uint8_t byte_lane)
>> +{
>> +       uint32_t reg;
>> +       uint32_t temp;
>> +       uint32_t pi_count;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * BL0 -> B01PTRCTL0[11:08] (0x0-0xF)
>> +        * BL1 -> B01PTRCTL0[23:20] (0x0-0xF)
>> +        */
>> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET);
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= (byte_lane & BIT0) ? (20) : (8);
>> +       temp &= 0xF;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count = temp * HALF_CLK;
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * BL0 -> B0DLLPICODER0[29:24] (0x00-0x3F)
>> +        * BL1 -> B1DLLPICODER0[29:24] (0x00-0x3F)
>> +        */
>> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
>
> Please avoid () around simple constants. Put them in the #define/enum if needed.

Fixed globally by removing ().
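
A side note on the RCVEN helpers quoted here: set_rcvn() splits the
absolute PI count into a coarse RDPTR part and a fine PI part, and
get_rcvn() adds them back together. The sketch below shows just that
arithmetic, assuming HALF_CLK is 64 as the "1/2 MCLK, 64 PIs" comment
suggests. struct rcvn_delay, rcvn_split() and rcvn_join() are
hypothetical names for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define HALF_CLK	64u	/* assumed from the "1/2 MCLK, 64 PIs" comment */

struct rcvn_delay {
	uint32_t rdptr;	/* coarse part, 4-bit field (0x0-0xF) */
	uint32_t pi;	/* fine part, 6-bit field (0x00-0x3F) when valid */
};

/* Split an absolute PI count the way set_rcvn() does */
struct rcvn_delay rcvn_split(uint32_t pi_count)
{
	struct rcvn_delay d;

	d.rdptr = (pi_count / HALF_CLK) & 0xF;
	d.pi = pi_count - d.rdptr * HALF_CLK;

	return d;
}

/* Recombine the two fields the way get_rcvn() does */
uint32_t rcvn_join(struct rcvn_delay d)
{
	assert(d.rdptr <= 0xF && d.pi <= 0x3F);

	return d.rdptr * HALF_CLK + d.pi;
}
```

A count of 1024 or more leaves a fine part above 0x3F, which is the case
the "error check" at the end of set_rcvn() reports through
mrc_post_code(0xEE, 0xE0).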

>> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET));
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= 24;
>> +       temp &= 0x3F;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count += temp;
>> +
>> +       LEAVEFN();
>> +
>> +       return pi_count;
>> +}
>> +
>> +/*
>> + * This function will program the RDQS delays based on an absolute
>> + * amount of PIs.
>> + *
>> + * (currently doesn't comprehend rank)
>> + */
>> +void set_rdqs(uint8_t channel, uint8_t rank,
>> +             uint8_t byte_lane, uint32_t pi_count)
>> +{
>> +       uint32_t reg;
>> +       uint32_t msk;
>> +       uint32_t temp;
>> +
>> +       ENTERFN();
>> +       DPF(D_TRN, "Rdqs ch%d rnk%d ln%d : pi=%03X\n",
>> +           channel, rank, byte_lane, pi_count);
>> +
>> +       /*
>> +        * PI (1/128 MCLK)
>> +        * BL0 -> B0RXDQSPICODE[06:00] (0x00-0x47)
>> +        * BL1 -> B1RXDQSPICODE[06:00] (0x00-0x47)
>> +        */
>> +       reg = (byte_lane & BIT0) ? (B1RXDQSPICODE) : (B0RXDQSPICODE);
>> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET));
>> +       msk = (BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0);
>> +       temp = pi_count << 0;
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* error check (shouldn't go above 0x3F) */
>> +       if (pi_count > 0x47) {
>> +               training_message(channel, rank, byte_lane);
>> +               mrc_post_code(0xEE, 0xE1);
>> +       }
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will return the current RDQS delay on the given
>> + * channel, rank, byte_lane as an absolute PI count.
>> + *
>> + * (currently doesn't comprehend rank)
>> + */
>> +uint32_t get_rdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane)
>> +{
>> +       uint32_t reg;
>> +       uint32_t temp;
>> +       uint32_t pi_count;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * PI (1/128 MCLK)
>> +        * BL0 -> B0RXDQSPICODE[06:00] (0x00-0x47)
>> +        * BL1 -> B1RXDQSPICODE[06:00] (0x00-0x47)
>> +        */
>> +       reg = (byte_lane & BIT0) ? (B1RXDQSPICODE) : (B0RXDQSPICODE);
>> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET));
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count = temp & 0x7F;
>> +
>> +       LEAVEFN();
>> +
>> +       return pi_count;
>> +}
>> +
>> +/*
>> + * This function will program the WDQS delays based on an absolute
>> + * amount of PIs.
>> + *
>> + * (currently doesn't comprehend rank)
>> + */
>> +void set_wdqs(uint8_t channel, uint8_t rank,
>> +             uint8_t byte_lane, uint32_t pi_count)
>> +{
>> +       uint32_t reg;
>> +       uint32_t msk;
>> +       uint32_t temp;
>> +
>> +       ENTERFN();
>> +
>> +       DPF(D_TRN, "Wdqs ch%d rnk%d ln%d : pi=%03X\n",
>> +           channel, rank, byte_lane, pi_count);
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * BL0 -> B01PTRCTL0[07:04] (0x0-0xF)
>> +        * BL1 -> B01PTRCTL0[19:16] (0x0-0xF)
>> +        */
>> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET);
>> +       msk = (byte_lane & BIT0) ? (BIT19 | BIT18 | BIT17 | BIT16) :
>> +               (BIT7 | BIT6 | BIT5 | BIT4);
>> +       temp = pi_count / HALF_CLK;
>> +       temp <<= (byte_lane & BIT0) ? (16) : (4);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * BL0 -> B0DLLPICODER0[21:16] (0x00-0x3F)
>> +        * BL1 -> B1DLLPICODER0[21:16] (0x00-0x3F)
>> +        */
>> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
>> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET));
>> +       msk = (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16);
>> +       temp = pi_count << 16;
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /*
>> +        * DEADBAND
>> +        * BL0/1 -> B01DBCTL1[07/10] (+1 select)
>> +        * BL0/1 -> B01DBCTL1[01/04] (enable)
>> +        */
>> +       reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET);
>> +       msk = 0x00;
>> +       temp = 0x00;
>> +
>> +       /* enable */
>> +       msk |= (byte_lane & BIT0) ? (BIT4) : (BIT1);
>> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
>> +               temp |= msk;
>> +
>> +       /* select */
>> +       msk |= (byte_lane & BIT0) ? (BIT10) : (BIT7);
>> +       if (pi_count < EARLY_DB)
>> +               temp |= msk;
>> +
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* error check */
>> +       if (pi_count > 0x3F) {
>> +               training_message(channel, rank, byte_lane);
>> +               mrc_post_code(0xEE, 0xE2);
>> +       }
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will return the amount of WDQS delay on the given
>> + * channel, rank, byte_lane as an absolute PI count.
>> + *
>> + * (currently doesn't comprehend rank)
>> + */
>> +uint32_t get_wdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane)
>> +{
>> +       uint32_t reg;
>> +       uint32_t temp;
>> +       uint32_t pi_count;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * BL0 -> B01PTRCTL0[07:04] (0x0-0xF)
>> +        * BL1 -> B01PTRCTL0[19:16] (0x0-0xF)
>> +        */
>> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET);
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= (byte_lane & BIT0) ? (16) : (4);
>> +       temp &= 0xF;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count = (temp * HALF_CLK);
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * BL0 -> B0DLLPICODER0[21:16] (0x00-0x3F)
>> +        * BL1 -> B1DLLPICODER0[21:16] (0x00-0x3F)
>> +        */
>> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
>> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET));
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= 16;
>> +       temp &= 0x3F;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count += temp;
>> +
>> +       LEAVEFN();
>> +
>> +       return pi_count;
>> +}
>> +
>> +/*
>> + * This function will program the WDQ delays based on an absolute
>> + * number of PIs.
>> + *
>> + * (currently doesn't comprehend rank)
>> + */
>> +void set_wdq(uint8_t channel, uint8_t rank,
>> +            uint8_t byte_lane, uint32_t pi_count)
>> +{
>> +       uint32_t reg;
>> +       uint32_t msk;
>> +       uint32_t temp;
>> +
>> +       ENTERFN();
>> +
>> +       DPF(D_TRN, "Wdq ch%d rnk%d ln%d : pi=%03X\n",
>> +           channel, rank, byte_lane, pi_count);
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * BL0 -> B01PTRCTL0[03:00] (0x0-0xF)
>> +        * BL1 -> B01PTRCTL0[15:12] (0x0-0xF)
>> +        */
>> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET);
>> +       msk = (byte_lane & BIT0) ? (BIT15 | BIT14 | BIT13 | BIT12) :
>> +               (BIT3 | BIT2 | BIT1 | BIT0);
>> +       temp = pi_count / HALF_CLK;
>> +       temp <<= (byte_lane & BIT0) ? (12) : (0);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * BL0 -> B0DLLPICODER0[13:08] (0x00-0x3F)
>> +        * BL1 -> B1DLLPICODER0[13:08] (0x00-0x3F)
>> +        */
>> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
>> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET));
>> +       msk = (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8);
>> +       temp = pi_count << 8;
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /*
>> +        * DEADBAND
>> +        * BL0/1 -> B01DBCTL1[06/09] (+1 select)
>> +        * BL0/1 -> B01DBCTL1[00/03] (enable)
>> +        */
>> +       reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET);
>> +       msk = 0x00;
>> +       temp = 0x00;
>> +
>> +       /* enable */
>> +       msk |= (byte_lane & BIT0) ? (BIT3) : (BIT0);
>> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
>> +               temp |= msk;
>> +
>> +       /* select */
>> +       msk |= (byte_lane & BIT0) ? (BIT9) : (BIT6);
>> +       if (pi_count < EARLY_DB)
>> +               temp |= msk;
>> +
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* error check */
>> +       if (pi_count > 0x3F) {
>> +               training_message(channel, rank, byte_lane);
>> +               mrc_post_code(0xEE, 0xE3);
>> +       }
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will return the amount of WDQ delay on the given
>> + * channel, rank, byte_lane as an absolute PI count.
>> + *
>> + * (currently doesn't comprehend rank)
>> + */
>> +uint32_t get_wdq(uint8_t channel, uint8_t rank, uint8_t byte_lane)
>> +{
>> +       uint32_t reg;
>> +       uint32_t temp;
>> +       uint32_t pi_count;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * BL0 -> B01PTRCTL0[03:00] (0x0-0xF)
>> +        * BL1 -> B01PTRCTL0[15:12] (0x0-0xF)
>> +        */
>> +       reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET);
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= (byte_lane & BIT0) ? (12) : (0);
>> +       temp &= 0xF;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count = (temp * HALF_CLK);
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * BL0 -> B0DLLPICODER0[13:08] (0x00-0x3F)
>> +        * BL1 -> B1DLLPICODER0[13:08] (0x00-0x3F)
>> +        */
>> +       reg = (byte_lane & BIT0) ? (B1DLLPICODER0) : (B0DLLPICODER0);
>> +       reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) +
>> +               (channel * DDRIODQ_CH_OFFSET));
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= 8;
>> +       temp &= 0x3F;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count += temp;
>> +
>> +       LEAVEFN();
>> +
>> +       return pi_count;
>> +}
>> +
>> +/*
>> + * This function will program the WCMD delays based on an absolute
>> + * number of PIs.
>> + */
>> +void set_wcmd(uint8_t channel, uint32_t pi_count)
>> +{
>> +       uint32_t reg;
>> +       uint32_t msk;
>> +       uint32_t temp;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * CMDPTRREG[11:08] (0x0-0xF)
>> +        */
>> +       reg = CMDPTRREG + (channel * DDRIOCCC_CH_OFFSET);
>> +       msk = (BIT11 | BIT10 | BIT9 | BIT8);
>> +       temp = pi_count / HALF_CLK;
>> +       temp <<= 8;
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * CMDDLLPICODER0[29:24] -> CMDSLICE R3 (unused)
>> +        * CMDDLLPICODER0[21:16] -> CMDSLICE L3 (unused)
>> +        * CMDDLLPICODER0[13:08] -> CMDSLICE R2 (unused)
>> +        * CMDDLLPICODER0[05:00] -> CMDSLICE L2 (unused)
>> +        * CMDDLLPICODER1[29:24] -> CMDSLICE R1 (unused)
>> +        * CMDDLLPICODER1[21:16] -> CMDSLICE L1 (0x00-0x3F)
>> +        * CMDDLLPICODER1[13:08] -> CMDSLICE R0 (unused)
>> +        * CMDDLLPICODER1[05:00] -> CMDSLICE L0 (unused)
>> +        */
>> +       reg = CMDDLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET);
>> +
>> +       msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24 |
>> +               BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
>> +               BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
>> +               BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0);
>> +
>> +       temp = (pi_count << 24) | (pi_count << 16) |
>> +               (pi_count << 8) | (pi_count << 0);
>> +
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +       reg = CMDDLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET);  /* PO */
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /*
>> +        * DEADBAND
>> +        * CMDCFGREG0[17] (+1 select)
>> +        * CMDCFGREG0[16] (enable)
>> +        */
>> +       reg = CMDCFGREG0 + (channel * DDRIOCCC_CH_OFFSET);
>> +       msk = 0x00;
>> +       temp = 0x00;
>> +
>> +       /* enable */
>> +       msk |= BIT16;
>> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
>> +               temp |= msk;
>> +
>> +       /* select */
>> +       msk |= BIT17;
>> +       if (pi_count < EARLY_DB)
>> +               temp |= msk;
>> +
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* error check */
>> +       if (pi_count > 0x3F)
>> +               mrc_post_code(0xEE, 0xE4);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will return the amount of WCMD delay on the given
>> + * channel as an absolute PI count.
>> + */
>> +uint32_t get_wcmd(uint8_t channel)
>> +{
>> +       uint32_t reg;
>> +       uint32_t temp;
>> +       uint32_t pi_count;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * CMDPTRREG[11:08] (0x0-0xF)
>> +        */
>> +       reg = CMDPTRREG + (channel * DDRIOCCC_CH_OFFSET);
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= 8;
>> +       temp &= 0xF;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count = temp * HALF_CLK;
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * CMDDLLPICODER0[29:24] -> CMDSLICE R3 (unused)
>> +        * CMDDLLPICODER0[21:16] -> CMDSLICE L3 (unused)
>> +        * CMDDLLPICODER0[13:08] -> CMDSLICE R2 (unused)
>> +        * CMDDLLPICODER0[05:00] -> CMDSLICE L2 (unused)
>> +        * CMDDLLPICODER1[29:24] -> CMDSLICE R1 (unused)
>> +        * CMDDLLPICODER1[21:16] -> CMDSLICE L1 (0x00-0x3F)
>> +        * CMDDLLPICODER1[13:08] -> CMDSLICE R0 (unused)
>> +        * CMDDLLPICODER1[05:00] -> CMDSLICE L0 (unused)
>> +        */
>> +       reg = CMDDLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET);
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= 16;
>> +       temp &= 0x3F;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count += temp;
>> +
>> +       LEAVEFN();
>> +
>> +       return pi_count;
>> +}
>> +
>> +/*
>> + * This function will program the WCLK delays based on an absolute
>> + * number of PIs.
>> + */
>> +void set_wclk(uint8_t channel, uint8_t rank, uint32_t pi_count)
>> +{
>> +       uint32_t reg;
>> +       uint32_t msk;
>> +       uint32_t temp;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * CCPTRREG[15:12] -> CLK1 (0x0-0xF)
>> +        * CCPTRREG[11:08] -> CLK0 (0x0-0xF)
>> +        */
>> +       reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
>> +       msk = (BIT15 | BIT14 | BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8);
>
> mask = 0xff00 is much better, isn't it?

Agreed, but I did not fix it.
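
If this gets revisited later, one option is a small field-mask macro instead of OR-ing BITn constants together. A sketch only; FIELD_MASK is a hypothetical name and not an existing macro in this series:

```c
#include <stdint.h>

/* Hypothetical helper: mask covering bits hi..lo inclusive (0 <= lo <= hi <= 31) */
#define FIELD_MASK(hi, lo) \
	((uint32_t)(0xffffffffu >> (31 - (hi))) & (uint32_t)(0xffffffffu << (lo)))

/* CCPTRREG[15:08], i.e. the CLK1 and CLK0 read pointers together */
static const uint32_t ccptrreg_clk_msk = FIELD_MASK(15, 8);	/* 0xff00 */
```

This keeps the bit range visible, as in the register comments, without spelling out every bit.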

>> +       temp = ((pi_count / HALF_CLK) << 12) | ((pi_count / HALF_CLK) << 8);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * ECCB1DLLPICODER0[13:08] -> CLK0 (0x00-0x3F)
>> +        * ECCB1DLLPICODER0[21:16] -> CLK1 (0x00-0x3F)
>> +        */
>> +       reg = (rank) ? (ECCB1DLLPICODER0) : (ECCB1DLLPICODER0);
>> +       reg += (channel * DDRIOCCC_CH_OFFSET);
>> +       msk = (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
>> +               BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8);
>
> Ick!
>

Echo!!

>> +       temp = (pi_count << 16) | (pi_count << 8);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +       reg = (rank) ? (ECCB1DLLPICODER1) : (ECCB1DLLPICODER1);
>
> Remove all (), and below. Please fix globally.

Fixed globally.
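
As an aside, the set_xxx()/get_xxx() pairs quoted here all split an absolute PI count the same way: a 4-bit read pointer in units of HALF_CLK plus a 6-bit fine PI code. Below is a sketch of that arithmetic, assuming HALF_CLK is 64 as the "1/2 MCLK, 64 PIs" comments suggest. struct pi_split and both helpers are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define HALF_CLK	64	/* assumed from the "1/2 MCLK, 64 PIs" comments */

struct pi_split {
	uint32_t rdptr;	/* coarse part, 4-bit field, HALF_CLK units */
	uint32_t pi;	/* fine part, expected to fit in 6 bits */
};

/* Same arithmetic as the "Adjust PI_COUNT" step in the setters */
static struct pi_split split_pi(uint32_t pi_count)
{
	struct pi_split s;

	s.rdptr = (pi_count / HALF_CLK) & 0xF;
	s.pi = pi_count - s.rdptr * HALF_CLK;

	return s;
}

/* Same arithmetic as the getters: coarse * HALF_CLK + fine */
static uint32_t join_pi(struct pi_split s)
{
	return s.rdptr * HALF_CLK + (s.pi & 0x3F);
}
```

A count too large for the 4-bit pointer leaves a fine part above 0x3F, which is the condition the "error check" at the end of each setter reports.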

>> +       reg += (channel * DDRIOCCC_CH_OFFSET);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +       reg = (rank) ? (ECCB1DLLPICODER2) : (ECCB1DLLPICODER2);
>> +       reg += (channel * DDRIOCCC_CH_OFFSET);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +       reg = (rank) ? (ECCB1DLLPICODER3) : (ECCB1DLLPICODER3);
>> +       reg += (channel * DDRIOCCC_CH_OFFSET);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /*
>> +        * DEADBAND
>> +        * CCCFGREG1[11:08] (+1 select)
>> +        * CCCFGREG1[03:00] (enable)
>> +        */
>> +       reg = CCCFGREG1 + (channel * DDRIOCCC_CH_OFFSET);
>> +       msk = 0x00;
>> +       temp = 0x00;
>> +
>> +       /* enable */
>> +       msk |= (BIT3 | BIT2 | BIT1 | BIT0);
>> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
>> +               temp |= msk;
>> +
>> +       /* select */
>> +       msk |= (BIT11 | BIT10 | BIT9 | BIT8);
>> +       if (pi_count < EARLY_DB)
>> +               temp |= msk;
>> +
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* error check */
>> +       if (pi_count > 0x3F)
>> +               mrc_post_code(0xEE, 0xE5);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will return the amount of WCLK delay on the given
>> + * channel, rank as an absolute PI count.
>> + */
>> +uint32_t get_wclk(uint8_t channel, uint8_t rank)
>> +{
>> +       uint32_t reg;
>> +       uint32_t temp;
>> +       uint32_t pi_count;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * CCPTRREG[15:12] -> CLK1 (0x0-0xF)
>> +        * CCPTRREG[11:08] -> CLK0 (0x0-0xF)
>> +        */
>> +       reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= (rank) ? (12) : (8);
>> +       temp &= 0xF;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count = temp * HALF_CLK;
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * ECCB1DLLPICODER0[13:08] -> CLK0 (0x00-0x3F)
>> +        * ECCB1DLLPICODER0[21:16] -> CLK1 (0x00-0x3F)
>> +        */
>> +       reg = (rank) ? (ECCB1DLLPICODER0) : (ECCB1DLLPICODER0);
>> +       reg += (channel * DDRIOCCC_CH_OFFSET);
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= (rank) ? (16) : (8);
>> +       temp &= 0x3F;
>> +
>> +       pi_count += temp;
>> +
>> +       LEAVEFN();
>> +
>> +       return pi_count;
>> +}
>> +
>> +/*
>> + * This function will program the WCTL delays based on an absolute
>> + * number of PIs.
>> + *
>> + * (currently doesn't comprehend rank)
>> + */
>> +void set_wctl(uint8_t channel, uint8_t rank, uint32_t pi_count)
>> +{
>> +       uint32_t reg;
>> +       uint32_t msk;
>> +       uint32_t temp;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * CCPTRREG[31:28] (0x0-0xF)
>> +        * CCPTRREG[27:24] (0x0-0xF)
>> +        */
>> +       reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
>> +       msk = (BIT31 | BIT30 | BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24);
>> +       temp = ((pi_count / HALF_CLK) << 28) | ((pi_count / HALF_CLK) << 24);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK;
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
>> +        * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
>> +        */
>> +       reg = ECCB1DLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET);
>> +       msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24);
>> +       temp = (pi_count << 24);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +       reg = ECCB1DLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +       reg = ECCB1DLLPICODER2 + (channel * DDRIOCCC_CH_OFFSET);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +       reg = ECCB1DLLPICODER3 + (channel * DDRIOCCC_CH_OFFSET);
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /*
>> +        * DEADBAND
>> +        * CCCFGREG1[13:12] (+1 select)
>> +        * CCCFGREG1[05:04] (enable)
>> +        */
>> +       reg = CCCFGREG1 + (channel * DDRIOCCC_CH_OFFSET);
>> +       msk = 0x00;
>> +       temp = 0x00;
>> +
>> +       /* enable */
>> +       msk |= (BIT5 | BIT4);
>> +       if ((pi_count < EARLY_DB) || (pi_count > LATE_DB))
>> +               temp |= msk;
>> +
>> +       /* select */
>> +       msk |= (BIT13 | BIT12);
>> +       if (pi_count < EARLY_DB)
>> +               temp |= msk;
>> +
>> +       mrc_alt_write_mask(DDRPHY, reg, temp, msk);
>> +
>> +       /* error check */
>> +       if (pi_count > 0x3F)
>> +               mrc_post_code(0xEE, 0xE6);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will return the amount of WCTL delay on the given
>> + * channel, rank as an absolute PI count.
>> + *
>> + * (currently doesn't comprehend rank)
>> + */
>> +uint32_t get_wctl(uint8_t channel, uint8_t rank)
>> +{
>> +       uint32_t reg;
>> +       uint32_t temp;
>> +       uint32_t pi_count;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * RDPTR (1/2 MCLK, 64 PIs)
>> +        * CCPTRREG[31:28] (0x0-0xF)
>> +        * CCPTRREG[27:24] (0x0-0xF)
>> +        */
>> +       reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET);
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= 24;
>> +       temp &= 0xF;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count = temp * HALF_CLK;
>> +
>> +       /*
>> +        * PI (1/64 MCLK, 1 PIs)
>> +        * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
>> +        * ECCB1DLLPICODER?[29:24] (0x00-0x3F)
>> +        */
>> +       reg = ECCB1DLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET);
>> +       temp = msg_port_alt_read(DDRPHY, reg);
>> +       temp >>= 24;
>> +       temp &= 0x3F;
>> +
>> +       /* Adjust PI_COUNT */
>> +       pi_count += temp;
>> +
>> +       LEAVEFN();
>> +
>> +       return pi_count;
>> +}
>> +
>> +/*
>> + * This function will program the internal Vref setting in a given
>> + * byte lane in a given channel.
>> + */
>> +void set_vref(uint8_t channel, uint8_t byte_lane, uint32_t setting)
>> +{
>> +       uint32_t reg = (byte_lane & 0x1) ? (B1VREFCTL) : (B0VREFCTL);
>> +
>> +       ENTERFN();
>> +
>> +       DPF(D_TRN, "Vref ch%d ln%d : val=%03X\n",
>> +           channel, byte_lane, setting);
>> +
>> +       mrc_alt_write_mask(DDRPHY, (reg + (channel * DDRIODQ_CH_OFFSET) +
>> +               ((byte_lane >> 1) * DDRIODQ_BL_OFFSET)),
>> +               (vref_codes[setting] << 2),
>> +               (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2));
>> +
>> +       /*
>> +        * need to wait ~300ns for Vref to settle
>> +        * (check that this is necessary)
>> +        */
>> +       delay_n(300);
>> +
>> +       /* ??? may need to clear pointers ??? */
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will return the internal Vref setting for the given
>> + * channel, byte_lane.
>> + */
>> +uint32_t get_vref(uint8_t channel, uint8_t byte_lane)
>> +{
>> +       uint8_t j;
>> +       uint32_t ret_val = sizeof(vref_codes) / 2;
>> +       uint32_t reg = (byte_lane & 0x1) ? (B1VREFCTL) : (B0VREFCTL);
>> +       uint32_t temp;
>> +
>> +       ENTERFN();
>> +
>> +       temp = msg_port_alt_read(DDRPHY, (reg + (channel * DDRIODQ_CH_OFFSET) +
>> +               ((byte_lane >> 1) * DDRIODQ_BL_OFFSET)));
>> +       temp >>= 2;
>> +       temp &= 0x3F;
>> +
>> +       for (j = 0; j < sizeof(vref_codes); j++) {
>> +               if (vref_codes[j] == temp) {
>> +                       ret_val = j;
>> +                       break;
>> +               }
>> +       }
>> +
>> +       LEAVEFN();
>> +
>> +       return ret_val;
>> +}
>> +
>> +/*
>> + * This function will return a 32 bit address in the desired
>
> 32-bit

Fixed
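
While looking at get_vref() above: the loop bound sizeof(vref_codes) only equals the entry count if the entries are one byte wide. Here is a sketch of the same reverse lookup with an explicit element count. The table contents, ARRAY_SIZE as defined locally, and demo_vref_index() are made up for illustration and do not reflect the real vref_codes[] values.

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

/* Made-up 8-entry table for illustration; the real vref_codes[] differs */
static const uint8_t demo_vref_codes[] = {
	0x3F, 0x3E, 0x3D, 0x3C, 0x00, 0x01, 0x02, 0x03
};

/* Return the index whose code matches, or the midpoint if none does */
static uint32_t demo_vref_index(uint32_t code)
{
	uint32_t ret_val = ARRAY_SIZE(demo_vref_codes) / 2;
	size_t j;

	for (j = 0; j < ARRAY_SIZE(demo_vref_codes); j++) {
		if (demo_vref_codes[j] == code)
			return (uint32_t)j;
	}

	return ret_val;
}
```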

>> + * channel and rank.
>> + */
>> +uint32_t get_addr(uint8_t channel, uint8_t rank)
>> +{
>> +       uint32_t offset = 0x02000000;   /* 32MB */
>> +
>> +       /* Begin product specific code */
>> +       if (channel > 0) {
>> +               DPF(D_ERROR, "ILLEGAL CHANNEL\n");
>> +               DEAD_LOOP();
>> +       }
>> +
>> +       if (rank > 1) {
>> +               DPF(D_ERROR, "ILLEGAL RANK\n");
>> +               DEAD_LOOP();
>> +       }
>> +
>> +       /* use 256MB lowest density as per DRP == 0x0003 */
>> +       offset += rank * (256 * 1024 * 1024);
>> +
>> +       return offset;
>> +}
>> +
>> +/*
>> + * This function will sample the DQTRAINSTS registers in the given
>> + * channel/rank SAMPLE_SIZE times looking for a valid '0' or '1'.
>> + *
>> + * It will return an encoded DWORD in which each bit corresponds to
>
> DWORD?

Changed to 32-bit data
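
For what it is worth, the arithmetic in get_addr() above works out as in the sketch below: a 32MB base plus 256MB per rank. demo_addr() is a hypothetical stand-in that leaves out the channel and rank sanity checks.

```c
#include <stdint.h>

/* Mirrors the arithmetic in get_addr(): 32MB base plus 256MB per rank */
static uint32_t demo_addr(uint8_t rank)
{
	uint32_t offset = 0x02000000;	/* 32MB */

	offset += (uint32_t)rank * (256u * 1024 * 1024);

	return offset;
}
```

So rank 0 is tested at 0x02000000 and rank 1 at 0x12000000.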

>> + * the sampled value on the byte lane.
>> + */
>> +uint32_t sample_dqs(struct mrc_params *mrc_params, uint8_t channel,
>> +                   uint8_t rank, bool rcvn)
>> +{
>> +       uint8_t j;      /* just a counter */
>> +       uint8_t bl;     /* which BL in the module (always 2 per module) */
>> +       uint8_t bl_grp; /* which BL module */
>> +       /* byte lane divisor */
>
> Maybe rename the variable so you can drop the comment?

That does not help much. Unchanged.

>> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
>> +       uint32_t msk[2];        /* BLx in module */
>> +       /* DQTRAINSTS register contents for each sample */
>> +       uint32_t sampled_val[SAMPLE_SIZE];
>> +       uint32_t num_0s;        /* tracks the number of '0' samples */
>> +       uint32_t num_1s;        /* tracks the number of '1' samples */
>> +       uint32_t ret_val = 0x00;        /* assume all '0' samples */
>> +       uint32_t address = get_addr(channel, rank);
>> +
>> +       /* initialise msk[] */
>> +       msk[0] = (rcvn) ? (BIT1) : (BIT9);      /* BL0 */
>> +       msk[1] = (rcvn) ? (BIT0) : (BIT8);      /* BL1 */
>> +
>> +       /* cycle through each byte lane group */
>> +       for (bl_grp = 0; bl_grp < (NUM_BYTE_LANES / bl_divisor) / 2; bl_grp++) {
>> +               /* take SAMPLE_SIZE samples */
>> +               for (j = 0; j < SAMPLE_SIZE; j++) {
>> +                       hte_mem_op(address, mrc_params->first_run,
>> +                                  rcvn ? 0 : 1);
>> +                       mrc_params->first_run = 0;
>> +
>> +                       /*
>> +                        * record the contents of the proper
>> +                        * DQTRAINSTS register
>> +                        */
>> +                       sampled_val[j] = msg_port_alt_read(DDRPHY,
>> +                               (DQTRAINSTS +
>> +                               (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                               (channel * DDRIODQ_CH_OFFSET)));
>> +               }
>> +
>> +               /*
>> +                * look for a majority value (SAMPLE_SIZE / 2) + 1
>> +                * on the byte lane and set that value in the corresponding
>> +                * ret_val bit
>> +                */
>> +               for (bl = 0; bl < 2; bl++) {
>> +                       num_0s = 0x00;  /* reset '0' tracker for byte lane */
>> +                       num_1s = 0x00;  /* reset '1' tracker for byte lane */
>> +                       for (j = 0; j < SAMPLE_SIZE; j++) {
>> +                               if (sampled_val[j] & msk[bl])
>> +                                       num_1s++;
>> +                               else
>> +                                       num_0s++;
>> +                       }
>> +                       if (num_1s > num_0s)
>> +                               ret_val |= (1 << (bl + (bl_grp * 2)));
>> +               }
>> +       }
>> +
>> +       /*
>> +        * "ret_val.0" contains the status of BL0
>> +        * "ret_val.1" contains the status of BL1
>> +        * "ret_val.2" contains the status of BL2
>> +        * etc.
>
> This comment should go in @return in the function comment.

Actually I failed to understand what it really means, so I left it unchanged :(
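
My tentative reading of the code itself, which may be incomplete: bit N of ret_val carries the majority-voted sample for byte lane N, with each byte lane group contributing two bits. A standalone sketch of that vote follows. DEMO_SAMPLES and demo_vote() are hypothetical names, and the sample count is a placeholder for SAMPLE_SIZE.

```c
#include <stdint.h>

#define DEMO_SAMPLES	5	/* hypothetical; the patch uses SAMPLE_SIZE */

/*
 * Majority-vote one byte lane group and fold the two results into
 * ret_val at bit positions (bl_grp * 2) and (bl_grp * 2) + 1.
 */
static uint32_t demo_vote(const uint32_t sampled[DEMO_SAMPLES],
			  const uint32_t msk[2], uint8_t bl_grp,
			  uint32_t ret_val)
{
	uint8_t bl, j;

	for (bl = 0; bl < 2; bl++) {
		uint32_t num_1s = 0;

		for (j = 0; j < DEMO_SAMPLES; j++) {
			if (sampled[j] & msk[bl])
				num_1s++;
		}

		/* more '1' than '0' samples: report the lane as '1' */
		if (num_1s > DEMO_SAMPLES - num_1s)
			ret_val |= 1u << (bl + bl_grp * 2);
	}

	return ret_val;
}
```

With msk[] set up as in the RCVN case (BIT1 for BL0, BIT0 for BL1), group 0 fills bits 0 and 1 and group 1 fills bits 2 and 3.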

>> +        */
>> +       return ret_val;
>> +}
>> +
>> +/* This function will find the rising edge transition on RCVN or WDQS */
>> +void find_rising_edge(struct mrc_params *mrc_params, uint32_t delay[],
>> +                     uint8_t channel, uint8_t rank, bool rcvn)
>> +{
>> +       bool all_edges_found;   /* determines stop condition */
>> +       bool direction[NUM_BYTE_LANES]; /* direction indicator */
>> +       uint8_t sample; /* sample counter */
>> +       uint8_t bl;     /* byte lane counter */
>> +       /* byte lane divisor */
>> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
>> +       uint32_t sample_result[SAMPLE_CNT];     /* results of sample_dqs() */
>> +       uint32_t temp;
>> +       uint32_t transition_pattern;
>> +
>> +       ENTERFN();
>> +
>> +       /* select hte and request initial configuration */
>> +       select_hte();
>> +       mrc_params->first_run = 1;
>> +
>> +       /* Take 3 sample points (T1,T2,T3) to obtain a transition pattern */
>> +       for (sample = 0; sample < SAMPLE_CNT; sample++) {
>> +               /* program the desired delays for sample */
>> +               for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                       /* increase sample delay by 26 PI (0.2 CLK) */
>> +                       if (rcvn) {
>> +                               set_rcvn(channel, rank, bl,
>> +                                        delay[bl] + (sample * SAMPLE_DLY));
>> +                       } else {
>> +                               set_wdqs(channel, rank, bl,
>> +                                        delay[bl] + (sample * SAMPLE_DLY));
>> +                       }
>> +               }
>> +
>> +               /* take samples (Tsample_i) */
>> +               sample_result[sample] = sample_dqs(mrc_params,
>> +                       channel, rank, rcvn);
>> +
>> +               DPF(D_TRN,
>> +                   "Find rising edge %s ch%d rnk%d: #%d dly=%d dqs=%02X\n",
>> +                   (rcvn ? "RCVN" : "WDQS"), channel, rank, sample,
>> +                   sample * SAMPLE_DLY, sample_result[sample]);
>> +       }
>> +
>> +       /*
>> +        * This pattern will help determine where we landed and ultimately
>> +        * how to place RCVEN/WDQS.
>> +        */
>> +       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +               /* build transition_pattern (MSB is 1st sample) */
>> +               transition_pattern = 0;
>> +               for (sample = 0; sample < SAMPLE_CNT; sample++) {
>> +                       transition_pattern |=
>> +                               ((sample_result[sample] & (1 << bl)) >> bl) <<
>> +                               (SAMPLE_CNT - 1 - sample);
>> +               }
>> +
>> +               DPF(D_TRN, "=== transition pattern %d\n", transition_pattern);
>> +
>> +               /*
>> +                * set up to look for rising edge based on
>> +                * transition_pattern
>> +                */
>> +               switch (transition_pattern) {
>> +               case 0: /* sampled 0->0->0 */
>> +                       /* move forward from T3 looking for 0->1 */
>> +                       delay[bl] += 2 * SAMPLE_DLY;
>> +                       direction[bl] = FORWARD;
>> +                       break;
>> +               case 1: /* sampled 0->0->1 */
>> +               case 5: /* sampled 1->0->1 (bad duty cycle) *HSD#237503* */
>> +                       /* move forward from T2 looking for 0->1 */
>> +                       delay[bl] += 1 * SAMPLE_DLY;
>> +                       direction[bl] = FORWARD;
>> +                       break;
>> +               case 2: /* sampled 0->1->0 (bad duty cycle) *HSD#237503* */
>> +               case 3: /* sampled 0->1->1 */
>> +                       /* move forward from T1 looking for 0->1 */
>> +                       delay[bl] += 0 * SAMPLE_DLY;
>> +                       direction[bl] = FORWARD;
>> +                       break;
>> +               case 4: /* sampled 1->0->0 (assumes BL8, HSD#234975) */
>> +                       /* move forward from T3 looking for 0->1 */
>> +                       delay[bl] += 2 * SAMPLE_DLY;
>> +                       direction[bl] = FORWARD;
>> +                       break;
>> +               case 6: /* sampled 1->1->0 */
>> +               case 7: /* sampled 1->1->1 */
>> +                       /* move backward from T1 looking for 1->0 */
>> +                       delay[bl] += 0 * SAMPLE_DLY;
>> +                       direction[bl] = BACKWARD;
>> +                       break;
>> +               default:
>> +                       mrc_post_code(0xEE, 0xEE);
>> +                       break;
>> +               }
>> +
>> +               /* program delays */
>> +               if (rcvn)
>> +                       set_rcvn(channel, rank, bl, delay[bl]);
>> +               else
>> +                       set_wdqs(channel, rank, bl, delay[bl]);
>> +       }
>> +
>> +       /*
>> +        * Based on the observed transition pattern on the byte lane,
>> +        * begin looking for a rising edge with single PI granularity.
>> +        */
>> +       do {
>> +               all_edges_found = true; /* assume all byte lanes passed */
>> +               /* take a sample */
>> +               temp = sample_dqs(mrc_params, channel, rank, rcvn);
>> +               /* check each byte lane for proper edge */
>> +               for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                       if (temp & (1 << bl)) {
>> +                               /* sampled "1" */
>> +                               if (direction[bl] == BACKWARD) {
>> +                                       /*
>> +                                        * keep looking for edge
>> +                                        * on this byte lane
>> +                                        */
>> +                                       all_edges_found = false;
>> +                                       delay[bl] -= 1;
>> +                                       if (rcvn) {
>> +                                               set_rcvn(channel, rank,
>> +                                                        bl, delay[bl]);
>> +                                       } else {
>> +                                               set_wdqs(channel, rank,
>> +                                                        bl, delay[bl]);
>> +                                       }
>> +                               }
>> +                       } else {
>> +                               /* sampled "0" */
>> +                               if (direction[bl] == FORWARD) {
>> +                                       /*
>> +                                        * keep looking for edge
>> +                                        * on this byte lane
>> +                                        */
>> +                                       all_edges_found = false;
>> +                                       delay[bl] += 1;
>> +                                       if (rcvn) {
>> +                                               set_rcvn(channel, rank,
>> +                                                        bl, delay[bl]);
>> +                                       } else {
>> +                                               set_wdqs(channel, rank,
>> +                                                        bl, delay[bl]);
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       } while (!all_edges_found);
>> +
>> +       /* restore DDR idle state */
>> +       dram_init_command(DCMD_PREA(rank));
>> +
>> +       DPF(D_TRN, "Delay %03X %03X %03X %03X\n",
>> +           delay[0], delay[1], delay[2], delay[3]);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will return a 32 bit mask that will be used to
>> + * check for byte lane failures.
>> + */
>> +uint32_t byte_lane_mask(struct mrc_params *mrc_params)
>> +{
>> +       uint32_t j;
>> +       uint32_t ret_val = 0x00;
>> +
>> +       /*
>> +        * set ret_val based on NUM_BYTE_LANES such that you will check
>> +        * only BL0 in result
>> +        *
>> +        * (each bit in result represents a byte lane)
>> +        */
>> +       for (j = 0; j < MAX_BYTE_LANES; j += NUM_BYTE_LANES)
>> +               ret_val |= (1 << ((j / NUM_BYTE_LANES) * NUM_BYTE_LANES));
>> +
>> +       /*
>> +        * HSD#235037
>> +        * need to adjust the mask for 16-bit mode
>> +        */
>> +       if (mrc_params->channel_width == X16)
>> +               ret_val |= (ret_val << 2);
>> +
>> +       return ret_val;
>> +}
>> +
>> +/*
>> + * Check memory executing simple write/read/verify at the specified address.
>> + *
>> + * Bits in the result indicate failure on specific byte lane.
>> + */
>> +uint32_t check_rw_coarse(struct mrc_params *mrc_params, uint32_t address)
>> +{
>> +       uint32_t result = 0;
>> +       uint8_t first_run = 0;
>> +
>> +       if (mrc_params->hte_setup) {
>> +               mrc_params->hte_setup = 0;
>> +               first_run = 1;
>> +               select_hte();
>> +       }
>> +
>> +       result = hte_basic_write_read(mrc_params, address,
>> +                                     first_run, WRITE_TRAIN);
>
> reformat to 80cols

Fixed.

>> +
>> +       DPF(D_TRN, "check_rw_coarse result is %x\n", result);
>> +
>> +       return result;
>> +}
>> +
>> +/*
>> + * Check memory executing write/read/verify of many data patterns
>> + * at the specified address. Bits in the result indicate failure
>> + * on specific byte lane.
>> + */
>> +uint32_t check_bls_ex(struct mrc_params *mrc_params, uint32_t address)
>> +{
>> +       uint32_t result;
>> +       uint8_t first_run = 0;
>> +
>> +       if (mrc_params->hte_setup) {
>> +               mrc_params->hte_setup = 0;
>> +               first_run = 1;
>> +               select_hte();
>> +       }
>> +
>> +       result = hte_write_stress_bit_lanes(mrc_params, address, first_run);
>> +
>> +       DPF(D_TRN, "check_bls_ex result is %x\n", result);
>> +
>> +       return result;
>> +}
>> +
>> +/*
>> + * 32-bit LFSR with characteristic polynomial: X^32 + X^22 +X^2 + X^1
>> + *
>> + * The function takes pointer to previous 32 bit value and
>> + * modifies it to next value.
>> + */
>> +void lfsr32(uint32_t *lfsr_ptr)
>> +{
>> +       uint32_t bit;
>> +       uint32_t lfsr;
>> +       int i;
>> +
>> +       lfsr = *lfsr_ptr;
>> +
>> +       for (i = 0; i < 32; i++) {
>> +               bit = 1 ^ (lfsr & BIT0);
>> +               bit = bit ^ ((lfsr & BIT1) >> 1);
>> +               bit = bit ^ ((lfsr & BIT2) >> 2);
>> +               bit = bit ^ ((lfsr & BIT22) >> 22);
>> +
>> +               lfsr = ((lfsr >> 1) | (bit << 31));
>> +       }
>> +
>> +       *lfsr_ptr = lfsr;
>> +}
>> +
>> +/* Clear the pointers in a given byte lane in a given channel */
>> +void clear_pointers(void)
>> +{
>> +       uint8_t channel;
>> +       uint8_t bl;
>> +
>> +       ENTERFN();
>> +
>> +       for (channel = 0; channel < NUM_CHANNELS; channel++) {
>> +               for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                                          (B01PTRCTL1 +
>> +                                          (channel * DDRIODQ_CH_OFFSET) +
>> +                                          ((bl >> 1) * DDRIODQ_BL_OFFSET)),
>> +                                          ~BIT8, BIT8);
>> +
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                                          (B01PTRCTL1 +
>> +                                          (channel * DDRIODQ_CH_OFFSET) +
>> +                                          ((bl >> 1) * DDRIODQ_BL_OFFSET)),
>> +                                          BIT8, BIT8);
>> +               }
>> +       }
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +void print_timings(struct mrc_params *mrc_params)
>> +{
>> +       uint8_t algo;
>> +       uint8_t channel;
>> +       uint8_t rank;
>> +       uint8_t bl;
>> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
>> +
>> +       DPF(D_INFO, "\n---------------------------");
>> +       DPF(D_INFO, "\nALGO[CH:RK] BL0 BL1 BL2 BL3");
>> +       DPF(D_INFO, "\n===========================");
>> +
>> +       for (algo = 0; algo < MAX_ALGOS; algo++) {
>> +               for (channel = 0; channel < NUM_CHANNELS; channel++) {
>> +                       if (mrc_params->channel_enables & (1 << channel)) {
>> +                               for (rank = 0; rank < NUM_RANKS; rank++) {
>
> Can we put this block in its own function to fix the over-indenting?

Fixed.
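
For illustration, a minimal sketch of one way such a split could look. It
is not the actual v2 code. It assumes a single stand-in accessor,
get_timing(), in place of the real get_rcvn()/get_wdqs()/... helpers
(which take different argument lists), and the helper name
format_timing_row() is hypothetical. The two switch statements collapse
into a name table, and the enable checks become early 'continue's:

```c
#include <stdint.h>
#include <stdio.h>

enum { RCVN, WDQS, WDQX, RDQS, VREF, WCMD, WCTL, WCLK, MAX_ALGOS };

#define NUM_CHANNELS	1
#define NUM_RANKS	2

/* Assumption: stand-in for the per-algorithm get_*() accessors */
static uint32_t get_timing(uint8_t algo, uint8_t ch, uint8_t rk, uint8_t bl)
{
	return algo * 100 + ch * 40 + rk * 10 + bl;
}

/* Row labels, in the same order as the enum above */
static const char *const algo_name[MAX_ALGOS] = {
	"RCVN", "WDQS", "WDQx", "RDQS", "VREF", "WCMD", "WCTL", "WCLK",
};

/* Format one "ALGO[CH:RK] BL0 BL1 ..." row; returns the length written */
int format_timing_row(char *buf, size_t len, uint8_t algo,
		      uint8_t ch, uint8_t rk, uint8_t num_bl)
{
	uint8_t bl;
	int n;

	n = snprintf(buf, len, "%s[%02d:%02d]", algo_name[algo], ch, rk);
	for (bl = 0; bl < num_bl; bl++) {
		/* stop if the buffer is already full */
		if (n < 0 || (size_t)n >= len)
			break;
		n += snprintf(buf + n, len - n, " %03u",
			      (unsigned int)get_timing(algo, ch, rk, bl));
	}

	return n;
}

/* Same loop nest as print_timings(), three levels shallower */
void print_timings_sketch(uint8_t channel_enables, uint8_t rank_enables,
			  uint8_t num_bl)
{
	char row[64];
	uint8_t algo, ch, rk;

	for (algo = 0; algo < MAX_ALGOS; algo++) {
		for (ch = 0; ch < NUM_CHANNELS; ch++) {
			if (!(channel_enables & (1 << ch)))
				continue;
			for (rk = 0; rk < NUM_RANKS; rk++) {
				if (!(rank_enables & (1 << rk)))
					continue;
				format_timing_row(row, sizeof(row),
						  algo, ch, rk, num_bl);
				puts(row);
			}
		}
	}
}
```

Calling print_timings_sketch(0x1, 0x3, 4) prints one row per algorithm
and enabled rank. In the real code the row would go through DPF() rather
than puts().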

>> +                                       if (mrc_params->rank_enables &
>> +                                               (1 << rank)) {
>> +                                               switch (algo) {
>> +                                               case RCVN:
>> +                                                       DPF(D_INFO,
>> +                                                           "\nRCVN[%02d:%02d]",
>> +                                                           channel, rank);
>> +                                                       break;
>> +                                               case WDQS:
>> +                                                       DPF(D_INFO,
>> +                                                           "\nWDQS[%02d:%02d]",
>> +                                                           channel, rank);
>> +                                                       break;
>> +                                               case WDQX:
>> +                                                       DPF(D_INFO,
>> +                                                           "\nWDQx[%02d:%02d]",
>> +                                                           channel, rank);
>> +                                                       break;
>> +                                               case RDQS:
>> +                                                       DPF(D_INFO,
>> +                                                           "\nRDQS[%02d:%02d]",
>> +                                                           channel, rank);
>> +                                                       break;
>> +                                               case VREF:
>> +                                                       DPF(D_INFO,
>> +                                                           "\nVREF[%02d:%02d]",
>> +                                                           channel, rank);
>> +                                                       break;
>> +                                               case WCMD:
>> +                                                       DPF(D_INFO,
>> +                                                           "\nWCMD[%02d:%02d]",
>> +                                                           channel, rank);
>> +                                                       break;
>> +                                               case WCTL:
>> +                                                       DPF(D_INFO,
>> +                                                           "\nWCTL[%02d:%02d]",
>> +                                                           channel, rank);
>> +                                                       break;
>> +                                               case WCLK:
>> +                                                       DPF(D_INFO,
>> +                                                           "\nWCLK[%02d:%02d]",
>> +                                                           channel, rank);
>> +                                                       break;
>> +                                               default:
>> +                                                       break;
>> +                                               }
>> +
>> +                                               for (bl = 0;
>> +                                                    bl < (NUM_BYTE_LANES / bl_divisor);
>> +                                                    bl++) {
>> +                                                       switch (algo) {
>> +                                                       case RCVN:
>> +                                                               DPF(D_INFO,
>> +                                                                   " %03d",
>> +                                                                   get_rcvn(channel, rank, bl));
>> +                                                               break;
>> +                                                       case WDQS:
>> +                                                               DPF(D_INFO,
>> +                                                                   " %03d",
>> +                                                                   get_wdqs(channel, rank, bl));
>> +                                                               break;
>> +                                                       case WDQX:
>> +                                                               DPF(D_INFO,
>> +                                                                   " %03d",
>> +                                                                   get_wdq(channel, rank, bl));
>> +                                                               break;
>> +                                                       case RDQS:
>> +                                                               DPF(D_INFO,
>> +                                                                   " %03d",
>> +                                                                   get_rdqs(channel, rank, bl));
>> +                                                               break;
>> +                                                       case VREF:
>> +                                                               DPF(D_INFO,
>> +                                                                   " %03d",
>> +                                                                   get_vref(channel, bl));
>> +                                                               break;
>> +                                                       case WCMD:
>> +                                                               DPF(D_INFO,
>> +                                                                   " %03d",
>> +                                                                   get_wcmd(channel));
>> +                                                               break;
>> +                                                       case WCTL:
>> +                                                               DPF(D_INFO,
>> +                                                                   " %03d",
>> +                                                                   get_wctl(channel, rank));
>> +                                                               break;
>> +                                                       case WCLK:
>> +                                                               DPF(D_INFO,
>> +                                                                   " %03d",
>> +                                                                   get_wclk(channel, rank));
>> +                                                               break;
>> +                                                       default:
>> +                                                               break;
>> +                                                       }
>> +                                               }
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +
>> +       DPF(D_INFO, "\n---------------------------");
>> +       DPF(D_INFO, "\n");
>> +}
>> diff --git a/arch/x86/cpu/quark/mrc_util.h b/arch/x86/cpu/quark/mrc_util.h
>> new file mode 100644
>> index 0000000..edbe219
>> --- /dev/null
>> +++ b/arch/x86/cpu/quark/mrc_util.h
>> @@ -0,0 +1,153 @@
>> +/*
>> + * Copyright (C) 2013, Intel Corporation
>> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
>> + *
>> + * Ported from Intel released Quark UEFI BIOS
>> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/
>> + *
>> + * SPDX-License-Identifier:    Intel
>> + */
>> +
>> +#ifndef _MRC_UTIL_H_
>> +#define _MRC_UTIL_H_
>> +
>> +/* Turn on this macro to enable MRC debugging output */
>> +#undef  MRC_DEBUG
>> +
>> +/* MRC Debug Support */
>> +#define DPF            debug_cond
>> +
>> +/* debug print type */
>> +
>> +#ifdef MRC_DEBUG
>> +#define D_ERROR                0x0001
>> +#define D_INFO         0x0002
>> +#define D_REGRD                0x0004
>> +#define D_REGWR                0x0008
>> +#define D_FCALL                0x0010
>> +#define D_TRN          0x0020
>> +#define D_TIME         0x0040
>> +#else
>> +#define D_ERROR                0
>> +#define D_INFO         0
>> +#define D_REGRD                0
>> +#define D_REGWR                0
>> +#define D_FCALL                0
>> +#define D_TRN          0
>> +#define D_TIME         0
>> +#endif
>> +
>> +#define ENTERFN(...)   debug_cond(D_FCALL, "<%s>\n", __func__)
>> +#define LEAVEFN(...)   debug_cond(D_FCALL, "</%s>\n", __func__)
>> +#define REPORTFN(...)  debug_cond(D_FCALL, "<%s/>\n", __func__)
>> +
>> +/* Generic Register Bits */
>> +#define BIT0           0x00000001
>> +#define BIT1           0x00000002
>> +#define BIT2           0x00000004
>> +#define BIT3           0x00000008
>> +#define BIT4           0x00000010
>> +#define BIT5           0x00000020
>> +#define BIT6           0x00000040
>> +#define BIT7           0x00000080
>> +#define BIT8           0x00000100
>> +#define BIT9           0x00000200
>> +#define BIT10          0x00000400
>> +#define BIT11          0x00000800
>> +#define BIT12          0x00001000
>> +#define BIT13          0x00002000
>> +#define BIT14          0x00004000
>> +#define BIT15          0x00008000
>> +#define BIT16          0x00010000
>> +#define BIT17          0x00020000
>> +#define BIT18          0x00040000
>> +#define BIT19          0x00080000
>> +#define BIT20          0x00100000
>> +#define BIT21          0x00200000
>> +#define BIT22          0x00400000
>> +#define BIT23          0x00800000
>> +#define BIT24          0x01000000
>> +#define BIT25          0x02000000
>> +#define BIT26          0x04000000
>> +#define BIT27          0x08000000
>> +#define BIT28          0x10000000
>> +#define BIT29          0x20000000
>> +#define BIT30          0x40000000
>> +#define BIT31          0x80000000
>> +
>> +/* Message Bus Port */
>> +#define MEM_CTLR       0x01
>> +#define HOST_BRIDGE    0x03
>> +#define MEM_MGR                0x05
>> +#define HTE            0x11
>> +#define DDRPHY         0x12
>> +
>> +/* number of sample points */
>> +#define SAMPLE_CNT     3
>> +/* number of PIs to increment per sample */
>> +#define SAMPLE_DLY     26
>> +
>> +enum {
>> +       /* indicates to decrease delays when looking for edge */
>> +       BACKWARD,
>> +       /* indicates to increase delays when looking for edge */
>> +       FORWARD
>> +};
>> +
>> +enum {
>> +       RCVN,
>> +       WDQS,
>> +       WDQX,
>> +       RDQS,
>> +       VREF,
>> +       WCMD,
>> +       WCTL,
>> +       WCLK,
>> +       MAX_ALGOS,
>> +};
>> +
>> +void mrc_write_mask(u32 unit, u32 addr, u32 data, u32 mask);
>> +void mrc_alt_write_mask(u32 unit, u32 addr, u32 data, u32 mask);
>> +void mrc_post_code(uint8_t major, uint8_t minor);
>> +void delay_n(uint32_t ns);
>> +void delay_u(uint32_t ms);
>> +void select_mem_mgr(void);
>> +void select_hte(void);
>> +void dram_init_command(uint32_t data);
>> +void dram_wake_command(void);
>> +void training_message(uint8_t channel, uint8_t rank, uint8_t byte_lane);
>> +
>> +void set_rcvn(uint8_t channel, uint8_t rank,
>> +             uint8_t byte_lane, uint32_t pi_count);
>> +uint32_t get_rcvn(uint8_t channel, uint8_t rank, uint8_t byte_lane);
>> +void set_rdqs(uint8_t channel, uint8_t rank,
>> +             uint8_t byte_lane, uint32_t pi_count);
>> +uint32_t get_rdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane);
>> +void set_wdqs(uint8_t channel, uint8_t rank,
>> +             uint8_t byte_lane, uint32_t pi_count);
>> +uint32_t get_wdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane);
>> +void set_wdq(uint8_t channel, uint8_t rank,
>> +            uint8_t byte_lane, uint32_t pi_count);
>> +uint32_t get_wdq(uint8_t channel, uint8_t rank, uint8_t byte_lane);
>> +void set_wcmd(uint8_t channel, uint32_t pi_count);
>> +uint32_t get_wcmd(uint8_t channel);
>> +void set_wclk(uint8_t channel, uint8_t rank, uint32_t pi_count);
>> +uint32_t get_wclk(uint8_t channel, uint8_t rank);
>> +void set_wctl(uint8_t channel, uint8_t rank, uint32_t pi_count);
>> +uint32_t get_wctl(uint8_t channel, uint8_t rank);
>> +void set_vref(uint8_t channel, uint8_t byte_lane, uint32_t setting);
>> +uint32_t get_vref(uint8_t channel, uint8_t byte_lane);
>> +
>> +uint32_t get_addr(uint8_t channel, uint8_t rank);
>> +uint32_t sample_dqs(struct mrc_params *mrc_params, uint8_t channel,
>> +                   uint8_t rank, bool rcvn);
>> +void find_rising_edge(struct mrc_params *mrc_params, uint32_t delay[],
>> +                     uint8_t channel, uint8_t rank, bool rcvn);
>> +uint32_t byte_lane_mask(struct mrc_params *mrc_params);
>> +uint32_t check_rw_coarse(struct mrc_params *mrc_params, uint32_t address);
>> +uint32_t check_bls_ex(struct mrc_params *mrc_params, uint32_t address);
>> +void lfsr32(uint32_t *lfsr_ptr);
>> +void clear_pointers(void);
>> +void print_timings(struct mrc_params *mrc_params);
>
> If these are all truly exported, can we please put the function
> comments here in the header file?

No, they are only used internally by MRC.

>> +
>> +#endif /* _MRC_UTIL_H_ */
>> --
>> 1.8.2.1
>>
>
> Regards,
> Simon

Regards,
Bin

* [U-Boot] [RFC PATCH 5/9] x86: quark: Add System Memory Controller support
  2015-02-04 16:24   ` Simon Glass
@ 2015-02-05 15:17     ` Bin Meng
  0 siblings, 0 replies; 29+ messages in thread
From: Bin Meng @ 2015-02-05 15:17 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Thu, Feb 5, 2015 at 12:24 AM, Simon Glass <sjg@chromium.org> wrote:
> Hi Bin,
>
> On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
>> This code performs the actual memory initialization.
>>
>> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>>
>> ---
>> The ugliest code I've ever seen ...
>> There are 252 warnings and 127 checks in this patch, which are:
>>
>> check: arch/x86/cpu/quark/smc.c,1609: Alignment should match open parenthesis
>> warning: arch/x86/cpu/quark/smc.c,1610: line over 80 characters
>> warning: arch/x86/cpu/quark/smc.c,1633: Too many leading tabs - consider code refactoring
>> ...
>>
>> Fixing 'Too many leading tabs ...' would be very dangerous, as I don't have
>> all the details on how Intel's MRC code is written to interact with the
>> hardware. Refactoring it may lead to non-working MRC code. The 'line over
>> 80 characters' warnings have to stay as they are for now, because they
>> follow from the 'Too many leading tabs ...' ones. If I try to fix the
>> 'Alignment should match open parenthesis' checks, I may end up adding more
>> 'line over 80 characters' warnings, so we have to bear with it. Sigh.
>
> Understood. Will try to limit my comments.
>
>>
>>  arch/x86/cpu/quark/smc.c | 2764 ++++++++++++++++++++++++++++++++++++++++++++++
>>  arch/x86/cpu/quark/smc.h |  446 ++++++++
>>  2 files changed, 3210 insertions(+)
>>  create mode 100644 arch/x86/cpu/quark/smc.c
>>  create mode 100644 arch/x86/cpu/quark/smc.h
>>
>> diff --git a/arch/x86/cpu/quark/smc.c b/arch/x86/cpu/quark/smc.c
>> new file mode 100644
>> index 0000000..fb389cd
>> --- /dev/null
>> +++ b/arch/x86/cpu/quark/smc.c
>> @@ -0,0 +1,2764 @@
>> +/*
>> + * Copyright (C) 2013, Intel Corporation
>> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
>> + *
>> + * Ported from Intel released Quark UEFI BIOS
>> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/

Removed the ending / in v2.

>> + *
>> + * SPDX-License-Identifier:    Intel
>> + */
>> +
>> +#include <common.h>
>> +#include <pci.h>
>> +#include <asm/arch/device.h>
>> +#include <asm/arch/mrc.h>
>> +#include <asm/arch/msg_port.h>
>> +#include "mrc_util.h"
>> +#include "hte.h"
>> +#include "smc.h"
>> +
>> +/* t_rfc values (in picoseconds) per density */
>> +static const uint32_t t_rfc[5] = {
>> +       90000,  /* 512Mb */
>> +       110000, /* 1Gb */
>> +       160000, /* 2Gb */
>> +       300000, /* 4Gb */
>> +       350000, /* 8Gb */
>> +};
>> +
>> +/* t_ck clock period in picoseconds per speed index 800, 1066, 1333 */
>> +static const uint32_t t_ck[3] = {
>> +       2500,
>> +       1875,
>> +       1500
>> +};
>> +
>> +/* Global variables */
>> +static const uint16_t ddr_wclk[] = {193, 158};
>> +static const uint16_t ddr_wctl[] = {1, 217};
>> +static const uint16_t ddr_wcmd[] = {1, 220};
>> +
>> +#ifdef BACKUP_RCVN
>> +static const uint16_t ddr_rcvn[] = {129, 498};
>> +#endif
>> +
>> +#ifdef BACKUP_WDQS
>> +static const uint16_t ddr_wdqs[] = {65, 289};
>> +#endif
>> +
>> +#ifdef BACKUP_RDQS
>> +static const uint8_t ddr_rdqs[] = {32, 24};
>> +#endif
>> +
>> +#ifdef BACKUP_WDQ
>> +static const uint16_t ddr_wdq[] = {32, 257};
>> +#endif
>> +
>> +/* Stop self refresh driven by MCU */
>> +void clear_self_refresh(struct mrc_params *mrc_params)
>> +{
>> +       ENTERFN();
>> +
>> +       /* clear the PMSTS Channel Self Refresh bits */
>> +       mrc_write_mask(MEM_CTLR, PMSTS, BIT0, BIT0);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/* It will initialise timing registers in the MCU (DTR0..DTR4) */
>> +void prog_ddr_timing_control(struct mrc_params *mrc_params)
>> +{
>> +       uint8_t tcl, wl;
>> +       uint8_t trp, trcd, tras, twr, twtr, trrd, trtp, tfaw;
>> +       uint32_t tck;
>> +       u32 dtr0, dtr1, dtr2, dtr3, dtr4;
>> +       u32 tmp1, tmp2;
>> +
>> +       ENTERFN();
>> +
>> +       /* mcu_init starts */
>> +       mrc_post_code(0x02, 0x00);
>> +
>> +       dtr0 = msg_port_read(MEM_CTLR, DTR0);
>> +       dtr1 = msg_port_read(MEM_CTLR, DTR1);
>> +       dtr2 = msg_port_read(MEM_CTLR, DTR2);
>> +       dtr3 = msg_port_read(MEM_CTLR, DTR3);
>> +       dtr4 = msg_port_read(MEM_CTLR, DTR4);
>> +
>> +       tck = t_ck[mrc_params->ddr_speed];      /* Clock in picoseconds */
>> +       tcl = mrc_params->params.cl;            /* CAS latency in clocks */
>> +       trp = tcl;      /* Per CAT MRC */
>> +       trcd = tcl;     /* Per CAT MRC */
>> +       tras = MCEIL(mrc_params->params.ras, tck);
>> +
>> +       /* Per JEDEC: tWR=15000ps DDR2/3 from 800-1600 */
>> +       twr = MCEIL(15000, tck);
>> +
>> +       twtr = MCEIL(mrc_params->params.wtr, tck);
>> +       trrd = MCEIL(mrc_params->params.rrd, tck);
>> +       trtp = 4;       /* Valid for 800 and 1066, use 5 for 1333 */
>> +       tfaw = MCEIL(mrc_params->params.faw, tck);
>> +
>> +       wl = 5 + mrc_params->ddr_speed;
>> +
>> +       dtr0 &= ~(BIT0 | BIT1);
>> +       dtr0 |= mrc_params->ddr_speed;
>> +       dtr0 &= ~(BIT12 | BIT13 | BIT14);
>> +       tmp1 = tcl - 5;
>> +       dtr0 |= ((tcl - 5) << 12);
>> +       dtr0 &= ~(BIT4 | BIT5 | BIT6 | BIT7);
>> +       dtr0 |= ((trp - 5) << 4);       /* 5 bit DRAM Clock */
>> +       dtr0 &= ~(BIT8 | BIT9 | BIT10 | BIT11);
>> +       dtr0 |= ((trcd - 5) << 8);      /* 5 bit DRAM Clock */
>> +
>> +       dtr1 &= ~(BIT0 | BIT1 | BIT2);
>> +       tmp2 = wl - 3;
>> +       dtr1 |= (wl - 3);
>> +       dtr1 &= ~(BIT8 | BIT9 | BIT10 | BIT11);
>> +       dtr1 |= ((wl + 4 + twr - 14) << 8);     /* Change to tWTP */
>> +       dtr1 &= ~(BIT28 | BIT29 | BIT30);
>> +       dtr1 |= ((MMAX(trtp, 4) - 3) << 28);    /* 4 bit DRAM Clock */
>> +       dtr1 &= ~(BIT24 | BIT25);
>> +       dtr1 |= ((trrd - 4) << 24);             /* 4 bit DRAM Clock */
>> +       dtr1 &= ~(BIT4 | BIT5);
>> +       dtr1 |= (1 << 4);
>> +       dtr1 &= ~(BIT20 | BIT21 | BIT22 | BIT23);
>> +       dtr1 |= ((tras - 14) << 20);            /* 6 bit DRAM Clock */
>> +       dtr1 &= ~(BIT16 | BIT17 | BIT18 | BIT19);
>> +       dtr1 |= ((((tfaw + 1) >> 1) - 5) << 16);/* 4 bit DRAM Clock */
>> +       /* Set 4 Clock CAS to CAS delay (multi-burst) */
>> +       dtr1 &= ~(BIT12 | BIT13);
>> +
>> +       dtr2 &= ~(BIT0 | BIT1 | BIT2);
>> +       dtr2 |= 1;
>> +       dtr2 &= ~(BIT8 | BIT9 | BIT10);
>> +       dtr2 |= (2 << 8);
>> +       dtr2 &= ~(BIT16 | BIT17 | BIT18 | BIT19);
>> +       dtr2 |= (2 << 16);
>> +
>> +       dtr3 &= ~(BIT0 | BIT1 | BIT2);
>> +       dtr3 |= 2;
>> +       dtr3 &= ~(BIT4 | BIT5 | BIT6);
>> +       dtr3 |= (2 << 4);
>> +
>> +       dtr3 &= ~(BIT8 | BIT9 | BIT10 | BIT11);
>> +       if (mrc_params->ddr_speed == DDRFREQ_800) {
>> +               /* Extended RW delay (+1) */
>> +               dtr3 |= ((tcl - 5 + 1) << 8);
>> +       } else if (mrc_params->ddr_speed == DDRFREQ_1066) {
>> +               /* Extended RW delay (+1) */
>> +               dtr3 |= ((tcl - 5 + 1) << 8);
>> +       }
>> +
>> +       dtr3 &= ~(BIT13 | BIT14 | BIT15 | BIT16);
>> +       dtr3 |= ((4 + wl + twtr - 11) << 13);
>> +
>> +       dtr3 &= ~(BIT22 | BIT23);
>> +       if (mrc_params->ddr_speed == DDRFREQ_800)
>> +               dtr3 |= ((MMAX(0, 1 - 1)) << 22);
>> +       else
>> +               dtr3 |= ((MMAX(0, 2 - 1)) << 22);
>> +
>> +       dtr4 &= ~(BIT0 | BIT1);
>> +       dtr4 |= 1;
>> +       dtr4 &= ~(BIT4 | BIT5 | BIT6);
>> +       dtr4 |= (1 << 4);
>> +       dtr4 &= ~(BIT8 | BIT9 | BIT10);
>> +       dtr4 |= ((1 + tmp1 - tmp2 + 2) << 8);
>> +       dtr4 &= ~(BIT12 | BIT13 | BIT14);
>> +       dtr4 |= ((1 + tmp1 - tmp2 + 2) << 12);
>> +       dtr4 &= ~(BIT15 | BIT16);
>> +
>> +       msg_port_write(MEM_CTLR, DTR0, dtr0);
>> +       msg_port_write(MEM_CTLR, DTR1, dtr1);
>> +       msg_port_write(MEM_CTLR, DTR2, dtr2);
>> +       msg_port_write(MEM_CTLR, DTR3, dtr3);
>> +       msg_port_write(MEM_CTLR, DTR4, dtr4);
>
> This bit stuff is a mess. It obscures the meaning IMO and we would be
> much better off with proper named #defines. What can we do here?

I agree it is a mess. To me, DDR memory initialization is black magic:
the hardware is complicated, so the software that drives it ends up
ugly as well :( What the MRC code programs is really a collection of
analog electronics, closer to radio oscillators than to digital
circuits, even though it appears to be a digital-domain memory
controller... These register fields are pure DDR timing numbers. I
could not find a better way to express them, so I have left this
unchanged in v2.
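
For reference, a minimal sketch of what named defines could look like
for DTR0 alone. The field names (DTR0_DFREQ, DTR0_TRP, ...) and the
helpers field_set() and dtr0_encode() are hypothetical and do not come
from Intel's code or documentation. Only the bit positions are taken
from the masks used in prog_ddr_timing_control() above:

```c
#include <stdint.h>

/* Hypothetical names; positions match the BITn masks in the patch */
#define DTR0_DFREQ_SHIFT	0	/* BIT0..BIT1: speed index */
#define DTR0_DFREQ_MASK		(0x3u << DTR0_DFREQ_SHIFT)
#define DTR0_TRP_SHIFT		4	/* BIT4..BIT7: tRP - 5 */
#define DTR0_TRP_MASK		(0xfu << DTR0_TRP_SHIFT)
#define DTR0_TRCD_SHIFT		8	/* BIT8..BIT11: tRCD - 5 */
#define DTR0_TRCD_MASK		(0xfu << DTR0_TRCD_SHIFT)
#define DTR0_TCL_SHIFT		12	/* BIT12..BIT14: tCL - 5 */
#define DTR0_TCL_MASK		(0x7u << DTR0_TCL_SHIFT)

/* Clear the field, then insert val, truncated to the field width */
static inline uint32_t field_set(uint32_t reg, uint32_t mask,
				 unsigned int shift, uint32_t val)
{
	return (reg & ~mask) | ((val << shift) & mask);
}

/* Same DTR0 updates as the patch; all other bits are preserved */
uint32_t dtr0_encode(uint32_t dtr0, uint8_t speed, uint8_t tcl,
		     uint8_t trp, uint8_t trcd)
{
	dtr0 = field_set(dtr0, DTR0_DFREQ_MASK, DTR0_DFREQ_SHIFT, speed);
	dtr0 = field_set(dtr0, DTR0_TCL_MASK, DTR0_TCL_SHIFT, tcl - 5);
	dtr0 = field_set(dtr0, DTR0_TRP_MASK, DTR0_TRP_SHIFT, trp - 5);
	dtr0 = field_set(dtr0, DTR0_TRCD_MASK, DTR0_TRCD_SHIFT, trcd - 5);

	return dtr0;
}
```

One difference from the patch: field_set() truncates a value that does
not fit its field, whereas the open-coded 'dtr0 |= val << shift' would
let it spill into the neighbouring field.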

>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/* Configure MCU before jedec init sequence */
>> +void prog_decode_before_jedec(struct mrc_params *mrc_params)
>> +{
>> +       u32 drp;
>> +       u32 drfc;
>> +       u32 dcal;
>> +       u32 dsch;
>> +       u32 dpmc0;
>> +
>> +       ENTERFN();
>> +
>> +       /* Disable power saving features */
>> +       dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
>> +       dpmc0 |= (BIT24 | BIT25);
>> +       dpmc0 &= ~(BIT16 | BIT17 | BIT18);
>> +       dpmc0 &= ~BIT23;
>> +       msg_port_write(MEM_CTLR, DPMC0, dpmc0);
>> +
>> +       /* Disable out of order transactions */
>> +       dsch = msg_port_read(MEM_CTLR, DSCH);
>> +       dsch |= (BIT8 | BIT12);
>> +       msg_port_write(MEM_CTLR, DSCH, dsch);
>> +
>> +       /* Disable issuing the REF command */
>> +       drfc = msg_port_read(MEM_CTLR, DRFC);
>> +       drfc &= ~(BIT12 | BIT13 | BIT14);
>> +       msg_port_write(MEM_CTLR, DRFC, drfc);
>> +
>> +       /* Disable ZQ calibration short */
>> +       dcal = msg_port_read(MEM_CTLR, DCAL);
>> +       dcal &= ~(BIT8 | BIT9 | BIT10);
>> +       dcal &= ~(BIT12 | BIT13);
>> +       msg_port_write(MEM_CTLR, DCAL, dcal);
>> +
>> +       /*
>> +        * Training performed in address mode 0, rank population has limited
>> +        * impact, however simulator complains if enabled non-existing rank.
>> +        */
>> +       drp = 0;
>> +       if (mrc_params->rank_enables & 1)
>> +               drp |= BIT0;
>> +       if (mrc_params->rank_enables & 2)
>> +               drp |= BIT1;
>> +       msg_port_write(MEM_CTLR, DRP, drp);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * After Cold Reset, BIOS should set COLDWAKE bit to 1 before
>> + * sending the WAKE message to the Dunit.
>> + *
>> + * For Standby Exit, or any other mode in which the DRAM is in
>> + * SR, this bit must be set to 0.
>> + */
>> +void perform_ddr_reset(struct mrc_params *mrc_params)
>> +{
>> +       ENTERFN();
>> +
>> +       /* Set COLDWAKE bit before sending the WAKE message */
>> +       mrc_write_mask(MEM_CTLR, DRMC, BIT16, BIT16);
>> +
>> +       /* Send wake command to DUNIT (MUST be done before JEDEC) */
>> +       dram_wake_command();
>> +
>> +       /* Set default value */
>> +       msg_port_write(MEM_CTLR, DRMC,
>> +                      (mrc_params->rd_odt_value == 0 ? BIT12 : 0));
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +
>> +/*
>> + * This function performs some initialization on the DDRIO unit.
>> + * This function is dependent on BOARD_ID, DDR_SPEED, and CHANNEL_ENABLES.
>> + */
>> +void ddrphy_init(struct mrc_params *mrc_params)
>> +{
>> +       uint32_t temp;
>> +       uint8_t ch;     /* channel counter */
>> +       uint8_t rk;     /* rank counter */
>> +       uint8_t bl_grp; /*  byte lane group counter (2 BLs per module) */
>> +       uint8_t bl_divisor = 1; /* byte lane divisor */
>> +       /* For DDR3 --> 0 == 800, 1 == 1066, 2 == 1333 */
>> +       uint8_t speed = mrc_params->ddr_speed & (BIT1 | BIT0);
>> +       uint8_t cas;
>> +       uint8_t cwl;
>> +
>> +       ENTERFN();
>> +
>> +       cas = mrc_params->params.cl;
>> +       cwl = 5 + mrc_params->ddr_speed;
>> +
>> +       /* ddrphy_init starts */
>> +       mrc_post_code(0x03, 0x00);
>> +
>> +       /*
>> +        * HSD#231531
>> +        * Make sure IOBUFACT is deasserted before initializing the DDR PHY
>> +        *
>> +        * HSD#234845
>> +        * Make sure WRPTRENABLE is deasserted before initializing the DDR PHY
>> +        */
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       /* Deassert DDRPHY Initialization Complete */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ~BIT20, BIT20); /* SPID_INIT_COMPLETE=0 */
>> +                       /* Deassert IOBUFACT */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ~BIT2, BIT2);   /* IOBUFACTRST_N=0 */
>> +                       /* Disable WRPTR */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDPTRREG + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ~BIT0, BIT0);   /* WRPTRENABLE=0 */
>> +               }
>> +       }
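
A note on the `~BIT20, BIT20` style arguments above, since they look odd at first sight. The sketch below assumes that mrc_alt_write_mask() is a plain read-modify-write, so that only the bits set in the mask are taken from the data argument. `apply_mask()` and the `BIT()` macro are hypothetical names for this illustration only; the real helper also performs the message port read and write.

```c
#include <stdint.h>

#define BIT(n)	(1U << (n))

/*
 * Assumed semantics of a masked register write: keep the old bits
 * outside the mask, take the new bits inside the mask from data.
 */
static uint32_t apply_mask(uint32_t old, uint32_t data, uint32_t mask)
{
	return (old & ~mask) | (data & mask);
}
```

Under that assumption, passing `~BIT20` or `0` as the data with a mask of `BIT20` gives the same result: bit 20 is cleared and every other bit is preserved.
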
>> +
>> +       /* Put PHY in reset */
>> +       mrc_alt_write_mask(DDRPHY, MASTERRSTN, 0, BIT0);
>> +
>> +       /* Initialize DQ01, DQ23, CMD, CLK-CTL, COMP modules */
>> +
>> +       /* STEP0 */
>
> Can you put each step in its own static function?
>
> for (ch = 0; ch < NUM_CHANNELS; ch++)
>     step0(ch);
> for (ch = 0; ch < NUM_CHANNELS; ch++)
>     step1(ch);
>
> etc.

I am afraid it is not that simple. We would need to pass lots of
variables to these static functions as parameters. I feel it does no
good and instead creates the possibility of breaking the MRC, so I
left it unchanged.
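
If this is revisited later, the parameter problem could probably be reduced by bundling the shared locals into one context struct that every step receives. A minimal sketch of that idea, not part of this patch: `struct ddrphy_ctx`, `ddrphy_ctx_init()`, `ddrphy_step0()`, `ddrphy_step1()` and `ddrphy_run_steps()` are hypothetical names, NUM_CHANNELS is assumed to be 1, and the step bodies only count calls instead of programming the DDR PHY.

```c
#include <stdint.h>

#define NUM_CHANNELS	1	/* assumed value for this sketch */

/* Hypothetical context holding the locals that the steps share */
struct ddrphy_ctx {
	uint32_t channel_enables;	/* bitmask, one bit per channel */
	uint8_t speed;			/* 0 == 800, 1 == 1066, 2 == 1333 */
	uint8_t cas;
	uint8_t cwl;
	unsigned int writes;		/* stand-in for register writes */
};

/* Derive the shared values the same way ddrphy_init() does */
static void ddrphy_ctx_init(struct ddrphy_ctx *ctx, uint32_t channel_enables,
			    uint8_t ddr_speed, uint8_t cl)
{
	ctx->channel_enables = channel_enables;
	ctx->speed = ddr_speed & 0x3;
	ctx->cas = cl;
	ctx->cwl = 5 + ddr_speed;
	ctx->writes = 0;
}

/* Placeholder step bodies: the real ones would program the DDR PHY */
static void ddrphy_step0(struct ddrphy_ctx *ctx, uint8_t ch)
{
	(void)ch;
	ctx->writes++;
}

static void ddrphy_step1(struct ddrphy_ctx *ctx, uint8_t ch)
{
	(void)ch;
	ctx->writes++;
}

/* Run each step over every enabled channel, return the step calls made */
static unsigned int ddrphy_run_steps(struct ddrphy_ctx *ctx)
{
	uint8_t ch;

	for (ch = 0; ch < NUM_CHANNELS; ch++)
		if (ctx->channel_enables & (1U << ch))
			ddrphy_step0(ctx, ch);
	for (ch = 0; ch < NUM_CHANNELS; ch++)
		if (ctx->channel_enables & (1U << ch))
			ddrphy_step1(ctx, ch);

	return ctx->writes;
}
```

Each step then takes two arguments regardless of how many values it needs, but the risk of changing MRC behaviour during such a refactor remains.
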

>> +       mrc_post_code(0x03, 0x10);
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       /* DQ01-DQ23 */
>> +                       for (bl_grp = 0;
>> +                            bl_grp < ((NUM_BYTE_LANES / bl_divisor)/2);
>> +                            bl_grp++) {
>> +                               /* Analog MUX select - IO2xCLKSEL */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (DQOBSCKEBBCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       ((bl_grp) ? (0x00) : (BIT22)), (BIT22));
>> +
>> +                               /* ODT Strength */
>> +                               switch (mrc_params->rd_odt_value) {
>> +                               case 1:
>> +                                       temp = 0x3;
>> +                                       break;  /* 60 ohm */
>> +                               case 2:
>> +                                       temp = 0x3;
>> +                                       break;  /* 120 ohm */
>> +                               case 3:
>> +                                       temp = 0x3;
>> +                                       break;  /* 180 ohm */
>> +                               default:
>> +                                       temp = 0x3;
>> +                                       break;  /* 120 ohm */
>> +                               }
>> +
>> +                               /* ODT strength */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B0RXIOBUFCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (temp << 5), (BIT6 | BIT5));
>> +                               /* ODT strength */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B1RXIOBUFCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (temp << 5), (BIT6 | BIT5));
>> +
>> +                               /* Dynamic ODT/DIFFAMP */
>> +                               temp = (((cas) << 24) | ((cas) << 16) |
>> +                                       ((cas) << 8) | ((cas) << 0));
>> +                               switch (speed) {
>> +                               case 0:
>> +                                       temp -= 0x01010101;
>> +                                       break;  /* 800 */
>> +                               case 1:
>> +                                       temp -= 0x02020202;
>> +                                       break;  /* 1066 */
>> +                               case 2:
>> +                                       temp -= 0x03030303;
>> +                                       break;  /* 1333 */
>> +                               case 3:
>> +                                       temp -= 0x04040404;
>> +                                       break;  /* 1600 */
>> +                               }
>> +
>> +                               /* Launch Time: ODT, DIFFAMP, ODT, DIFFAMP */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B01LATCTL1 +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       temp,
>> +                                       (BIT28 | BIT27 | BIT26 | BIT25 | BIT24 |
>> +                                       BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
>> +                                       BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
>> +                                       BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
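
To make the launch time arithmetic above easier to follow, here is a small worked sketch. It replicates CAS into all four byte fields, subtracts the per-speed offset (0x01010101 for speed 0 up to 0x04040404 for speed 3, as in the switch), and keeps the 5-bit fields selected by the mask, which works out to 0x1F1F1F1F. `launch_time()` is a hypothetical helper for illustration, and applying the mask inside it assumes mrc_alt_write_mask() only writes the masked bits.

```c
#include <stdint.h>

/* Mask of the four 5-bit fields: bits 28:24, 20:16, 12:8 and 4:0 */
#define LAUNCH_TIME_MASK	0x1F1F1F1FU

/*
 * Hypothetical helper: compute the launch time fields for a given CAS
 * latency and speed index (0 == 800 ... 3 == 1600). Each field ends up
 * as cas - (speed + 1), provided cas is larger than speed so that no
 * borrow crosses into the neighbouring field.
 */
static uint32_t launch_time(uint8_t cas, uint8_t speed)
{
	uint32_t temp = ((uint32_t)cas << 24) | ((uint32_t)cas << 16) |
			((uint32_t)cas << 8) | (uint32_t)cas;

	temp -= 0x01010101U * (speed + 1U);

	return temp & LAUNCH_TIME_MASK;
}
```

For example, a CAS latency of 6 at speed 1 gives 0x06060606 - 0x02020202, so every field holds 4.
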
>> +                               switch (speed) {
>> +                               /* HSD#234715 */
>> +                               case 0:
>> +                                       temp = ((0x06 << 16) | (0x07 << 8));
>> +                                       break;  /* 800 */
>> +                               case 1:
>> +                                       temp = ((0x07 << 16) | (0x08 << 8));
>> +                                       break;  /* 1066 */
>> +                               case 2:
>> +                                       temp = ((0x09 << 16) | (0x0A << 8));
>> +                                       break;  /* 1333 */
>> +                               case 3:
>> +                                       temp = ((0x0A << 16) | (0x0B << 8));
>> +                                       break;  /* 1600 */
>> +                               }
>> +
>> +                               /* On Duration: ODT, DIFFAMP */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B0ONDURCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       temp,
>> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
>> +                                       BIT16 | BIT13 | BIT12 | BIT11 | BIT10 |
>> +                                       BIT9 | BIT8));
>> +                               /* On Duration: ODT, DIFFAMP */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B1ONDURCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       temp,
>> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
>> +                                       BIT16 | BIT13 | BIT12 | BIT11 | BIT10 |
>> +                                       BIT9 | BIT8));
>> +
>> +                               switch (mrc_params->rd_odt_value) {
>> +                               case 0:
>> +                                       /* override DIFFAMP=on, ODT=off */
>> +                                       temp = ((0x3F << 16) | (0x3f << 10));
>> +                                       break;
>> +                               default:
>> +                                       /* override DIFFAMP=on, ODT=on */
>> +                                       temp = ((0x3F << 16) | (0x2A << 10));
>> +                                       break;
>> +                               }
>> +
>> +                               /* Override: DIFFAMP, ODT */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B0OVRCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       temp,
>> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
>> +                                       BIT16 | BIT15 | BIT14 | BIT13 | BIT12 |
>> +                                       BIT11 | BIT10));
>> +                               /* Override: DIFFAMP, ODT */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B1OVRCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       temp,
>> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
>> +                                       BIT16 | BIT15 | BIT14 | BIT13 | BIT12 |
>> +                                       BIT11 | BIT10));
>> +
>> +                               /* DLL Setup */
>> +
>> +                               /* 1xCLK Domain Timings: tEDP,RCVEN,WDQS (PO) */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B0LATCTL0 +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (((cas + 7) << 16) | ((cas - 4) << 8) |
>> +                                       ((cwl - 2) << 0)),
>> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
>> +                                       BIT16 | BIT12 | BIT11 | BIT10 | BIT9 |
>> +                                       BIT8 | BIT4 | BIT3 | BIT2 | BIT1 |
>> +                                       BIT0));
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B1LATCTL0 +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (((cas + 7) << 16) | ((cas - 4) << 8) |
>> +                                       ((cwl - 2) << 0)),
>> +                                       (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
>> +                                       BIT16 | BIT12 | BIT11 | BIT10 | BIT9 |
>> +                                       BIT8 | BIT4 | BIT3 | BIT2 | BIT1 |
>> +                                       BIT0));
>> +
>> +                               /* RCVEN Bypass (PO) */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B0RXIOBUFCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       ((0x0 << 7) | (0x0 << 0)),
>> +                                       (BIT7 | BIT0));
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B1RXIOBUFCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       ((0x0 << 7) | (0x0 << 0)),
>> +                                       (BIT7 | BIT0));
>> +
>> +                               /* TX */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (DQCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (BIT16), (BIT16));
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B01PTRCTL1 +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (BIT8), (BIT8));
>> +
>> +                               /* RX (PO) */
>> +                               /* Internal Vref Code, Enable#, Ext_or_Int (1=Ext) */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B0VREFCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       ((0x03 << 2) | (0x0 << 1) | (0x0 << 0)),
>> +                                       (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 |
>> +                                       BIT2 | BIT1 | BIT0));
>> +                               /* Internal Vref Code, Enable#, Ext_or_Int (1=Ext) */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B1VREFCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       ((0x03 << 2) | (0x0 << 1) | (0x0 << 0)),
>> +                                       (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 |
>> +                                       BIT2 | BIT1 | BIT0));
>> +                               /* Per-Bit De-Skew Enable */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B0RXIOBUFCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (0), (BIT4));
>> +                               /* Per-Bit De-Skew Enable */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B1RXIOBUFCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (0), (BIT4));
>> +                       }
>> +
>> +                       /* CLKEBB */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDOBSCKEBBCTL + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               0, (BIT23));
>> +
>> +                       /* Enable tristate control of cmd/address bus */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               0, (BIT1 | BIT0));
>> +
>> +                       /* ODT RCOMP */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDRCOMPODT + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0x03 << 5) | (0x03 << 0)),
>> +                               (BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 |
>> +                               BIT3 | BIT2 | BIT1 | BIT0));
>> +
>> +                       /* CMDPM* registers must be programmed in this order */
>> +
>> +                       /* Turn On Delays: SFR (regulator), MPLL */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDPMDLYREG4 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0xFFFFU << 16) | (0xFFFF << 0)),
>> +                               0xFFFFFFFF);
>> +                       /*
>> +                        * Delays: ASSERT_IOBUFACT_to_ALLON0_for_PM_MSG_3,
>> +                        * VREG (MDLL) Turn On, ALLON0_to_DEASSERT_IOBUFACT
>> +                        * for_PM_MSG_gt0, MDLL Turn On
>> +                        */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDPMDLYREG3 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0xFU << 28) | (0xFFF << 16) | (0xF << 12) |
>> +                               (0x616 << 0)), 0xFFFFFFFF);
>> +                       /* MPLL Divider Reset Delays */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDPMDLYREG2 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) |
>> +                               (0xFF << 0)), 0xFFFFFFFF);
>> +                       /* Turn Off Delays: VREG, Staggered MDLL, MDLL, PI */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDPMDLYREG1 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) |
>> +                               (0xFF << 0)), 0xFFFFFFFF);
>> +                       /* Turn On Delays: MPLL, Staggered MDLL, PI, IOBUFACT */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDPMDLYREG0 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) |
>> +                               (0xFF << 0)), 0xFFFFFFFF);
>> +                       /* Allow PUnit signals */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0x6 << 8) | BIT6 | (0x4 << 0)),
>> +                               (BIT31 | BIT30 | BIT29 | BIT28 | BIT27 | BIT26 |
>> +                               BIT25 | BIT24 | BIT23 | BIT22 | BIT21 | BIT11 |
>> +                               BIT10 | BIT9 | BIT8 | BIT6 | BIT3 | BIT2 |
>> +                               BIT1 | BIT0));
>> +                       /* DLL_VREG Bias Trim, VREF Tuning for DLL_VREG */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0x3 << 4) | (0x7 << 0)),
>> +                               (BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 |
>> +                               BIT0));
>> +
>> +                       /* CLK-CTL */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CCOBSCKEBBCTL + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               0, BIT24);      /* CLKEBB */
>> +                       /* Buffer Enable: CS,CKE,ODT,CLK */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CCCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0x0 << 16) | (0x0 << 12) | (0x0 << 8) |
>> +                               (0xF << 4) | BIT0),
>> +                               (BIT19 | BIT18 | BIT17 | BIT16 | BIT15 | BIT14 |
>> +                               BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
>> +                               BIT7 | BIT6 | BIT5 | BIT4 | BIT0));
>> +                       /* ODT RCOMP */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CCRCOMPODT + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0x03 << 8) | (0x03 << 0)),
>> +                               (BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | BIT4 |
>> +                               BIT3 | BIT2 | BIT1 | BIT0));
>> +                       /* DLL_VREG Bias Trim, VREF Tuning for DLL_VREG */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0x3 << 4) | (0x7 << 0)),
>> +                               (BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 |
>> +                               BIT0));
>> +
>> +                       /*
>> +                        * COMP (RON channel specific)
>> +                        * - DQ/DQS/DM RON: 32 Ohm
>> +                        * - CTRL/CMD RON: 27 Ohm
>> +                        * - CLK RON: 26 Ohm
>> +                        */
>> +                       /* RCOMP Vref PU/PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQVREFCH0 +  (ch * DDRCOMP_CH_OFFSET)),
>> +                               ((0x08 << 24) | (0x03 << 16)),
>> +                               (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
>> +                               BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
>> +                               BIT17 | BIT16));
>> +                       /* RCOMP Vref PU/PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               ((0x0C << 24) | (0x03 << 16)),
>> +                               (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
>> +                               BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
>> +                               BIT17 | BIT16));
>> +                       /* RCOMP Vref PU/PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               ((0x0F << 24) | (0x03 << 16)),
>> +                               (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
>> +                               BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
>> +                               BIT17 | BIT16));
>> +                       /* RCOMP Vref PU/PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               ((0x08 << 24) | (0x03 << 16)),
>> +                               (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
>> +                               BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
>> +                               BIT17 | BIT16));
>> +                       /* RCOMP Vref PU/PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CTLVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               ((0x0C << 24) | (0x03 << 16)),
>> +                               (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 |
>> +                               BIT24 | BIT21 | BIT20 | BIT19 | BIT18 |
>> +                               BIT17 | BIT16));
>> +
>> +                       /* DQS Swapped Input Enable */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (COMPEN1CH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT19 | BIT17),
>> +                               (BIT31 | BIT30 | BIT19 | BIT17 |
>> +                               BIT15 | BIT14));
>> +
>> +                       /* ODT VREF = 1.5 x 274/360+274 = 0.65V (code of ~50) */
>> +                       /* ODT Vref PU/PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               ((0x32 << 8) | (0x03 << 0)),
>> +                               (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
>> +                               BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
>> +                       /* ODT Vref PU/PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               ((0x32 << 8) | (0x03 << 0)),
>> +                               (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
>> +                               BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
>> +                       /* ODT Vref PU/PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               ((0x0E << 8) | (0x05 << 0)),
>> +                               (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
>> +                               BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
>> +
>> +                       /*
>> +                        * Slew rate settings are frequency specific,
>> +                        * numbers below are for 800Mhz (speed == 0)
>> +                        * - DQ/DQS/DM/CLK SR: 4V/ns,
>> +                        * - CTRL/CMD SR: 1.5V/ns
>> +                        */
>> +                       temp = (0x0E << 16) | (0x0E << 12) | (0x08 << 8) |
>> +                               (0x0B << 4) | (0x0B << 0);
>> +                       /* DCOMP Delay Select: CTL,CMD,CLK,DQS,DQ */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DLYSELCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               temp,
>> +                               (BIT19 | BIT18 | BIT17 | BIT16 | BIT15 |
>> +                               BIT14 | BIT13 | BIT12 | BIT11 | BIT10 |
>> +                               BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 |
>> +                               BIT3 | BIT2 | BIT1 | BIT0));
>> +                       /* TCO Vref CLK,DQS,DQ */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (TCOVREFCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               ((0x05 << 16) | (0x05 << 8) | (0x05 << 0)),
>> +                               (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
>> +                               BIT16 | BIT13 | BIT12 | BIT11 | BIT10 |
>> +                               BIT9 | BIT8 | BIT5 | BIT4 | BIT3 | BIT2 |
>> +                               BIT1 | BIT0));
>> +                       /* ODTCOMP CMD/CTL PU/PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CCBUFODTCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               ((0x03 << 8) | (0x03 << 0)),
>> +                               (BIT12 | BIT11 | BIT10 | BIT9 | BIT8 |
>> +                               BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
>> +                       /* COMP */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               0, (BIT31 | BIT30 | BIT8));
>> +
>> +#ifdef BACKUP_COMPS
>> +                       /* DQ COMP Overrides */
>> +                       /* RCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0A << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* RCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0A << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* DCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x10 << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* DCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x10 << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* ODTCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0B << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* ODTCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0B << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* TCOCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31), (BIT31));
>> +                       /* TCOCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31), (BIT31));
>> +
>> +                       /* DQS COMP Overrides */
>> +                       /* RCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0A << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* RCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0A << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* DCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x10 << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* DCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x10 << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* ODTCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0B << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* ODTCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0B << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* TCOCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31), (BIT31));
>> +                       /* TCOCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31), (BIT31));
>> +
>> +                       /* CLK COMP Overrides */
>> +                       /* RCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0C << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* RCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0C << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* DCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x07 << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* DCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x07 << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* ODTCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0B << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* ODTCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0B << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* TCOCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31), (BIT31));
>> +                       /* TCOCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31), (BIT31));
>> +
>> +                       /* CMD COMP Overrides */
>> +                       /* RCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0D << 16)),
>> +                               (BIT31 | BIT21 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* RCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0D << 16)),
>> +                               (BIT31 | BIT21 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* DCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0A << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* DCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0A << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +
>> +                       /* CTL COMP Overrides */
>> +                       /* RCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CTLDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0D << 16)),
>> +                               (BIT31 | BIT21 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* RCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CTLDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0D << 16)),
>> +                               (BIT31 | BIT21 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* DCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CTLDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0A << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* DCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CTLDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x0A << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +#else
>> +                       /* DQ TCOCOMP Overrides */
>> +                       /* TCOCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x1F << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* TCOCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x1F << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +
>> +                       /* DQS TCOCOMP Overrides */
>> +                       /* TCOCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x1F << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* TCOCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (DQSTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x1F << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +
>> +                       /* CLK TCOCOMP Overrides */
>> +                       /* TCOCOMP PU */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x1F << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +                       /* TCOCOMP PD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CLKTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               (BIT31 | (0x1F << 16)),
>> +                               (BIT31 | BIT20 | BIT19 |
>> +                               BIT18 | BIT17 | BIT16));
>> +#endif
>> +
>> +                       /* program STATIC delays */
>> +#ifdef BACKUP_WCMD
>> +                       set_wcmd(ch, ddr_wcmd[PLATFORM_ID]);
>> +#else
>> +                       set_wcmd(ch, ddr_wclk[PLATFORM_ID] + HALF_CLK);
>> +#endif
>> +
>> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                               if (mrc_params->rank_enables & (1<<rk)) {
>> +                                       set_wclk(ch, rk, ddr_wclk[PLATFORM_ID]);
>> +#ifdef BACKUP_WCTL
>> +                                       set_wctl(ch, rk, ddr_wctl[PLATFORM_ID]);
>> +#else
>> +                                       set_wctl(ch, rk, ddr_wclk[PLATFORM_ID] + HALF_CLK);
>> +#endif
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +
>> +       /* COMP (non channel specific) */
>> +       /* RCOMP: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQANADRVPUCTL), (BIT30), (BIT30));
>> +       /* RCOMP: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQANADRVPDCTL), (BIT30), (BIT30));
>> +       /* RCOMP: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CMDANADRVPUCTL), (BIT30), (BIT30));
>> +       /* RCOMP: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CMDANADRVPDCTL), (BIT30), (BIT30));
>> +       /* RCOMP: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CLKANADRVPUCTL), (BIT30), (BIT30));
>> +       /* RCOMP: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CLKANADRVPDCTL), (BIT30), (BIT30));
>> +       /* RCOMP: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQSANADRVPUCTL), (BIT30), (BIT30));
>> +       /* RCOMP: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQSANADRVPDCTL), (BIT30), (BIT30));
>> +       /* RCOMP: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CTLANADRVPUCTL), (BIT30), (BIT30));
>> +       /* RCOMP: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CTLANADRVPDCTL), (BIT30), (BIT30));
>> +       /* ODT: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQANAODTPUCTL), (BIT30), (BIT30));
>> +       /* ODT: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQANAODTPDCTL), (BIT30), (BIT30));
>> +       /* ODT: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CLKANAODTPUCTL), (BIT30), (BIT30));
>> +       /* ODT: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CLKANAODTPDCTL), (BIT30), (BIT30));
>> +       /* ODT: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQSANAODTPUCTL), (BIT30), (BIT30));
>> +       /* ODT: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQSANAODTPDCTL), (BIT30), (BIT30));
>> +       /* DCOMP: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQANADLYPUCTL), (BIT30), (BIT30));
>> +       /* DCOMP: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQANADLYPDCTL), (BIT30), (BIT30));
>> +       /* DCOMP: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CMDANADLYPUCTL), (BIT30), (BIT30));
>> +       /* DCOMP: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CMDANADLYPDCTL), (BIT30), (BIT30));
>> +       /* DCOMP: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CLKANADLYPUCTL), (BIT30), (BIT30));
>> +       /* DCOMP: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CLKANADLYPDCTL), (BIT30), (BIT30));
>> +       /* DCOMP: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQSANADLYPUCTL), (BIT30), (BIT30));
>> +       /* DCOMP: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQSANADLYPDCTL), (BIT30), (BIT30));
>> +       /* DCOMP: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CTLANADLYPUCTL), (BIT30), (BIT30));
>> +       /* DCOMP: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CTLANADLYPDCTL), (BIT30), (BIT30));
>> +       /* TCO: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQANATCOPUCTL), (BIT30), (BIT30));
>> +       /* TCO: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQANATCOPDCTL), (BIT30), (BIT30));
>> +       /* TCO: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CLKANATCOPUCTL), (BIT30), (BIT30));
>> +       /* TCO: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (CLKANATCOPDCTL), (BIT30), (BIT30));
>> +       /* TCO: Dither PU Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQSANATCOPUCTL), (BIT30), (BIT30));
>> +       /* TCO: Dither PD Enable */
>> +       mrc_alt_write_mask(DDRPHY, (DQSANATCOPDCTL), (BIT30), (BIT30));
>> +       /* TCOCOMP: Pulse Count */
>> +       mrc_alt_write_mask(DDRPHY, (TCOCNTCTRL), (0x1<<0), (BIT1|BIT0));
>> +       /* ODT: CMD/CTL PD/PU */
>> +       mrc_alt_write_mask(DDRPHY,
>> +               (CHNLBUFSTATIC), ((0x03<<24)|(0x03<<16)),
>> +               (BIT28 | BIT27 | BIT26 | BIT25 | BIT24 |
>> +               BIT20 | BIT19 | BIT18 | BIT17 | BIT16));
>> +       /* Set 1us counter */
>> +       mrc_alt_write_mask(DDRPHY,
>> +               (MSCNTR), (0x64 << 0),
>> +               (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0));
>> +       mrc_alt_write_mask(DDRPHY,
>> +               (LATCH1CTL), (0x1 << 28),
>> +               (BIT30 | BIT29 | BIT28));
>> +
>> +       /* Release PHY from reset */
>> +       mrc_alt_write_mask(DDRPHY, MASTERRSTN, BIT0, BIT0);
>> +
>> +       /* STEP1 */
>> +       mrc_post_code(0x03, 0x11);
>> +
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       /* DQ01-DQ23 */
>> +                       for (bl_grp = 0;
>> +                            bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
>> +                            bl_grp++) {
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (DQMDLLCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (BIT13),
>> +                                       (BIT13));       /* Enable VREG */
>> +                               delay_n(3);
>> +                       }
>> +
>> +                       /* ECC */
>> +                       mrc_alt_write_mask(DDRPHY, (ECCMDLLCTL),
>> +                               (BIT13), (BIT13));      /* Enable VREG */
>> +                       delay_n(3);
>> +                       /* CMD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               (BIT13), (BIT13));      /* Enable VREG */
>> +                       delay_n(3);
>> +                       /* CLK-CTL */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               (BIT13), (BIT13));      /* Enable VREG */
>> +                       delay_n(3);
>> +               }
>> +       }
>> +
>> +       /* STEP2 */
>> +       mrc_post_code(0x03, 0x12);
>> +       delay_n(200);
>> +
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       /* DQ01-DQ23 */
>> +                       for (bl_grp = 0;
>> +                            bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
>> +                            bl_grp++) {
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (DQMDLLCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (BIT17),
>> +                                       (BIT17));       /* Enable MCDLL */
>> +                               delay_n(50);
>> +                       }
>> +
>> +                       /* ECC */
>> +                       mrc_alt_write_mask(DDRPHY, (ECCMDLLCTL),
>> +                               (BIT17), (BIT17));      /* Enable MCDLL */
>> +                       delay_n(50);
>> +                       /* CMD */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               (BIT18), (BIT18));      /* Enable MCDLL */
>> +                       delay_n(50);
>> +                       /* CLK-CTL */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               (BIT18), (BIT18));      /* Enable MCDLL */
>> +                       delay_n(50);
>> +               }
>> +       }
>> +
>> +       /* STEP3: */
>> +       mrc_post_code(0x03, 0x13);
>> +       delay_n(100);
>> +
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       /* DQ01-DQ23 */
>> +                       for (bl_grp = 0;
>> +                            bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
>> +                            bl_grp++) {
>> +#ifdef FORCE_16BIT_DDRIO
>> +                               temp = ((bl_grp) &&
>> +                                       (mrc_params->channel_width == X16)) ?
>> +                                       ((0x1 << 12) | (0x1 << 8) |
>> +                                       (0xF << 4) | (0xF << 0)) :
>> +                                       ((0xF << 12) | (0xF << 8) |
>> +                                       (0xF << 4) | (0xF << 0));
>> +#else
>> +                               temp = ((0xF << 12) | (0xF << 8) |
>> +                                       (0xF << 4) | (0xF << 0));
>> +#endif
>> +                               /* Enable TXDLL */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (DQDLLTXCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       temp, 0xFFFF);
>> +                               delay_n(3);
>> +                               /* Enable RXDLL */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (DQDLLRXCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (BIT3 | BIT2 | BIT1 | BIT0),
>> +                                       (BIT3 | BIT2 | BIT1 | BIT0));
>> +                               delay_n(3);
>> +                               /* Enable RXDLL Overrides BL0 */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B0OVRCTL +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (BIT3 | BIT2 | BIT1 | BIT0),
>> +                                       (BIT3 | BIT2 | BIT1 | BIT0));
>> +                       }
>> +
>> +                       /* ECC */
>> +                       temp = ((0xF << 12) | (0xF << 8) |
>> +                               (0xF << 4) | (0xF << 0));
>> +                       mrc_alt_write_mask(DDRPHY, (ECCDLLTXCTL),
>> +                               temp, 0xFFFF);
>> +                       delay_n(3);
>> +
>> +                       /* CMD (PO) */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDDLLTXCTL + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               temp, 0xFFFF);
>> +                       delay_n(3);
>> +               }
>> +       }
>> +
>> +       /* STEP4 */
>> +       mrc_post_code(0x03, 0x14);
>> +
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       /* Host To Memory Clock Alignment (HMC) for 800/1066 */
>> +                       for (bl_grp = 0;
>> +                            bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2);
>> +                            bl_grp++) {
>> +                               /* CLK_ALIGN_MOD_ID */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (DQCLKALIGNREG2 +
>> +                                       (bl_grp * DDRIODQ_BL_OFFSET) +
>> +                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                       (bl_grp) ? (0x3) : (0x1),
>> +                                       (BIT3 | BIT2 | BIT1 | BIT0));
>> +                       }
>> +
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (ECCCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)),
>> +                               0x2,
>> +                               (BIT3 | BIT2 | BIT1 | BIT0));
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)),
>> +                               0x0,
>> +                               (BIT3 | BIT2 | BIT1 | BIT0));
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CCCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)),
>> +                               0x2,
>> +                               (BIT3 | BIT2 | BIT1 | BIT0));
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               (0x2 << 4), (BIT5 | BIT4));
>> +                       /*
>> +                        * NUM_SAMPLES, MAX_SAMPLES,
>> +                        * MACRO_PI_STEP, MICRO_PI_STEP
>> +                        */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDCLKALIGNREG1 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0x18 << 16) | (0x10 << 8) |
>> +                               (0x8 << 2) | (0x1 << 0)),
>> +                               (BIT22 | BIT21 | BIT20 | BIT19 | BIT18 | BIT17 |
>> +                               BIT16 | BIT14 | BIT13 | BIT12 | BIT11 | BIT10 |
>> +                               BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 |
>> +                               BIT2 | BIT1 | BIT0));
>> +                       /* TOTAL_NUM_MODULES, FIRST_U_PARTITION */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDCLKALIGNREG2 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               ((0x10 << 16) | (0x4 << 8) | (0x2 << 4)),
>> +                               (BIT20 | BIT19 | BIT18 | BIT17 | BIT16 |
>> +                               BIT11 | BIT10 | BIT9 | BIT8 | BIT7 | BIT6 |
>> +                               BIT5 | BIT4));
>> +#ifdef HMC_TEST
>> +                       /* START_CLK_ALIGN=1 */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               BIT24, BIT24);
>> +                       while (msg_port_alt_read(DDRPHY,
>> +                               (CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET))) &
>> +                               BIT24)
>> +                               ;       /* wait for START_CLK_ALIGN=0 */
>> +#endif
>> +
>> +                       /* Set RD/WR Pointer Separation & COUNTEN & FIFOPTREN */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDPTRREG + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               BIT0, BIT0);    /* WRPTRENABLE=1 */
>> +
>> +                       /* COMP initial */
>> +                       /* enable bypass for CLK buffer (PO) */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               BIT5, BIT5);
>> +                       /* Initial COMP Enable */
>> +                       mrc_alt_write_mask(DDRPHY, (CMPCTRL),
>> +                               (BIT0), (BIT0));
>> +                       /* wait for Initial COMP Enable = 0 */
>> +                       while (msg_port_alt_read(DDRPHY, (CMPCTRL)) & BIT0)
>> +                               ;
>> +                       /* disable bypass for CLK buffer (PO) */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)),
>> +                               ~BIT5, BIT5);
>> +
>> +                       /* IOBUFACT */
>> +
>> +                       /* STEP4a */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               BIT2, BIT2);    /* IOBUFACTRST_N=1 */
>> +
>> +                       /* DDRPHY initialisation complete */
>> +                       mrc_alt_write_mask(DDRPHY,
>> +                               (CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)),
>> +                               BIT20, BIT20);  /* SPID_INIT_COMPLETE=1 */
>> +               }
>> +       }
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/* This function performs JEDEC initialisation on all enabled channels */
>> +void perform_jedec_init(struct mrc_params *mrc_params)
>> +{
>> +       uint8_t twr, wl, rank;
>> +       uint32_t tck;
>> +       u32 dtr0;
>> +       u32 drp;
>> +       u32 drmc;
>> +       u32 mrs0_cmd = 0;
>> +       u32 emrs1_cmd = 0;
>> +       u32 emrs2_cmd = 0;
>> +       u32 emrs3_cmd = 0;
>> +
>> +       ENTERFN();
>> +
>> +       /* jedec_init starts */
>> +       mrc_post_code(0x04, 0x00);
>> +
>> +       /* DDR3_RESET_SET=0, DDR3_RESET_RESET=1 */
>> +       mrc_alt_write_mask(DDRPHY, CCDDR3RESETCTL, BIT1, (BIT8 | BIT1));
>> +
>> +       /* Assert RESET# for 200us */
>> +       delay_u(200);
>> +
>> +       /* DDR3_RESET_SET=1, DDR3_RESET_RESET=0 */
>> +       mrc_alt_write_mask(DDRPHY, CCDDR3RESETCTL, BIT8, (BIT8 | BIT1));
>> +
>> +       dtr0 = msg_port_read(MEM_CTLR, DTR0);
>> +
>> +       /*
>> +        * Set CKEVAL for populated ranks
>> +        * then send NOP to each rank (#4550197)
>> +        */
>> +
>> +       drp = msg_port_read(MEM_CTLR, DRP);
>> +       drp &= 0x3;
>> +
>> +       drmc = msg_port_read(MEM_CTLR, DRMC);
>> +       drmc &= 0xFFFFFFFC;
>> +       drmc |= (BIT4 | drp);
>> +
>> +       msg_port_write(MEM_CTLR, DRMC, drmc);
>> +
>> +       for (rank = 0; rank < NUM_RANKS; rank++) {
>> +               /* Skip to next populated rank */
>> +               if ((mrc_params->rank_enables & (1 << rank)) == 0)
>> +                       continue;
>> +
>> +               dram_init_command(DCMD_NOP(rank));
>> +       }
>> +
>> +       msg_port_write(MEM_CTLR, DRMC,
>> +               (mrc_params->rd_odt_value == 0 ? BIT12 : 0));
>> +
>> +       /*
>> +        * setup for emrs 2
>> +        * BIT[15:11] --> Always "0"
>> +        * BIT[10:09] --> Rtt_WR: want "Dynamic ODT Off" (0)
>> +        * BIT[08]    --> Always "0"
>> +        * BIT[07]    --> SRT: use sr_temp_range
>> +        * BIT[06]    --> ASR: want "Manual SR Reference" (0)
>> +        * BIT[05:03] --> CWL: use oem_tCWL
>> +        * BIT[02:00] --> PASR: want "Full Array" (0)
>> +        */
>> +       emrs2_cmd |= (2 << 3);
>> +       wl = 5 + mrc_params->ddr_speed;
>> +       emrs2_cmd |= ((wl - 5) << 9);
>> +       emrs2_cmd |= (mrc_params->sr_temp_range << 13);
>> +
>> +       /*
>> +        * setup for emrs 3
>> +        * BIT[15:03] --> Always "0"
>> +        * BIT[02]    --> MPR: want "Normal Operation" (0)
>> +        * BIT[01:00] --> MPR_Loc: want "Predefined Pattern" (0)
>> +        */
>> +       emrs3_cmd |= (3 << 3);
>> +
>> +       /*
>> +        * setup for emrs 1
>> +        * BIT[15:13]     --> Always "0"
>> +        * BIT[12:12]     --> Qoff: want "Output Buffer Enabled" (0)
>> +        * BIT[11:11]     --> TDQS: want "Disabled" (0)
>> +        * BIT[10:10]     --> Always "0"
>> +        * BIT[09,06,02]  --> Rtt_nom: use rtt_nom_value
>> +        * BIT[08]        --> Always "0"
>> +        * BIT[07]        --> WR_LVL: want "Disabled" (0)
>> +        * BIT[05,01]     --> DIC: use ron_value
>> +        * BIT[04:03]     --> AL: additive latency want "0" (0)
>> +        * BIT[00]        --> DLL: want "Enable" (0)
>> +        *
>> +        * (BIT5|BIT1) set Ron value
>> +        * 00 --> RZQ/6 (40ohm)
>> +        * 01 --> RZQ/7 (34ohm)
>> +        * 1* --> RESERVED
>> +        *
>> +        * (BIT9|BIT6|BIT2) set Rtt_nom value
>> +        * 000 --> Disabled
>> +        * 001 --> RZQ/4 ( 60ohm)
>> +        * 010 --> RZQ/2 (120ohm)
>> +        * 011 --> RZQ/6 ( 40ohm)
>> +        * 1** --> RESERVED
>> +        */
>
> Why oh why not just have #defines for these? It seems like the
> original author knew they should be created but never made the step of
> actually doing it.
>

Again, undocumented in the Intel datasheet. I suspect it is something
in the JEDEC DDR spec though. Did not change this in v2.
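
For reference, a minimal sketch of what such #defines might look like,
going only by the bit positions listed in the quoted EMRS1 comment. All
macro and function names below are hypothetical; they are not part of
the patch or of the Intel code. The values are raw mode register bits,
before whatever shift the DRAM command format applies.

```c
#include <stdint.h>

/* Hypothetical names. Bit positions follow the quoted EMRS1 comment. */

/* DIC (output driver impedance), encoded on (BIT5|BIT1) */
#define MR1_DIC_MASK     ((1u << 5) | (1u << 1))
#define MR1_DIC_RZQ_6    0u                      /* 00 --> RZQ/6 (40ohm) */
#define MR1_DIC_RZQ_7    (1u << 1)               /* 01 --> RZQ/7 (34ohm) */

/* Rtt_nom (nominal ODT), encoded on (BIT9|BIT6|BIT2) */
#define MR1_RTTNOM_MASK  ((1u << 9) | (1u << 6) | (1u << 2))
#define MR1_RTTNOM_OFF   0u                      /* 000 --> Disabled */
#define MR1_RTTNOM_60    (1u << 2)               /* 001 --> RZQ/4 */
#define MR1_RTTNOM_120   (1u << 6)               /* 010 --> RZQ/2 */
#define MR1_RTTNOM_40    ((1u << 6) | (1u << 2)) /* 011 --> RZQ/6 */

/* Combine a DIC setting and an Rtt_nom setting into one MR1 value. */
static inline uint32_t mr1_compose(uint32_t dic, uint32_t rtt_nom)
{
	return (dic & MR1_DIC_MASK) | (rtt_nom & MR1_RTTNOM_MASK);
}
```

With names like these, a call such as
mr1_compose(MR1_DIC_RZQ_7, MR1_RTTNOM_40) reads the same way as the
comment table, instead of leaving the reader to decode BIT6/BIT7.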

>> +       emrs1_cmd |= (1 << 3);
>> +       emrs1_cmd &= ~BIT6;
>> +
>> +       if (mrc_params->ron_value == 0)
>> +               emrs1_cmd |= BIT7;
>> +       else
>> +               emrs1_cmd &= ~BIT7;
>> +
>> +       if (mrc_params->rtt_nom_value == 0)
>> +               emrs1_cmd |= (DDR3_EMRS1_RTTNOM_40 << 6);
>> +       else if (mrc_params->rtt_nom_value == 1)
>> +               emrs1_cmd |= (DDR3_EMRS1_RTTNOM_60 << 6);
>> +       else if (mrc_params->rtt_nom_value == 2)
>> +               emrs1_cmd |= (DDR3_EMRS1_RTTNOM_120 << 6);
>> +
>> +       /* save MRS1 value (excluding control fields) */
>> +       mrc_params->mrs1 = emrs1_cmd >> 6;
>> +
>> +       /*
>> +        * setup for mrs 0
>> +        * BIT[15:13]     --> Always "0"
>> +        * BIT[12]        --> PPD: for Quark (1)
>> +        * BIT[11:09]     --> WR: use oem_tWR
>> +        * BIT[08]        --> DLL: want "Reset" (1, self clearing)
>> +        * BIT[07]        --> MODE: want "Normal" (0)
>> +        * BIT[06:04,02]  --> CL: use oem_tCAS
>> +        * BIT[03]        --> RD_BURST_TYPE: want "Interleave" (1)
>> +        * BIT[01:00]     --> BL: want "8 Fixed" (0)
>> +        * WR:
>> +        * 0 --> 16
>> +        * 1 --> 5
>> +        * 2 --> 6
>> +        * 3 --> 7
>> +        * 4 --> 8
>> +        * 5 --> 10
>> +        * 6 --> 12
>> +        * 7 --> 14
>> +        * CL:
>> +        * BIT[02:02] "0" if oem_tCAS <= 11 (1866?)
>> +        * BIT[06:04] use oem_tCAS-4
>> +        */
>> +       mrs0_cmd |= BIT14;
>> +       mrs0_cmd |= BIT18;
>> +       mrs0_cmd |= ((((dtr0 >> 12) & 7) + 1) << 10);
>> +
>> +       tck = t_ck[mrc_params->ddr_speed];
>> +       /* Per JEDEC: tWR=15000ps DDR2/3 from 800-1600 */
>> +       twr = MCEIL(15000, tck);
>> +       mrs0_cmd |= ((twr - 4) << 15);
>> +
>> +       for (rank = 0; rank < NUM_RANKS; rank++) {
>> +               /* Skip to next populated rank */
>> +               if ((mrc_params->rank_enables & (1 << rank)) == 0)
>> +                       continue;
>> +
>> +               emrs2_cmd |= (rank << 22);
>> +               dram_init_command(emrs2_cmd);
>> +
>> +               emrs3_cmd |= (rank << 22);
>> +               dram_init_command(emrs3_cmd);
>> +
>> +               emrs1_cmd |= (rank << 22);
>> +               dram_init_command(emrs1_cmd);
>> +
>> +               mrs0_cmd |= (rank << 22);
>> +               dram_init_command(mrs0_cmd);
>> +
>> +               dram_init_command(DCMD_ZQCL(rank));
>> +       }
>> +
>> +       LEAVEFN();
>> +}
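
For anyone checking the WR field against the table in the comment
above, here is a minimal stand-alone sketch of the arithmetic. It
assumes MCEIL() is a round-up integer division (its definition is not
in the quoted hunk), and mrs0_wr_bits() is a hypothetical name used
only for illustration. The shift of 15 is MR0 BIT[11:09] moved up by
the 6 control bits that the quoted code strips with "emrs1_cmd >> 6".

```c
#include <stdint.h>

/* Assumption: MCEIL() in the patch is a round-up integer division */
#define MCEIL(num, den)	(((num) + (den) - 1) / (den))

/* tWR in picoseconds, per the comment in the quoted code */
#define TWR_PS		15000

/*
 * Hypothetical helper: return the WR bits as they are OR-ed into
 * mrs0_cmd, i.e. MR0 BIT[11:09] shifted up by the 6 control bits.
 */
uint32_t mrs0_wr_bits(uint32_t tck_ps)
{
	uint32_t twr = MCEIL(TWR_PS, tck_ps);	/* tWR in clocks */

	return (twr - 4) << 15;
}
```

With tCK = 2500 ps (DDR3-800) this gives 6 clocks and a field value of
2, and with tCK = 1875 ps (DDR3-1066) it gives 8 clocks and a field
value of 4, both of which agree with the table. Note that "twr - 4"
only matches the table for 5 to 8 clocks: at tCK = 1500 ps the result
would be 10 clocks and a field value of 6, while the table lists
5 --> 10.
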
>> +
>> +/*
>> + * Dunit Initialisation Complete
>> + *
>> + * Indicates that initialisation of the Dunit has completed.
>> + *
>> + * Memory accesses are permitted and maintenance operation begins.
>> + * Until this bit is set to a 1, the memory controller will not accept
>> + * DRAM requests from the MEMORY_MANAGER or HTE.
>> + */
>> +void set_ddr_init_complete(struct mrc_params *mrc_params)
>> +{
>> +       u32 dco;
>> +
>> +       ENTERFN();
>> +
>> +       dco = msg_port_read(MEM_CTLR, DCO);
>> +       dco &= ~BIT28;
>> +       dco |= BIT31;
>> +       msg_port_write(MEM_CTLR, DCO, dco);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will retrieve relevant timing data
>> + *
>> + * This data will be used on subsequent boots to speed up boot times
>> + * and is required for Suspend To RAM capabilities.
>> + */
>> +void restore_timings(struct mrc_params *mrc_params)
>> +{
>> +       uint8_t ch, rk, bl;
>> +       const struct mrc_timings *mt = &mrc_params->timings;
>> +
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                       for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
>> +                               set_rcvn(ch, rk, bl, mt->rcvn[ch][rk][bl]);
>> +                               set_rdqs(ch, rk, bl, mt->rdqs[ch][rk][bl]);
>> +                               set_wdqs(ch, rk, bl, mt->wdqs[ch][rk][bl]);
>> +                               set_wdq(ch, rk, bl, mt->wdq[ch][rk][bl]);
>> +                               if (rk == 0) {
>> +                                       /* VREF (RANK0 only) */
>> +                                       set_vref(ch, bl, mt->vref[ch][bl]);
>> +                               }
>> +                       }
>> +                       set_wctl(ch, rk, mt->wctl[ch][rk]);
>> +               }
>> +               set_wcmd(ch, mt->wcmd[ch]);
>> +       }
>> +}
>> +
>> +/*
>> + * Configure default settings normally set as part of read training
>> + *
>> + * Some defaults have to be set earlier as they may affect earlier
>> + * training steps.
>> + */
>> +void default_timings(struct mrc_params *mrc_params)
>> +{
>> +       uint8_t ch, rk, bl;
>> +
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                       for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
>> +                               set_rdqs(ch, rk, bl, 24);
>> +                               if (rk == 0) {
>> +                                       /* VREF (RANK0 only) */
>> +                                       set_vref(ch, bl, 32);
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +}
>> +
>> +/*
>> + * This function will perform our RCVEN Calibration Algorithm.
>> + * We will only use the 2xCLK domain timings to perform RCVEN Calibration.
>> + * All byte lanes will be calibrated "simultaneously" per channel per rank.
>> + */
>> +void rcvn_cal(struct mrc_params *mrc_params)
>> +{
>> +       uint8_t ch;     /* channel counter */
>> +       uint8_t rk;     /* rank counter */
>> +       uint8_t bl;     /* byte lane counter */
>> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
>> +
>> +#ifdef R2R_SHARING
>> +       /* used to find placement for rank2rank sharing configs */
>> +       uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
>> +#ifndef BACKUP_RCVN
>> +       /* used to find placement for rank2rank sharing configs */
>> +       uint32_t num_ranks_enabled = 0;
>> +#endif
>> +#endif
>> +
>> +#ifdef BACKUP_RCVN
>> +#else
>> +       uint32_t temp;
>> +       /* absolute PI value to be programmed on the byte lane */
>> +       uint32_t delay[NUM_BYTE_LANES];
>> +       u32 dtr1, dtr1_save;
>> +#endif
>> +
>> +       ENTERFN();
>> +
>> +       /* rcvn_cal starts */
>> +       mrc_post_code(0x05, 0x00);
>> +
>> +#ifndef BACKUP_RCVN
>> +       /* need separate burst to sample DQS preamble */
>> +       dtr1 = msg_port_read(MEM_CTLR, DTR1);
>> +       dtr1_save = dtr1;
>> +       dtr1 |= BIT12;
>> +       msg_port_write(MEM_CTLR, DTR1, dtr1);
>> +#endif
>> +
>> +#ifdef R2R_SHARING
>> +       /* need to set "final_delay[][]" elements to "0" */
>> +       memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
>> +#endif
>> +
>> +       /* loop through each enabled channel */
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       /* perform RCVEN Calibration on a per rank basis */
>> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                               if (mrc_params->rank_enables & (1 << rk)) {
>> +                                       /*
>> +                                        * POST_CODE here indicates the current
>> +                                        * channel and rank being calibrated
>> +                                        */
>> +                                       mrc_post_code(0x05, (0x10 + ((ch << 4) | rk)));
>> +
>> +#ifdef BACKUP_RCVN
>> +                                       /* set hard-coded timing values */
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++)
>> +                                               set_rcvn(ch, rk, bl, ddr_rcvn[PLATFORM_ID]);
>> +#else
>> +                                       /* enable FIFORST */
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl += 2) {
>> +                                               mrc_alt_write_mask(DDRPHY,
>> +                                                       (B01PTRCTL1 +
>> +                                                       ((bl >> 1) * DDRIODQ_BL_OFFSET) +
>> +                                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                                       0, BIT8);
>> +                                       }
>> +                                       /* initialize the starting delay to 128 PI (cas +1 CLK) */
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               /* 1x CLK domain timing is cas-4 */
>> +                                               delay[bl] = (4 + 1) * FULL_CLK;
>> +
>> +                                               set_rcvn(ch, rk, bl, delay[bl]);
>> +                                       }
>> +
>> +                                       /* now find the rising edge */
>> +                                       find_rising_edge(mrc_params, delay, ch, rk, true);
>> +
>> +                                       /* Now increase delay by 32 PI (1/4 CLK) to place in center of high pulse */
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               delay[bl] += QRTR_CLK;
>> +                                               set_rcvn(ch, rk, bl, delay[bl]);
>> +                                       }
>> +                                       /* Now decrement delay by 128 PI (1 CLK) until we sample a "0" */
>> +                                       do {
>> +                                               temp = sample_dqs(mrc_params, ch, rk, true);
>> +                                               for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                                       if (temp & (1 << bl)) {
>> +                                                               if (delay[bl] >= FULL_CLK) {
>> +                                                                       delay[bl] -= FULL_CLK;
>> +                                                                       set_rcvn(ch, rk, bl, delay[bl]);
>> +                                                               } else {
>> +                                                                       /* not enough delay */
>> +                                                                       training_message(ch, rk, bl);
>> +                                                                       mrc_post_code(0xEE, 0x50);
>> +                                                               }
>> +                                                       }
>> +                                               }
>> +                                       } while (temp & 0xFF);
>> +
>> +#ifdef R2R_SHARING
>> +                                       /* increment "num_ranks_enabled" */
>> +                                       num_ranks_enabled++;
>> +                                       /* Finally increment delay by 32 PI (1/4 CLK) to place in center of preamble */
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               delay[bl] += QRTR_CLK;
>> +                                               /* add "delay[]" values to "final_delay[][]" for rolling average */
>> +                                               final_delay[ch][bl] += delay[bl];
>> +                                               /* set timing based on rolling average values */
>> +                                               set_rcvn(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled));
>> +                                       }
>> +#else
>> +                                       /* Finally increment delay by 32 PI (1/4 CLK) to place in center of preamble */
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               delay[bl] += QRTR_CLK;
>> +                                               set_rcvn(ch, rk, bl, delay[bl]);
>> +                                       }
>> +#endif
>> +
>> +                                       /* disable FIFORST */
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl += 2) {
>> +                                               mrc_alt_write_mask(DDRPHY,
>> +                                                       (B01PTRCTL1 +
>> +                                                       ((bl >> 1) * DDRIODQ_BL_OFFSET) +
>> +                                                       (ch * DDRIODQ_CH_OFFSET)),
>> +                                                       BIT8, BIT8);
>> +                                       }
>> +#endif
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +
>> +#ifndef BACKUP_RCVN
>> +       /* restore original */
>> +       msg_port_write(MEM_CTLR, DTR1, dtr1_save);
>> +#endif
>> +
>> +       LEAVEFN();
>> +}
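
The R2R_SHARING branch above describes its result as a "rolling
average"; what the quoted code computes is the mean of the per-lane
delays over the ranks trained so far. A minimal sketch of just that
step, with a hypothetical helper name and an illustrative
NUM_BYTE_LANES value (the real one comes from the MRC headers):

```c
#include <stdint.h>

#define NUM_BYTE_LANES	4	/* illustrative value only */

/*
 * Hypothetical helper: add one rank's trained delays to the per-lane
 * sums and write the mean over the ranks seen so far into avg[].
 * ranks_done must be at least 1.
 */
void r2r_average(uint32_t sum[NUM_BYTE_LANES],
		 const uint32_t delay[NUM_BYTE_LANES],
		 uint32_t ranks_done,
		 uint32_t avg[NUM_BYTE_LANES])
{
	int bl;

	for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
		sum[bl] += delay[bl];
		/* integer division truncates toward zero */
		avg[bl] = sum[bl] / ranks_done;
	}
}
```

The sketch takes the rank count as an explicit argument. In the quoted
code the sums are kept per channel in final_delay[ch][bl], while
num_ranks_enabled is a single counter that, as far as the quoted hunk
shows, is not reset between channels.
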
>> +
>> +/*
>> + * This function will perform the Write Levelling algorithm
>> + * (align WCLK and WDQS).
>> + *
>> + * This algorithm will act on each rank in each channel separately.
>> + */
>> +void wr_level(struct mrc_params *mrc_params)
>> +{
>> +       uint8_t ch;     /* channel counter */
>> +       uint8_t rk;     /* rank counter */
>> +       uint8_t bl;     /* byte lane counter */
>> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
>> +
>> +#ifdef R2R_SHARING
>> +       /* used to find placement for rank2rank sharing configs */
>> +       uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
>> +#ifndef BACKUP_WDQS
>> +       /* used to find placement for rank2rank sharing configs */
>> +       uint32_t num_ranks_enabled = 0;
>> +#endif
>> +#endif
>> +
>> +#ifdef BACKUP_WDQS
>> +#else
>> +       /* determines stop condition for CRS_WR_LVL */
>> +       bool all_edges_found;
>> +       /* absolute PI value to be programmed on the byte lane */
>> +       uint32_t delay[NUM_BYTE_LANES];
>> +       /*
>> +        * static makes it so the data is loaded in the heap once by shadow(),
>> +        * where non-static copies the data onto the stack every time this
>> +        * function is called
>> +        */
>> +       uint32_t address;       /* address to be checked during COARSE_WR_LVL */
>> +       u32 dtr4, dtr4_save;
>> +#endif
>> +
>> +       ENTERFN();
>> +
>> +       /* wr_level starts */
>> +       mrc_post_code(0x06, 0x00);
>> +
>> +#ifdef R2R_SHARING
>> +       /* need to set "final_delay[][]" elements to "0" */
>> +       memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
>> +#endif
>> +
>> +       /* loop through each enabled channel */
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       /* perform WRITE LEVELING algorithm on a per rank basis */
>> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                               if (mrc_params->rank_enables & (1 << rk)) {
>> +                                       /*
>> +                                        * POST_CODE here indicates the current
>> +                                        * rank and channel being calibrated
>> +                                        */
>> +                                       mrc_post_code(0x06, (0x10 + ((ch << 4) | rk)));
>> +
>> +#ifdef BACKUP_WDQS
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               set_wdqs(ch, rk, bl, ddr_wdqs[PLATFORM_ID]);
>> +                                               set_wdq(ch, rk, bl, (ddr_wdqs[PLATFORM_ID] - QRTR_CLK));
>> +                                       }
>> +#else
>> +                                       /*
>> +                                        * perform a single PRECHARGE_ALL command to
>> +                                        * make DRAM state machine go to IDLE state
>> +                                        */
>> +                                       dram_init_command(DCMD_PREA(rk));
>> +
>> +                                       /*
>> +                                        * enable Write Levelling Mode
>> +                                        * (EMRS1 w/ Write Levelling Mode Enable)
>> +                                        */
>> +                                       dram_init_command(DCMD_MRS1(rk, 0x0082));
>> +
>> +                                       /*
>> +                                        * set ODT DRAM Full Time Termination
>> +                                        * disable in MCU
>> +                                        */
>> +
>> +                                       dtr4 = msg_port_read(MEM_CTLR, DTR4);
>> +                                       dtr4_save = dtr4;
>> +                                       dtr4 |= BIT15;
>> +                                       msg_port_write(MEM_CTLR, DTR4, dtr4);
>> +
>> +                                       for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) {
>> +                                               /*
>> +                                                * Enable Sandy Bridge Mode (WDQ Tri-State) &
>> +                                                * Ensure 5 WDQS pulses during Write Leveling
>> +                                                */
>> +                                               mrc_alt_write_mask(DDRPHY,
>> +                                                       DQCTL + (DDRIODQ_BL_OFFSET * bl) + (DDRIODQ_CH_OFFSET * ch),
>> +                                                       (BIT28 | BIT8 | BIT6 | BIT4 | BIT2),
>> +                                                       (BIT28 | BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2));
>> +                                       }
>> +
>> +                                       /* Write Leveling Mode enabled in IO */
>> +                                       mrc_alt_write_mask(DDRPHY,
>> +                                               CCDDR3RESETCTL + (DDRIOCCC_CH_OFFSET * ch),
>> +                                               BIT16, BIT16);
>> +
>> +                                       /* Initialize the starting delay to WCLK */
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               /*
>> +                                                * CLK0 --> RK0
>> +                                                * CLK1 --> RK1
>> +                                                */
>> +                                               delay[bl] = get_wclk(ch, rk);
>> +
>> +                                               set_wdqs(ch, rk, bl, delay[bl]);
>> +                                       }
>> +
>> +                                       /* now find the rising edge */
>> +                                       find_rising_edge(mrc_params, delay, ch, rk, false);
>> +
>> +                                       /* disable Write Levelling Mode */
>> +                                       mrc_alt_write_mask(DDRPHY,
>> +                                               CCDDR3RESETCTL + (DDRIOCCC_CH_OFFSET * ch),
>> +                                               0, BIT16);
>> +
>> +                                       for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) {
>> +                                               /* Disable Sandy Bridge Mode & Ensure 4 WDQS pulses during normal operation */
>> +                                               mrc_alt_write_mask(DDRPHY,
>> +                                                       DQCTL + (DDRIODQ_BL_OFFSET * bl) + (DDRIODQ_CH_OFFSET * ch),
>> +                                                       (BIT8 | BIT6 | BIT4 | BIT2),
>> +                                                       (BIT28 | BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2));
>> +                                       }
>> +
>> +                                       /* restore original DTR4 */
>> +                                       msg_port_write(MEM_CTLR, DTR4, dtr4_save);
>> +
>> +                                       /*
>> +                                        * restore original value
>> +                                        * (Write Levelling Mode Disable)
>> +                                        */
>> +                                       dram_init_command(DCMD_MRS1(rk, mrc_params->mrs1));
>> +
>> +                                       /*
>> +                                        * perform a single PRECHARGE_ALL command to
>> +                                        * make DRAM state machine go to IDLE state
>> +                                        */
>> +                                       dram_init_command(DCMD_PREA(rk));
>> +
>> +                                       mrc_post_code(0x06, (0x30 + ((ch << 4) | rk)));
>> +
>> +                                       /*
>> +                                        * COARSE WRITE LEVEL:
>> +                                        * check that we're on the correct clock edge
>> +                                        */
>> +
>> +                                       /* hte reconfiguration request */
>> +                                       mrc_params->hte_setup = 1;
>> +
>> +                                       /* start CRS_WR_LVL with WDQS = WDQS + 128 PI */
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               delay[bl] = get_wdqs(ch, rk, bl) + FULL_CLK;
>> +                                               set_wdqs(ch, rk, bl, delay[bl]);
>> +                                               /*
>> +                                                * program WDQ timings based on WDQS
>> +                                                * (WDQ = WDQS - 32 PI)
>> +                                                */
>> +                                               set_wdq(ch, rk, bl, (delay[bl] - QRTR_CLK));
>> +                                       }
>> +
>> +                                       /* get an address in the targeted channel/rank */
>> +                                       address = get_addr(ch, rk);
>> +                                       do {
>> +                                               uint32_t coarse_result = 0x00;
>> +                                               uint32_t coarse_result_mask = byte_lane_mask(mrc_params);
>> +                                               /* assume pass */
>> +                                               all_edges_found = true;
>> +
>> +                                               mrc_params->hte_setup = 1;
>> +                                               coarse_result = check_rw_coarse(mrc_params, address);
>> +
>> +                                               /* check for failures and margin the byte lane back 128 PI (1 CLK) */
>> +                                               for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                                       if (coarse_result & (coarse_result_mask << bl)) {
>> +                                                               all_edges_found = false;
>> +                                                               delay[bl] -= FULL_CLK;
>> +                                                               set_wdqs(ch, rk, bl, delay[bl]);
>> +                                                               /* program WDQ timings based on WDQS (WDQ = WDQS - 32 PI) */
>> +                                                               set_wdq(ch, rk, bl, (delay[bl] - QRTR_CLK));
>> +                                                       }
>> +                                               }
>> +                                       } while (!all_edges_found);
>> +
>> +#ifdef R2R_SHARING
>> +                                       /* increment "num_ranks_enabled" */
>> +                                       num_ranks_enabled++;
>> +                                       /* accumulate "final_delay[][]" values from "delay[]" values for rolling average */
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               final_delay[ch][bl] += delay[bl];
>> +                                               set_wdqs(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled));
>> +                                               /* program WDQ timings based on WDQS (WDQ = WDQS - 32 PI) */
>> +                                               set_wdq(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled) - QRTR_CLK);
>> +                                       }
>> +#endif
>> +#endif
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +
>> +       LEAVEFN();
>> +}
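
The coarse write levelling step above can be read as: start one full
clock late, then step back one clock at a time while the read/write
check fails, keeping WDQ a quarter clock ahead of WDQS. A minimal
single-lane sketch follows. FULL_CLK and QRTR_CLK use the values given
in the quoted comments (128 PI and 32 PI); the function names and the
pass/fail predicate are hypothetical stand-ins for check_rw_coarse(),
and the lower-bound guard is an addition that the quoted loop does not
have.

```c
#include <stdbool.h>
#include <stdint.h>

#define FULL_CLK	128	/* 1 CLK in PI, per the quoted comments */
#define QRTR_CLK	32	/* 1/4 CLK in PI, per the quoted comments */

/* Hypothetical predicate: true if this WDQS setting fails the check */
typedef bool (*lane_fails_fn)(uint32_t wdqs);

/* Illustrative stand-in for the hardware test: fail above 300 PI */
bool fails_above_300(uint32_t wdqs)
{
	return wdqs > 300;
}

/*
 * Start at the fine-trained WDQS plus one clock, then back off one
 * clock at a time while the lane fails. The ">= FULL_CLK" guard is
 * an assumption added here to avoid unsigned wrap-around.
 */
uint32_t coarse_wr_level(uint32_t fine_wdqs, lane_fails_fn fails,
			 uint32_t *wdq)
{
	uint32_t wdqs = fine_wdqs + FULL_CLK;

	while (fails(wdqs) && wdqs >= FULL_CLK)
		wdqs -= FULL_CLK;

	*wdq = wdqs - QRTR_CLK;		/* WDQ = WDQS - 32 PI */
	return wdqs;
}
```

For a fine-trained WDQS of 250 PI the sketch starts at 378 PI, fails
once, and settles at WDQS = 250 PI with WDQ = 218 PI.
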
>> +
>> +void prog_page_ctrl(struct mrc_params *mrc_params)
>> +{
>> +       u32 dpmc0;
>> +
>> +       ENTERFN();
>> +
>> +       dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
>> +       dpmc0 &= ~(BIT16 | BIT17 | BIT18);
>> +       dpmc0 |= (4 << 16);
>> +       dpmc0 |= BIT21;
>> +       msg_port_write(MEM_CTLR, DPMC0, dpmc0);
>> +}
>> +
>> +/*
>> + * This function will perform the READ TRAINING Algorithm on all
>> + * channels/ranks/byte_lanes simultaneously to minimize execution time.
>> + *
>> + * The idea here is to train the VREF and RDQS (and eventually RDQ) values
>> + * to achieve maximum READ margins. The algorithm will first determine the
>> + * X coordinate (RDQS setting). This is done by collapsing the VREF eye
>> + * until we find a minimum required RDQS eye for VREF_MIN and VREF_MAX.
>> + * Then we take the averages of the RDQS eye at VREF_MIN and VREF_MAX,
>> + * then average those; this will be the final X coordinate. The algorithm
>> + * will then determine the Y coordinate (VREF setting). This is done by
>> + * collapsing the RDQS eye until we find a minimum required VREF eye for
>> + * RDQS_MIN and RDQS_MAX. Then we take the averages of the VREF eye at
>> + * RDQS_MIN and RDQS_MAX, then average those; this will be the final Y
>> + * coordinate.
>> + *
>> + * NOTE: this algorithm assumes the eye curves have a one-to-one relationship,
>> + * meaning for each X the curve has only one Y and vice versa.
>> + */
>> +void rd_train(struct mrc_params *mrc_params)
>> +{
>> +       uint8_t ch;     /* channel counter */
>> +       uint8_t rk;     /* rank counter */
>> +       uint8_t bl;     /* byte lane counter */
>> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
>> +#ifdef BACKUP_RDQS
>> +#else
>> +       uint8_t side_x; /* tracks LEFT/RIGHT approach vectors */
>> +       uint8_t side_y; /* tracks BOTTOM/TOP approach vectors */
>> +       /* X coordinate data (passing RDQS values) for approach vectors */
>> +       uint8_t x_coordinate[2][2][NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
>> +       /* Y coordinate data (passing VREF values) for approach vectors */
>> +       uint8_t y_coordinate[2][2][NUM_CHANNELS][NUM_BYTE_LANES];
>> +       /* centered X (RDQS) */
>> +       uint8_t x_center[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
>> +       /* centered Y (VREF) */
>> +       uint8_t y_center[NUM_CHANNELS][NUM_BYTE_LANES];
>> +       uint32_t address;       /* target address for check_bls_ex() */
>> +       uint32_t result;        /* result of check_bls_ex() */
>> +       uint32_t bl_mask;       /* byte lane mask for result checking */
>> +#ifdef R2R_SHARING
>> +       /* used to find placement for rank2rank sharing configs */
>> +       uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
>> +       /* used to find placement for rank2rank sharing configs */
>> +       uint32_t num_ranks_enabled = 0;
>> +#endif
>> +#endif
>> +
>> +       /* rd_train starts */
>> +       mrc_post_code(0x07, 0x00);
>> +
>> +       ENTERFN();
>> +
>> +#ifdef BACKUP_RDQS
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                               if (mrc_params->rank_enables & (1 << rk)) {
>> +                                       for (bl = 0;
>> +                                            bl < (NUM_BYTE_LANES / bl_divisor);
>> +                                            bl++) {
>> +                                               set_rdqs(ch, rk, bl, ddr_rdqs[PLATFORM_ID]);
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +#else
>> +       /* initialise x/y_coordinate arrays */
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                               if (mrc_params->rank_enables & (1 << rk)) {
>> +                                       for (bl = 0;
>> +                                            bl < (NUM_BYTE_LANES / bl_divisor);
>> +                                            bl++) {
>> +                                               /* x_coordinate */
>> +                                               x_coordinate[L][B][ch][rk][bl] = RDQS_MIN;
>> +                                               x_coordinate[R][B][ch][rk][bl] = RDQS_MAX;
>> +                                               x_coordinate[L][T][ch][rk][bl] = RDQS_MIN;
>> +                                               x_coordinate[R][T][ch][rk][bl] = RDQS_MAX;
>> +                                               /* y_coordinate */
>> +                                               y_coordinate[L][B][ch][bl] = VREF_MIN;
>> +                                               y_coordinate[R][B][ch][bl] = VREF_MIN;
>> +                                               y_coordinate[L][T][ch][bl] = VREF_MAX;
>> +                                               y_coordinate[R][T][ch][bl] = VREF_MAX;
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +
>> +       /* initialize other variables */
>> +       bl_mask = byte_lane_mask(mrc_params);
>> +       address = get_addr(0, 0);
>> +
>> +#ifdef R2R_SHARING
>> +       /* need to set "final_delay[][]" elements to "0" */
>> +       memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
>> +#endif
>> +
>> +       /* look for passing coordinates */
>> +       for (side_y = B; side_y <= T; side_y++) {
>> +               for (side_x = L; side_x <= R; side_x++) {
>> +                       mrc_post_code(0x07, (0x10 + (side_y * 2) + (side_x)));
>> +
>> +                       /* find passing values */
>> +                       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +                               if (mrc_params->channel_enables & (0x1 << ch)) {
>> +                                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                                               if (mrc_params->rank_enables &
>> +                                                       (0x1 << rk)) {
>> +                                                       /* set x/y_coordinate search starting settings */
>> +                                                       for (bl = 0;
>> +                                                            bl < (NUM_BYTE_LANES / bl_divisor);
>> +                                                            bl++) {
>> +                                                               set_rdqs(ch, rk, bl,
>> +                                                                        x_coordinate[side_x][side_y][ch][rk][bl]);
>> +                                                               set_vref(ch, bl,
>> +                                                                        y_coordinate[side_x][side_y][ch][bl]);
>> +                                                       }
>> +
>> +                                                       /* get an address in the target channel/rank */
>> +                                                       address = get_addr(ch, rk);
>> +
>> +                                                       /* request HTE reconfiguration */
>> +                                                       mrc_params->hte_setup = 1;
>> +
>> +                                                       /* test the settings */
>> +                                                       do {
>> +                                                               /* result[07:00] == failing byte lane (MAX 8) */
>> +                                                               result = check_bls_ex(mrc_params, address);
>> +
>> +                                                               /* check for failures */
>> +                                                               if (result & 0xFF) {
>> +                                                                       /* at least 1 byte lane failed */
>
> I'm pretty sure this block can go in a function

I could not come up with a proper name for such a function, so I did
not change this in v2.
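
One possible shape for such a helper, as a rough sketch only: the name
rdqs_step_inward() is hypothetical, the limit values below are
illustrative rather than taken from the MRC headers, and the per-lane
left/right pair is passed in directly instead of indexing the full
x_coordinate array.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real limits come from the MRC headers */
#define RDQS_MIN	0
#define RDQS_MAX	47
#define RDQS_STEP	1
#define MIN_RDQS_EYE	10

enum { L, R };

/*
 * Hypothetical helper: step one side of the RDQS window inward for a
 * failing byte lane. x[L] and x[R] are the current left and right
 * RDQS settings for that lane. Returns false once the eye has closed
 * too far, using the same three conditions as the quoted code.
 */
bool rdqs_step_inward(uint8_t x[2], int side_x)
{
	if (side_x == L)
		x[L] += RDQS_STEP;
	else
		x[R] -= RDQS_STEP;

	if (x[L] > (RDQS_MAX - MIN_RDQS_EYE) ||
	    x[R] < (RDQS_MIN + MIN_RDQS_EYE) ||
	    x[L] == x[R])
		return false;

	return true;
}
```

The caller would then adjust VREF whenever the helper returns false,
which keeps the RDQS bookkeeping out of the deeply nested loop.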

>> +                                                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                                                               if (result &
>> +                                                                                       (bl_mask << bl)) {
>> +                                                                                       /* adjust the RDQS values accordingly */
>> +                                                                                       if (side_x == L)
>> +                                                                                               x_coordinate[L][side_y][ch][rk][bl] += RDQS_STEP;
>> +                                                                                       else
>> +                                                                                               x_coordinate[R][side_y][ch][rk][bl] -= RDQS_STEP;
>> +
>> +                                                                                       /* check that we haven't closed the RDQS_EYE too much */
>> +                                                                                       if ((x_coordinate[L][side_y][ch][rk][bl] > (RDQS_MAX - MIN_RDQS_EYE)) ||
>> +                                                                                               (x_coordinate[R][side_y][ch][rk][bl] < (RDQS_MIN + MIN_RDQS_EYE)) ||
>> +                                                                                               (x_coordinate[L][side_y][ch][rk][bl] ==
>> +                                                                                               x_coordinate[R][side_y][ch][rk][bl])) {
>> +                                                                                               /*
>> +                                                                                                * not enough RDQS margin available at this VREF
>> +                                                                                                * update VREF values accordingly
>> +                                                                                                */
>> +                                                                                               if (side_y == B)
>> +                                                                                                       y_coordinate[side_x][B][ch][bl] += VREF_STEP;
>> +                                                                                               else
>> +                                                                                                       y_coordinate[side_x][T][ch][bl] -= VREF_STEP;
>> +
>> +                                                                                               /* check that we haven't closed the VREF_EYE too much */
>> +                                                                                               if ((y_coordinate[side_x][B][ch][bl] > (VREF_MAX - MIN_VREF_EYE)) ||
>> +                                                                                                       (y_coordinate[side_x][T][ch][bl] < (VREF_MIN + MIN_VREF_EYE)) ||
>> +                                                                                                       (y_coordinate[side_x][B][ch][bl] == y_coordinate[side_x][T][ch][bl])) {
>> +                                                                                                       /* VREF_EYE collapsed below MIN_VREF_EYE */
>> +                                                                                                       training_message(ch, rk, bl);
>> +                                                                                                       mrc_post_code(0xEE, (0x70 + (side_y * 2) + (side_x)));
>> +                                                                                               } else {
>> +                                                                                                       /* update the VREF setting */
>> +                                                                                                       set_vref(ch, bl, y_coordinate[side_x][side_y][ch][bl]);
>> +                                                                                                       /* reset the X coordinate to begin the search at the new VREF */
>> +                                                                                                       x_coordinate[side_x][side_y][ch][rk][bl] =
>> +                                                                                                               (side_x == L) ? (RDQS_MIN) : (RDQS_MAX);
>> +                                                                                               }
>> +                                                                                       }
>> +
>> +                                                                                       /* update the RDQS setting */
>> +                                                                                       set_rdqs(ch, rk, bl, x_coordinate[side_x][side_y][ch][rk][bl]);
>> +                                                                               }
>> +                                                                       }
>> +                                                               }
>> +                                                       } while (result & 0xFF);
>> +                                               }
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +
>> +       mrc_post_code(0x07, 0x20);
>> +
>> +       /* find final RDQS (X coordinate) & final VREF (Y coordinate) */
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                               if (mrc_params->rank_enables & (1 << rk)) {
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               uint32_t temp1;
>> +                                               uint32_t temp2;
>> +
>> +                                               /* x_coordinate */
>> +                                               DPF(D_INFO,
>> +                                                   "RDQS T/B eye rank%d lane%d : %d-%d %d-%d\n",
>> +                                                   rk, bl,
>> +                                                   x_coordinate[L][T][ch][rk][bl],
>> +                                                   x_coordinate[R][T][ch][rk][bl],
>> +                                                   x_coordinate[L][B][ch][rk][bl],
>> +                                                   x_coordinate[R][B][ch][rk][bl]);
>> +
>> +                                               /* average the TOP side LEFT & RIGHT values */
>> +                                               temp1 = (x_coordinate[R][T][ch][rk][bl] + x_coordinate[L][T][ch][rk][bl]) / 2;
>> +                                               /* average the BOTTOM side LEFT & RIGHT values */
>> +                                               temp2 = (x_coordinate[R][B][ch][rk][bl] + x_coordinate[L][B][ch][rk][bl]) / 2;
>> +                                               /* average the above averages */
>> +                                               x_center[ch][rk][bl] = (uint8_t) ((temp1 + temp2) / 2);
>> +
>> +                                               /* y_coordinate */
>> +                                               DPF(D_INFO,
>> +                                                   "VREF R/L eye lane%d : %d-%d %d-%d\n",
>> +                                                   bl,
>> +                                                   y_coordinate[R][B][ch][bl],
>> +                                                   y_coordinate[R][T][ch][bl],
>> +                                                   y_coordinate[L][B][ch][bl],
>> +                                                   y_coordinate[L][T][ch][bl]);
>> +
>> +                                               /* average the RIGHT side TOP & BOTTOM values */
>> +                                               temp1 = (y_coordinate[R][T][ch][bl] + y_coordinate[R][B][ch][bl]) / 2;
>> +                                               /* average the LEFT side TOP & BOTTOM values */
>> +                                               temp2 = (y_coordinate[L][T][ch][bl] + y_coordinate[L][B][ch][bl]) / 2;
>> +                                               /* average the above averages */
>> +                                               y_center[ch][bl] = (uint8_t) ((temp1 + temp2) / 2);
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +
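
The centre calculation above is an average of two averages, with integer
truncation at each step. A minimal stand-alone sketch of that arithmetic
follows; eye_centre() is a hypothetical name, and the arguments stand
for the four edge values of one lane.

```c
#include <stdint.h>

/*
 * Centre of an eye measured twice (pair A and pair B), e.g. the
 * LEFT/RIGHT RDQS edges found at the TOP and at the BOTTOM VREF.
 * Each division truncates, exactly as in the quoted loop.
 */
static uint8_t eye_centre(uint32_t a_lo, uint32_t a_hi,
			  uint32_t b_lo, uint32_t b_hi)
{
	uint32_t mid_a = (a_hi + a_lo) / 2;	/* average of pair A */
	uint32_t mid_b = (b_hi + b_lo) / 2;	/* average of pair B */

	return (uint8_t)((mid_a + mid_b) / 2);	/* average of the averages */
}
```

For example, edges 0..63 and 8..56 give midpoints 31 and 32, and the
truncated result is 31.
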
>> +#ifdef RX_EYE_CHECK
>> +       /* perform an eye check */
>> +       for (side_y = B; side_y <= T; side_y++) {
>> +               for (side_x = L; side_x <= R; side_x++) {
>> +                       mrc_post_code(0x07, (0x30 + (side_y * 2) + (side_x)));
>> +
>> +                       /* update the settings for the eye check */
>> +                       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +                               if (mrc_params->channel_enables & (1 << ch)) {
>> +                                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                                               if (mrc_params->rank_enables & (1 << rk)) {
>> +                                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                                               if (side_x == L)
>> +                                                                       set_rdqs(ch, rk, bl, (x_center[ch][rk][bl] - (MIN_RDQS_EYE / 2)));
>> +                                                               else
>> +                                                                       set_rdqs(ch, rk, bl, (x_center[ch][rk][bl] + (MIN_RDQS_EYE / 2)));
>> +
>> +                                                               if (side_y == B)
>> +                                                                       set_vref(ch, bl, (y_center[ch][bl] - (MIN_VREF_EYE / 2)));
>> +                                                               else
>> +                                                                       set_vref(ch, bl, (y_center[ch][bl] + (MIN_VREF_EYE / 2)));
>> +                                                       }
>> +                                               }
>> +                                       }
>> +                               }
>> +                       }
>> +
>> +                       /* request HTE reconfiguration */
>> +                       mrc_params->hte_setup = 1;
>> +
>> +                       /* check the eye */
>> +                       if (check_bls_ex(mrc_params, address) & 0xFF) {
>> +                               /* one or more byte lanes failed */
>> +                               mrc_post_code(0xEE, (0x74 + (side_x * 2) + (side_y)));
>> +                       }
>> +               }
>> +       }
>> +#endif
>> +
>> +       mrc_post_code(0x07, 0x40);
>> +
>> +       /* set final placements */
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                               if (mrc_params->rank_enables & (1 << rk)) {
>> +#ifdef R2R_SHARING
>> +                                       /* increment "num_ranks_enabled" */
>> +                                       num_ranks_enabled++;
>> +#endif
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               /* x_coordinate */
>> +#ifdef R2R_SHARING
>> +                                               final_delay[ch][bl] += x_center[ch][rk][bl];
>> +                                               set_rdqs(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled));
>> +#else
>> +                                               set_rdqs(ch, rk, bl, x_center[ch][rk][bl]);
>> +#endif
>> +                                               /* y_coordinate */
>> +                                               set_vref(ch, bl, y_center[ch][bl]);
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +#endif
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * This function will perform the WRITE TRAINING Algorithm on all
>> + * channels/ranks/byte_lanes simultaneously to minimize execution time.
>> + *
>> + * The idea here is to train the WDQ timings to achieve maximum WRITE margins.
>> + * The algorithm will start with WDQ at the current WDQ setting (tracks WDQS
>> + * in WR_LVL) +/- 32 PIs (+/- 1/4 CLK) and collapse the eye until all data
>> + * patterns pass. This is because WDQS will be aligned to WCLK by the
>> + * Write Leveling algorithm and WDQ will only ever have a 1/2 CLK window
>> + * of validity.
>> + */
>> +void wr_train(struct mrc_params *mrc_params)
>> +{
>> +       uint8_t ch;     /* channel counter */
>> +       uint8_t rk;     /* rank counter */
>> +       uint8_t bl;     /* byte lane counter */
>> +       uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1;
>> +#ifdef BACKUP_WDQ
>> +#else
>> +       uint8_t side;           /* LEFT/RIGHT side indicator (0=L, 1=R) */
>> +       uint32_t temp;          /* temporary DWORD */
>> +       /* 2 arrays, for L & R side passing delays */
>> +       uint32_t delay[2][NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES];
>> +       uint32_t address;       /* target address for check_bls_ex() */
>> +       uint32_t result;        /* result of check_bls_ex() */
>> +       uint32_t bl_mask;       /* byte lane mask for result checking */
>> +#ifdef R2R_SHARING
>> +       /* used to find placement for rank2rank sharing configs */
>> +       uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES];
>> +       /* used to find placement for rank2rank sharing configs */
>> +       uint32_t num_ranks_enabled = 0;
>> +#endif
>> +#endif
>> +
>> +       /* wr_train starts */
>> +       mrc_post_code(0x08, 0x00);
>> +
>> +       ENTERFN();
>> +
>> +#ifdef BACKUP_WDQ
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                               if (mrc_params->rank_enables & (1 << rk)) {
>> +                                       for (bl = 0;
>> +                                            bl < (NUM_BYTE_LANES / bl_divisor);
>> +                                            bl++) {
>> +                                               set_wdq(ch, rk, bl, ddr_wdq[PLATFORM_ID]);
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +#else
>> +       /* initialise "delay" */
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                               if (mrc_params->rank_enables & (1 << rk)) {
>> +                                       for (bl = 0;
>> +                                            bl < (NUM_BYTE_LANES / bl_divisor);
>> +                                            bl++) {
>> +                                               /*
>> +                                                * want to start with
>> +                                                * WDQ = (WDQS - QRTR_CLK)
>> +                                                * +/- QRTR_CLK
>> +                                                */
>> +                                               temp = get_wdqs(ch, rk, bl) - QRTR_CLK;
>> +                                               delay[L][ch][rk][bl] = temp - QRTR_CLK;
>> +                                               delay[R][ch][rk][bl] = temp + QRTR_CLK;
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +
>> +       /* initialise other variables */
>> +       bl_mask = byte_lane_mask(mrc_params);
>> +       address = get_addr(0, 0);
>> +
>> +#ifdef R2R_SHARING
>> +       /* need to set "final_delay[][]" elements to "0" */
>> +       memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay));
>> +#endif
>> +
>> +       /*
>> +        * start algorithm on the LEFT side and train each channel/bl
>> +        * until no failures are observed, then repeat for the RIGHT side.
>> +        */
>> +       for (side = L; side <= R; side++) {
>> +               mrc_post_code(0x08, (0x10 + (side)));
>> +
>> +               /* set starting values */
>> +               for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +                       if (mrc_params->channel_enables & (1 << ch)) {
>> +                               for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                                       if (mrc_params->rank_enables &
>> +                                               (1 << rk)) {
>> +                                               for (bl = 0;
>> +                                                    bl < (NUM_BYTE_LANES / bl_divisor);
>> +                                                    bl++) {
>> +                                                       set_wdq(ch, rk, bl, delay[side][ch][rk][bl]);
>> +                                               }
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +
>> +               /* find passing values */
>> +               for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +                       if (mrc_params->channel_enables & (1 << ch)) {
>> +                               for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                                       if (mrc_params->rank_enables &
>> +                                               (1 << rk)) {
>> +                                               /* get an address in the target channel/rank */
>> +                                               address = get_addr(ch, rk);
>> +
>> +                                               /* request HTE reconfiguration */
>> +                                               mrc_params->hte_setup = 1;
>> +
>> +                                               /* check the settings */
>> +                                               do {
>> +                                                       /* result[07:00] == failing byte lane (MAX 8) */
>> +                                                       result = check_bls_ex(mrc_params, address);
>> +                                                       /* check for failures */
>> +                                                       if (result & 0xFF) {
>> +                                                               /* at least 1 byte lane failed */
>> +                                                               for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                                                       if (result &
>> +                                                                               (bl_mask << bl)) {
>> +                                                                               if (side == L)
>> +                                                                                       delay[L][ch][rk][bl] += WDQ_STEP;
>> +                                                                               else
>> +                                                                                       delay[R][ch][rk][bl] -= WDQ_STEP;
>> +
>> +                                                                               /* check for algorithm failure */
>> +                                                                               if (delay[L][ch][rk][bl] != delay[R][ch][rk][bl]) {
>> +                                                                                       /*
>> +                                                                                        * margin available
>> +                                                                                        * update delay setting
>> +                                                                                        */
>> +                                                                                       set_wdq(ch, rk, bl,
>> +                                                                                               delay[side][ch][rk][bl]);
>> +                                                                               } else {
>> +                                                                                       /*
>> +                                                                                        * no margin available
>> +                                                                                        * notify the user and halt
>> +                                                                                        */
>> +                                                                                       training_message(ch, rk, bl);
>> +                                                                                       mrc_post_code(0xEE, (0x80 + side));
>> +                                                                               }
>> +                                                                       }
>> +                                                               }
>> +                                                       }
>> +                                               /* stop when all byte lanes pass */
>> +                                               } while (result & 0xFF);
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +
>> +       /* program WDQ to the middle of passing window */
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               if (mrc_params->channel_enables & (1 << ch)) {
>> +                       for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                               if (mrc_params->rank_enables & (1 << rk)) {
>> +#ifdef R2R_SHARING
>> +                                       /* increment "num_ranks_enabled" */
>> +                                       num_ranks_enabled++;
>> +#endif
>> +                                       for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) {
>> +                                               DPF(D_INFO,
>> +                                                   "WDQ eye rank%d lane%d : %d-%d\n",
>> +                                                   rk, bl,
>> +                                                   delay[L][ch][rk][bl],
>> +                                                   delay[R][ch][rk][bl]);
>> +
>> +                                               temp = (delay[R][ch][rk][bl] + delay[L][ch][rk][bl]) / 2;
>> +
>> +#ifdef R2R_SHARING
>> +                                               final_delay[ch][bl] += temp;
>> +                                               set_wdq(ch, rk, bl,
>> +                                                       ((final_delay[ch][bl]) / num_ranks_enabled));
>> +#else
>> +                                               set_wdq(ch, rk, bl, temp);
>> +#endif
>> +                                       }
>> +                               }
>> +                       }
>> +               }
>> +       }
>> +#endif
>> +
>> +       LEAVEFN();
>> +}
>> +
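
To make the wr_train() idea easier to follow, here is a one-lane model
of the window collapse, as a sketch under stated assumptions: the
hardware test is replaced by a hypothetical wdq_passes() that succeeds
inside a fixed range, the step is assumed to be 1 PI, and a collapsed
eye is reported by returning UINT32_MAX instead of posting an error
code. wdqs must be at least 2 * QRTR_CLK_PI to avoid unsigned wrap.

```c
#include <stdint.h>
#include <stdbool.h>

#define QRTR_CLK_PI 32u	/* 32 PIs per quarter clock, per the comment above */
#define WDQ_STEP_PI 1u	/* assumed step size */

/* Hypothetical stand-in for check_bls_ex(): pass inside [lo, hi]. */
static bool wdq_passes(uint32_t wdq, uint32_t lo, uint32_t hi)
{
	return wdq >= lo && wdq <= hi;
}

/*
 * Start at (WDQS - QRTR_CLK) +/- QRTR_CLK, walk the LEFT edge up and
 * then the RIGHT edge down until each passes, and return the midpoint.
 * Returns UINT32_MAX if the two edges meet (no margin available).
 */
static uint32_t train_wdq(uint32_t wdqs, uint32_t lo, uint32_t hi)
{
	uint32_t centre = wdqs - QRTR_CLK_PI;
	uint32_t left = centre - QRTR_CLK_PI;
	uint32_t right = centre + QRTR_CLK_PI;

	while (!wdq_passes(left, lo, hi)) {
		left += WDQ_STEP_PI;
		if (left == right)
			return UINT32_MAX;
	}

	while (!wdq_passes(right, lo, hi)) {
		right -= WDQ_STEP_PI;
		if (left == right)
			return UINT32_MAX;
	}

	return (left + right) / 2;
}
```

With WDQS at 128 and a passing range of 70..100, the edges settle at 70
and 100 and the programmed value is 85.
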
>> +/*
>> + * This function will store relevant timing data
>> + *
>> + * This data will be used on subsequent boots to speed up boot times
>> + * and is required for Suspend To RAM capabilities.
>> + */
>> +void store_timings(struct mrc_params *mrc_params)
>> +{
>> +       uint8_t ch, rk, bl;
>> +       struct mrc_timings *mt = &mrc_params->timings;
>> +
>> +       for (ch = 0; ch < NUM_CHANNELS; ch++) {
>> +               for (rk = 0; rk < NUM_RANKS; rk++) {
>> +                       for (bl = 0; bl < NUM_BYTE_LANES; bl++) {
>> +                               mt->rcvn[ch][rk][bl] = get_rcvn(ch, rk, bl);
>> +                               mt->rdqs[ch][rk][bl] = get_rdqs(ch, rk, bl);
>> +                               mt->wdqs[ch][rk][bl] = get_wdqs(ch, rk, bl);
>> +                               mt->wdq[ch][rk][bl] = get_wdq(ch, rk, bl);
>> +
>> +                               if (rk == 0)
>> +                                       mt->vref[ch][bl] = get_vref(ch, bl);
>> +                       }
>> +
>> +                       mt->wctl[ch][rk] = get_wctl(ch, rk);
>> +               }
>> +
>> +               mt->wcmd[ch] = get_wcmd(ch);
>> +       }
>> +
>> +       /* need to save for a case of changing frequency after warm reset */
>> +       mt->ddr_speed = mrc_params->ddr_speed;
>> +}
>> +
>> +/*
>> + * The purpose of this function is to ensure the SEC comes out of reset
>> + * and IA initiates the SEC enabling Memory Scrambling.
>> + */
>> +void enable_scrambling(struct mrc_params *mrc_params)
>> +{
>> +       uint32_t lfsr = 0;
>> +       uint8_t i;
>> +
>> +       if (mrc_params->scrambling_enables == 0)
>> +               return;
>> +
>> +       ENTERFN();
>> +
>> +       /* 32 bit seed is always stored in BIOS NVM */
>> +       lfsr = mrc_params->timings.scrambler_seed;
>> +
>> +       if (mrc_params->boot_mode == BM_COLD) {
>> +               /*
>> +                * factory value is 0 and in first boot,
>> +                * a clock based seed is loaded.
>> +                */
>> +               if (lfsr == 0) {
>> +                       /*
>> +                        * get seed from system clock
>> +                        * and make sure it is not all 1's
>> +                        */
>> +                       lfsr = rdtsc() & 0x0FFFFFFF;
>> +               } else {
>> +                       /*
>> +                        * Need to replace scrambler
>> +                        *
>> +                        * get next 32bit LFSR 16 times which is the last
>> +                        * part of the previous scrambler vector
>> +                        */
>> +                       for (i = 0; i < 16; i++)
>> +                               lfsr32(&lfsr);
>> +               }
>> +
>> +               /* save new seed */
>> +               mrc_params->timings.scrambler_seed = lfsr;
>> +       }
>> +
>> +       /*
>> +        * In warm boot or S3 exit, we have the previous seed.
>> +        * In cold boot, we have the last 32bit LFSR which is the new seed.
>> +        */
>> +       lfsr32(&lfsr);  /* shift to next value */
>> +       msg_port_write(MEM_CTLR, SCRMSEED, (lfsr & 0x0003FFFF));
>> +
>> +       for (i = 0; i < 2; i++)
>> +               msg_port_write(MEM_CTLR, SCRMLO + i, (lfsr & 0xAAAAAAAA));
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * Configure MCU Power Management Control Register
>> + * and Scheduler Control Register
>> + */
>> +void prog_ddr_control(struct mrc_params *mrc_params)
>> +{
>> +       u32 dsch;
>> +       u32 dpmc0;
>> +
>> +       ENTERFN();
>> +
>> +       dsch = msg_port_read(MEM_CTLR, DSCH);
>> +       dsch &= ~(BIT8 | BIT9 | BIT12);
>> +       msg_port_write(MEM_CTLR, DSCH, dsch);
>> +
>> +       dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
>> +       dpmc0 &= ~BIT25;
>> +       dpmc0 |= (mrc_params->power_down_disable << 25);
>> +       dpmc0 &= ~BIT24;
>> +       dpmc0 &= ~(BIT16 | BIT17 | BIT18);
>> +       dpmc0 |= (4 << 16);
>> +       dpmc0 |= BIT21;
>> +       msg_port_write(MEM_CTLR, DPMC0, dpmc0);
>> +
>> +       /* CMDTRIST = 2h - CMD/ADDR are tristated when no valid command */
>> +       mrc_write_mask(MEM_CTLR, DPMC1, 2 << 4, BIT4 | BIT5);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * After training completes, configure the MCU Rank Population Register
>> + * specifying: ranks enabled, device width, density, address mode
>> + */
>> +void prog_dra_drb(struct mrc_params *mrc_params)
>> +{
>> +       u32 drp;
>> +       u32 dco;
>> +       u8 density = mrc_params->params.density;
>> +
>> +       ENTERFN();
>> +
>> +       dco = msg_port_read(MEM_CTLR, DCO);
>> +       dco &= ~BIT31;
>> +       msg_port_write(MEM_CTLR, DCO, dco);
>> +
>> +       drp = 0;
>> +       if (mrc_params->rank_enables & 1)
>> +               drp |= BIT0;
>> +       if (mrc_params->rank_enables & 2)
>> +               drp |= BIT1;
>> +       if (mrc_params->dram_width == X16) {
>> +               drp |= (1 << 4);
>> +               drp |= (1 << 9);
>> +       }
>> +
>> +       /*
>> +        * Density encoding in struct dram_params: 0=512Mb, 1=1Gb, 2=2Gb, 3=4Gb
>> +        * has to be mapped to the RANKDENSx encoding (0=1Gb)
>> +        */
>> +       if (density == 0)
>> +               density = 4;
>> +
>> +       drp |= ((density - 1) << 6);
>> +       drp |= ((density - 1) << 11);
>> +
>> +       /* Address mode can be overwritten if ECC enabled */
>> +       drp |= (mrc_params->address_mode << 14);
>> +
>> +       msg_port_write(MEM_CTLR, DRP, drp);
>> +
>> +       dco &= ~BIT28;
>> +       dco |= BIT31;
>> +       msg_port_write(MEM_CTLR, DCO, dco);
>> +
>> +       LEAVEFN();
>> +}
>> +
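
The density remapping in prog_dra_drb() is easy to misread, so here is
the same arithmetic in isolation. This is only a sketch: the function
names are hypothetical, and the quoted code states only that field value
0 means 1Gb, so what value 3 (produced for 512Mb) means to the hardware
is not spelled out here.

```c
#include <stdint.h>

/*
 * Map the dram_params density code (0=512Mb, 1=1Gb, 2=2Gb, 3=4Gb)
 * to the value written into a RANKDENSx field, as the quoted code does:
 * code 0 is first replaced by 4, then 1 is subtracted.
 */
static uint32_t rank_density_field(uint8_t density)
{
	if (density == 0)
		density = 4;

	return (uint32_t)(density - 1);
}

/* Place the same field value at bit 6 and bit 11, as in the quoted DRP setup. */
static uint32_t drp_density_bits(uint8_t density)
{
	uint32_t field = rank_density_field(density);

	return (field << 6) | (field << 11);
}
```

So codes 1, 2 and 3 become field values 0, 1 and 2, while code 0 becomes
field value 3.
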
>> +/* Send DRAM wake command */
>> +void perform_wake(struct mrc_params *mrc_params)
>> +{
>> +       ENTERFN();
>> +
>> +       dram_wake_command();
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * Configure refresh rate and short ZQ calibration interval
>> + * Activate dynamic self refresh
>> + */
>> +void change_refresh_period(struct mrc_params *mrc_params)
>> +{
>> +       u32 drfc;
>> +       u32 dcal;
>> +       u32 dpmc0;
>> +
>> +       ENTERFN();
>> +
>> +       drfc = msg_port_read(MEM_CTLR, DRFC);
>> +       drfc &= ~(BIT12 | BIT13 | BIT14);
>> +       drfc |= (mrc_params->refresh_rate << 12);
>> +       drfc |= BIT21;
>> +       msg_port_write(MEM_CTLR, DRFC, drfc);
>> +
>> +       dcal = msg_port_read(MEM_CTLR, DCAL);
>> +       dcal &= ~(BIT8 | BIT9 | BIT10);
>> +       dcal |= (3 << 8);       /* 63ms */
>> +       msg_port_write(MEM_CTLR, DCAL, dcal);
>> +
>> +       dpmc0 = msg_port_read(MEM_CTLR, DPMC0);
>> +       dpmc0 |= (BIT23 | BIT29);
>> +       msg_port_write(MEM_CTLR, DPMC0, dpmc0);
>> +
>> +       LEAVEFN();
>> +}
>> +
>> +/*
>> + * Configure DDRPHY for Auto-Refresh, Periodic Compensations,
>> + * Dynamic Diff-Amp, ZQSPERIOD, Auto-Precharge, CKE Power-Down
>> + */
>> +void set_auto_refresh(struct mrc_params *mrc_params)
>> +{
>> +       uint32_t channel;
>> +       uint32_t rank;
>> +       uint32_t bl;
>> +       uint32_t bl_divisor = 1;
>> +       uint32_t temp;
>> +
>> +       ENTERFN();
>> +
>> +       /*
>> +        * Enable Auto-Refresh, Periodic Compensations, Dynamic Diff-Amp,
>> +        * ZQSPERIOD, Auto-Precharge, CKE Power-Down
>> +        */
>> +       for (channel = 0; channel < NUM_CHANNELS; channel++) {
>> +               if (mrc_params->channel_enables & (1 << channel)) {
>> +                       /* Enable Periodic RCOMPS */
>> +                       mrc_alt_write_mask(DDRPHY, CMPCTRL, BIT1, BIT1);
>> +
>> +                       /* Enable Dynamic DiffAmp & Set Read ODT Value */
>> +                       switch (mrc_params->rd_odt_value) {
>> +                       case 0:
>> +                               temp = 0x3F;    /* OFF */
>> +                               break;
>> +                       default:
>> +                               temp = 0x00;    /* Auto */
>> +                               break;
>> +                       }
>> +
>> +                       for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) {
>> +                               /* Override: DIFFAMP, ODT */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B0OVRCTL + (bl * DDRIODQ_BL_OFFSET) +
>> +                                       (channel * DDRIODQ_CH_OFFSET)),
>> +                                       (0x00 << 16) | (temp << 10),
>> +                                       (BIT21 | BIT20 | BIT19 | BIT18 |
>> +                                        BIT17 | BIT16 | BIT15 | BIT14 |
>> +                                        BIT13 | BIT12 | BIT11 | BIT10));
>> +
>> +                               /* Override: DIFFAMP, ODT */
>> +                               mrc_alt_write_mask(DDRPHY,
>> +                                       (B1OVRCTL + (bl * DDRIODQ_BL_OFFSET) +
>> +                                       (channel * DDRIODQ_CH_OFFSET)),
>> +                                       (0x00 << 16) | (temp << 10),
>> +                                       (BIT21 | BIT20 | BIT19 | BIT18 |
>> +                                        BIT17 | BIT16 | BIT15 | BIT14 |
>> +                                        BIT13 | BIT12 | BIT11 | BIT10));
>> +                       }
>> +
>> +                       /* Issue ZQCS command */
>> +                       for (rank = 0; rank < NUM_RANKS; rank++) {
>> +                               if (mrc_params->rank_enables & (1 << rank))
>> +                                       dram_init_command(DCMD_ZQCS(rank));
>> +                       }
>> +               }
>> +       }
>> +
>> +       clear_pointers();
>> +
>> +       LEAVEFN();
>> +}
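
For readers following the B0OVRCTL/B1OVRCTL writes above: the call passes a data value and a bit mask, which suggests a read-modify-write of a single field. The sketch below shows that merge step in isolation. apply_mask(), ovrctl_odt_off() and OVRCTL_FIELD_MASK are hypothetical names, and the assumption that mrc_alt_write_mask() merges data this way is not confirmed by anything quoted in this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 10..21, the same set the patch spells out as BIT21 | ... | BIT10 */
#define OVRCTL_FIELD_MASK	(0xFFFu << 10)

/*
 * Hypothetical helper: merge 'data' into 'old' under 'mask'.
 * Assumed (not confirmed) to match what a masked register write does
 * once the current register value has been read back.
 */
static inline uint32_t apply_mask(uint32_t old, uint32_t data, uint32_t mask)
{
	return (old & ~mask) | (data & mask);
}

/* Value that would be written for the "OFF" case (temp = 0x3F) */
static inline uint32_t ovrctl_odt_off(uint32_t old)
{
	uint32_t temp = 0x3F;

	return apply_mask(old, (0x00u << 16) | (temp << 10), OVRCTL_FIELD_MASK);
}

/* Small self-check: bits outside the mask are preserved */
static inline void ovrctl_selfcheck(void)
{
	assert(OVRCTL_FIELD_MASK == 0x003FFC00u);
	assert(ovrctl_odt_off(0xFFFFFFFFu) == 0xFFC0FFFFu);
}
```

Under that assumption, only bits 10..21 change and the rest of the register keeps its previous contents.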
>> +
>> +/*
>> + * Depending on configuration enables ECC support
>> + *
>> + * Available memory size is decreased, and updated with 0s
>> + * in order to clear error status. Address mode 2 forced.
>> + */
>> +void ecc_enable(struct mrc_params *mrc_params)
>> +{
>> +       u32 drp;
>> +       u32 dsch;
>> +       u32 ecc_ctrl;
>> +
>> +       if (mrc_params->ecc_enables == 0)
>> +               return;
>> +
>> +       ENTERFN();
>> +
>> +       /* Configuration required in ECC mode */
>> +       drp = msg_port_read(MEM_CTLR, DRP);
>> +       drp &= ~(BIT14 | BIT15);
>> +       drp |= BIT15;
>> +       drp |= BIT13;
>> +       msg_port_write(MEM_CTLR, DRP, drp);
>> +
>> +       /* Disable new request bypass */
>> +       dsch = msg_port_read(MEM_CTLR, DSCH);
>> +       dsch |= BIT12;
>> +       msg_port_write(MEM_CTLR, DSCH, dsch);
>> +
>> +       /* Enable ECC */
>> +       ecc_ctrl = (BIT0 | BIT1 | BIT17);
>> +       msg_port_write(MEM_CTLR, DECCCTRL, ecc_ctrl);
>> +
>> +       /* Assume 8 bank memory, one bank is gone for ECC */
>> +       mrc_params->mem_size -= mrc_params->mem_size / 8;
>> +
>> +       /* For S3 resume memory content has to be preserved */
>> +       if (mrc_params->boot_mode != BM_S3) {
>> +               select_hte();
>> +               hte_mem_init(mrc_params, MRC_MEM_INIT);
>> +               select_mem_mgr();
>> +       }
>> +
>> +       LEAVEFN();
>> +}
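
As a worked example of the size adjustment in ecc_enable() above: the code subtracts one eighth of mem_size, so a 256 MiB part would report 224 MiB. The helper name ecc_usable_size() below is hypothetical and only restates that one line of arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper restating "mem_size -= mem_size / 8" from the patch:
 * one eighth of the memory is given up for ECC.
 */
static inline uint32_t ecc_usable_size(uint32_t mem_size)
{
	return mem_size - mem_size / 8;
}

/* Small self-check with two common sizes */
static inline void ecc_size_selfcheck(void)
{
	assert(ecc_usable_size(0x10000000u) == 0x0E000000u); /* 256 MiB -> 224 MiB */
	assert(ecc_usable_size(0x08000000u) == 0x07000000u); /* 128 MiB -> 112 MiB */
}
```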
>> +
>> +/*
>> + * Execute memory test
>> + * if error detected it is indicated in mrc_params->status
>> + */
>> +void memory_test(struct mrc_params *mrc_params)
>> +{
>> +       uint32_t result = 0;
>> +
>> +       ENTERFN();
>> +
>> +       select_hte();
>> +       result = hte_mem_init(mrc_params, MRC_MEM_TEST);
>> +       select_mem_mgr();
>> +
>> +       DPF(D_INFO, "Memory test result %x\n", result);
>> +       mrc_params->status = ((result == 0) ? MRC_SUCCESS : MRC_E_MEMTEST);
>> +       LEAVEFN();
>> +}
>> +
>> +/* Lock MCU registers at the end of initialization sequence */
>> +void lock_registers(struct mrc_params *mrc_params)
>> +{
>> +       u32 dco;
>> +
>> +       ENTERFN();
>> +
>> +       dco = msg_port_read(MEM_CTLR, DCO);
>> +       dco &= ~(BIT28 | BIT29);
>> +       dco |= (BIT0 | BIT8);
>> +       msg_port_write(MEM_CTLR, DCO, dco);
>> +
>> +       LEAVEFN();
>> +}
>> diff --git a/arch/x86/cpu/quark/smc.h b/arch/x86/cpu/quark/smc.h
>> new file mode 100644
>> index 0000000..f774cb3
>> --- /dev/null
>> +++ b/arch/x86/cpu/quark/smc.h
>> @@ -0,0 +1,446 @@
>> +/*
>> + * Copyright (C) 2013, Intel Corporation
>> + * Copyright (C) 2015, Bin Meng <bmeng.cn@gmail.com>
>> + *
>> + * Ported from Intel released Quark UEFI BIOS
>> + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei/

Removed the trailing / in v2.

>> + *
>> + * SPDX-License-Identifier:    Intel
>> + */
>> +
>> +#ifndef _SMC_H_
>> +#define _SMC_H_
>> +
>> +/* System Memory Controller Register Defines */
>> +
>> +/* Memory Controller Message Bus Registers Offsets */
>> +#define DRP                    0x00
>> +#define DTR0                   0x01
>> +#define DTR1                   0x02
>> +#define DTR2                   0x03
>> +#define DTR3                   0x04
>> +#define DTR4                   0x05
>> +#define DPMC0                  0x06
>> +#define DPMC1                  0x07
>> +#define DRFC                   0x08
>> +#define DSCH                   0x09
>> +#define DCAL                   0x0A
>> +#define DRMC                   0x0B
>> +#define PMSTS                  0x0C
>> +#define DCO                    0x0F
>> +#define DSTAT                  0x20
>> +#define SSKPD0                 0x4A
>> +#define SSKPD1                 0x4B
>> +#define DECCCTRL               0x60
>> +#define DECCSTAT               0x61
>> +#define DECCSBECNT             0x62
>> +#define DECCSBECA              0x68
>> +#define DECCSBECS              0x69
>> +#define DECCDBECA              0x6A
>> +#define DECCDBECS              0x6B
>> +#define DFUSESTAT              0x70
>> +#define SCRMSEED               0x80
>> +#define SCRMLO                 0x81
>> +#define SCRMHI                 0x82
>> +
>> +/* DRAM init command */
>> +#define DCMD_MRS1(rnk, dat)    (0 | ((rnk) << 22) | (1 << 3) | ((dat) << 6))
>> +#define DCMD_REF(rnk)          (1 | ((rnk) << 22))
>> +#define DCMD_PRE(rnk)          (2 | ((rnk) << 22))
>> +#define DCMD_PREA(rnk)         (2 | ((rnk) << 22) | (BIT10 << 6))
>> +#define DCMD_ACT(rnk, row)     (3 | ((rnk) << 22) | ((row) << 6))
>> +#define DCMD_WR(rnk, col)      (4 | ((rnk) << 22) | ((col) << 6))
>> +#define DCMD_RD(rnk, col)      (5 | ((rnk) << 22) | ((col) << 6))
>> +#define DCMD_ZQCS(rnk)         (6 | ((rnk) << 22))
>> +#define DCMD_ZQCL(rnk)         (6 | ((rnk) << 22) | (BIT10 << 6))
>> +#define DCMD_NOP(rnk)          (7 | ((rnk) << 22))
>
> We should have a #define for the 22 and a #define for the 6, and
> probably an enum for the 0, 1, 2, .. 7.

Actually, I have no idea about this DCMD macro group. The macros are
undocumented in the Intel datasheet, though I suspect they come from
the JEDEC DDR spec. I did not change this in v2.

> Then the C code should ideally do:
>
> ENUM_NAME | (rnk << DCMD_XXX_SHIFT) | (col << DCMD_SHIFT)
>
> instead of
>
> DCMD_RD(rnk, col)
>
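
For illustration, the named-constant form suggested above might look roughly like the sketch below. DCMD_RANK_SHIFT, DCMD_ADDR_SHIFT, enum dcmd_opcode and dcmd_encode() are hypothetical names, not taken from the patch or the datasheet; only the numeric values (22, 6, and opcodes 0..7) come from the quoted macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values are copied from the DCMD_* macros */
#define DCMD_RANK_SHIFT	22	/* position of the rank argument */
#define DCMD_ADDR_SHIFT	6	/* position of the row/column/data argument */

/* Opcodes 0..7 in the order the patch uses them */
enum dcmd_opcode {
	DCMD_OP_MRS = 0,
	DCMD_OP_REF,
	DCMD_OP_PRE,
	DCMD_OP_ACT,
	DCMD_OP_WR,
	DCMD_OP_RD,
	DCMD_OP_ZQC,
	DCMD_OP_NOP,
};

/* Build a command word from an opcode, a rank and an address-like field */
static inline uint32_t dcmd_encode(enum dcmd_opcode op, uint32_t rank,
				   uint32_t addr)
{
	return (uint32_t)op | (rank << DCMD_RANK_SHIFT) |
	       (addr << DCMD_ADDR_SHIFT);
}

/* Self-check against the values the original macros would produce */
static inline void dcmd_selfcheck(void)
{
	/* DCMD_RD(1, 3) == 5 | (1 << 22) | (3 << 6) */
	assert(dcmd_encode(DCMD_OP_RD, 1, 3) == 0x004000C5u);
	/* DCMD_ZQCS(1) == 6 | (1 << 22) */
	assert(dcmd_encode(DCMD_OP_ZQC, 1, 0) == 0x00400006u);
}
```

Whether a helper like this or the existing per-command macros reads better at the call sites is a judgment call; the sketch only shows that the two encodings agree.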
>> +
>> +#define DDR3_EMRS1_DIC_40      (0)
>> +#define DDR3_EMRS1_DIC_34      (1)
>> +
>> +#define DDR3_EMRS1_RTTNOM_0    (0)
>> +#define DDR3_EMRS1_RTTNOM_60   (BIT2)
>> +#define DDR3_EMRS1_RTTNOM_120  (BIT6)
>> +#define DDR3_EMRS1_RTTNOM_40   (BIT6 | BIT2)
>> +#define DDR3_EMRS1_RTTNOM_20   (BIT9)
>> +#define DDR3_EMRS1_RTTNOM_30   (BIT9 | BIT2)
>
> Let's write out the value here

Fixed
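
One possible way to write the values out, as a sketch only: the exact form used in v2 is not shown in this mail, so the shift expressions below are an assumption. They are numerically identical to the BITn forms quoted above.

```c
#include <assert.h>

/* Same names as the patch, with the BITn macros expanded to shifts */
#define DDR3_EMRS1_RTTNOM_0	(0)
#define DDR3_EMRS1_RTTNOM_60	(1 << 2)
#define DDR3_EMRS1_RTTNOM_120	(1 << 6)
#define DDR3_EMRS1_RTTNOM_40	((1 << 6) | (1 << 2))
#define DDR3_EMRS1_RTTNOM_20	(1 << 9)
#define DDR3_EMRS1_RTTNOM_30	((1 << 9) | (1 << 2))

/* Self-check: the expanded forms give the expected bit patterns */
static inline void rttnom_selfcheck(void)
{
	assert(DDR3_EMRS1_RTTNOM_60 == 0x004);
	assert(DDR3_EMRS1_RTTNOM_40 == 0x044);
	assert(DDR3_EMRS1_RTTNOM_30 == 0x204);
}
```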

>> +
>> +#define DDR3_EMRS2_RTTWR_60    (BIT9)
>
> (1 << 9)

Fixed

>> +#define DDR3_EMRS2_RTTWR_120   (BIT10)
>
> (1 << 10)

Fixed

>> +
>> +/* BEGIN DDRIO Registers */
>> +
>> +/* DDR IOs & COMPs */
>> +#define DDRIODQ_BL_OFFSET      0x0800
>> +#define DDRIODQ_CH_OFFSET      ((NUM_BYTE_LANES / 2) * DDRIODQ_BL_OFFSET)
>> +#define DDRIOCCC_CH_OFFSET     0x0800
>> +#define DDRCOMP_CH_OFFSET      0x0100
>> +
>> +/* CH0-BL01-DQ */
>> +#define DQOBSCKEBBCTL          0x0000
>
> Are these accessed through the msg_port? If not, we could use a struct.

They are all accessed via the msg port, and they are undocumented!
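
To make the point concrete: since these are message-bus offsets rather than memory-mapped addresses, each register instance is reached by plain offset arithmetic, as the B0OVRCTL/B1OVRCTL accesses in smc.c do. The sketch below restates that arithmetic. dq_reg() is a hypothetical helper, and NUM_BYTE_LANES = 4 is an assumption because its definition is not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BYTE_LANES		4	/* assumption: defined elsewhere */
#define DDRIODQ_BL_OFFSET	0x0800
#define DDRIODQ_CH_OFFSET	((NUM_BYTE_LANES / 2) * DDRIODQ_BL_OFFSET)
#define B0OVRCTL		0x0094
#define B1OVRCTL		0x0098

/* Hypothetical helper: message-bus offset of one DQ register instance */
static inline uint32_t dq_reg(uint32_t base, uint32_t bl_pair,
			      uint32_t channel)
{
	return base + bl_pair * DDRIODQ_BL_OFFSET +
	       channel * DDRIODQ_CH_OFFSET;
}

/* Self-check for a few instances under the assumption above */
static inline void dq_reg_selfcheck(void)
{
	assert(dq_reg(B0OVRCTL, 0, 0) == 0x0094);
	assert(dq_reg(B0OVRCTL, 1, 0) == 0x0894);
	assert(dq_reg(B1OVRCTL, 0, 1) == 0x1098);
}
```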

>> +#define DQDLLTXCTL             0x0004
>> +#define DQDLLRXCTL             0x0008
>> +#define DQMDLLCTL              0x000C
>> +#define B0RXIOBUFCTL           0x0010
>> +#define B0VREFCTL              0x0014
>> +#define B0RXOFFSET1            0x0018
>> +#define B0RXOFFSET0            0x001C
>> +#define B1RXIOBUFCTL           0x0020
>> +#define B1VREFCTL              0x0024
>> +#define B1RXOFFSET1            0x0028
>> +#define B1RXOFFSET0            0x002C
>> +#define DQDFTCTL               0x0030
>> +#define DQTRAINSTS             0x0034
>> +#define B1DLLPICODER0          0x0038
>> +#define B0DLLPICODER0          0x003C
>> +#define B1DLLPICODER1          0x0040
>> +#define B0DLLPICODER1          0x0044
>> +#define B1DLLPICODER2          0x0048
>> +#define B0DLLPICODER2          0x004C
>> +#define B1DLLPICODER3          0x0050
>> +#define B0DLLPICODER3          0x0054
>> +#define B1RXDQSPICODE          0x0058
>> +#define B0RXDQSPICODE          0x005C
>> +#define B1RXDQPICODER32                0x0060
>> +#define B1RXDQPICODER10                0x0064
>> +#define B0RXDQPICODER32                0x0068
>> +#define B0RXDQPICODER10                0x006C
>> +#define B01PTRCTL0             0x0070
>> +#define B01PTRCTL1             0x0074
>> +#define B01DBCTL0              0x0078
>> +#define B01DBCTL1              0x007C
>> +#define B0LATCTL0              0x0080
>> +#define B1LATCTL0              0x0084
>> +#define B01LATCTL1             0x0088
>> +#define B0ONDURCTL             0x008C
>> +#define B1ONDURCTL             0x0090
>> +#define B0OVRCTL               0x0094
>> +#define B1OVRCTL               0x0098
>> +#define DQCTL                  0x009C
>> +#define B0RK2RKCHGPTRCTRL      0x00A0
>> +#define B1RK2RKCHGPTRCTRL      0x00A4
>> +#define DQRK2RKCTL             0x00A8
>> +#define DQRK2RKPTRCTL          0x00AC
>> +#define B0RK2RKLAT             0x00B0
>> +#define B1RK2RKLAT             0x00B4
>> +#define DQCLKALIGNREG0         0x00B8
>> +#define DQCLKALIGNREG1         0x00BC
>> +#define DQCLKALIGNREG2         0x00C0
>> +#define DQCLKALIGNSTS0         0x00C4
>> +#define DQCLKALIGNSTS1         0x00C8
>> +#define DQCLKGATE              0x00CC
>> +#define B0COMPSLV1             0x00D0
>> +#define B1COMPSLV1             0x00D4
>> +#define B0COMPSLV2             0x00D8
>> +#define B1COMPSLV2             0x00DC
>> +#define B0COMPSLV3             0x00E0
>> +#define B1COMPSLV3             0x00E4
>> +#define DQVISALANECR0TOP       0x00E8
>> +#define DQVISALANECR1TOP       0x00EC
>> +#define DQVISACONTROLCRTOP     0x00F0
>> +#define DQVISALANECR0BL                0x00F4
>> +#define DQVISALANECR1BL                0x00F8
>> +#define DQVISACONTROLCRBL      0x00FC
>> +#define DQTIMINGCTRL           0x010C
>> +
>> +/* CH0-ECC */
>> +#define ECCDLLTXCTL            0x2004
>> +#define ECCDLLRXCTL            0x2008
>> +#define ECCMDLLCTL             0x200C
>> +#define ECCB1DLLPICODER0       0x2038
>> +#define ECCB1DLLPICODER1       0x2040
>> +#define ECCB1DLLPICODER2       0x2048
>> +#define ECCB1DLLPICODER3       0x2050
>> +#define ECCB01DBCTL0           0x2078
>> +#define ECCB01DBCTL1           0x207C
>> +#define ECCCLKALIGNREG0                0x20B8
>> +#define ECCCLKALIGNREG1                0x20BC
>> +#define ECCCLKALIGNREG2                0x20C0
>> +
>> +/* CH0-CMD */
>> +#define CMDOBSCKEBBCTL         0x4800
>> +#define CMDDLLTXCTL            0x4808
>> +#define CMDDLLRXCTL            0x480C
>> +#define CMDMDLLCTL             0x4810
>> +#define CMDRCOMPODT            0x4814
>> +#define CMDDLLPICODER0         0x4820
>> +#define CMDDLLPICODER1         0x4824
>> +#define CMDCFGREG0             0x4840
>> +#define CMDPTRREG              0x4844
>> +#define CMDCLKALIGNREG0                0x4850
>> +#define CMDCLKALIGNREG1                0x4854
>> +#define CMDCLKALIGNREG2                0x4858
>> +#define CMDPMCONFIG0           0x485C
>> +#define CMDPMDLYREG0           0x4860
>> +#define CMDPMDLYREG1           0x4864
>> +#define CMDPMDLYREG2           0x4868
>> +#define CMDPMDLYREG3           0x486C
>> +#define CMDPMDLYREG4           0x4870
>> +#define CMDCLKALIGNSTS0                0x4874
>> +#define CMDCLKALIGNSTS1                0x4878
>> +#define CMDPMSTS0              0x487C
>> +#define CMDPMSTS1              0x4880
>> +#define CMDCOMPSLV             0x4884
>> +#define CMDBONUS0              0x488C
>> +#define CMDBONUS1              0x4890
>> +#define CMDVISALANECR0         0x4894
>> +#define CMDVISALANECR1         0x4898
>> +#define CMDVISACONTROLCR       0x489C
>> +#define CMDCLKGATE             0x48A0
>> +#define CMDTIMINGCTRL          0x48A4
>> +
>> +/* CH0-CLK-CTL */
>> +#define CCOBSCKEBBCTL          0x5800
>> +#define CCRCOMPIO              0x5804
>> +#define CCDLLTXCTL             0x5808
>> +#define CCDLLRXCTL             0x580C
>> +#define CCMDLLCTL              0x5810
>> +#define CCRCOMPODT             0x5814
>> +#define CCDLLPICODER0          0x5820
>> +#define CCDLLPICODER1          0x5824
>> +#define CCDDR3RESETCTL         0x5830
>> +#define CCCFGREG0              0x5838
>> +#define CCCFGREG1              0x5840
>> +#define CCPTRREG               0x5844
>> +#define CCCLKALIGNREG0         0x5850
>> +#define CCCLKALIGNREG1         0x5854
>> +#define CCCLKALIGNREG2         0x5858
>> +#define CCPMCONFIG0            0x585C
>> +#define CCPMDLYREG0            0x5860
>> +#define CCPMDLYREG1            0x5864
>> +#define CCPMDLYREG2            0x5868
>> +#define CCPMDLYREG3            0x586C
>> +#define CCPMDLYREG4            0x5870
>> +#define CCCLKALIGNSTS0         0x5874
>> +#define CCCLKALIGNSTS1         0x5878
>> +#define CCPMSTS0               0x587C
>> +#define CCPMSTS1               0x5880
>> +#define CCCOMPSLV1             0x5884
>> +#define CCCOMPSLV2             0x5888
>> +#define CCCOMPSLV3             0x588C
>> +#define CCBONUS0               0x5894
>> +#define CCBONUS1               0x5898
>> +#define CCVISALANECR0          0x589C
>> +#define CCVISALANECR1          0x58A0
>> +#define CCVISACONTROLCR                0x58A4
>> +#define CCCLKGATE              0x58A8
>> +#define CCTIMINGCTL            0x58AC
>> +
>> +/* COMP */
>> +#define CMPCTRL                        0x6800
>> +#define SOFTRSTCNTL            0x6804
>> +#define MSCNTR                 0x6808
>> +#define NMSCNTRL               0x680C
>> +#define LATCH1CTL              0x6814
>> +#define COMPVISALANECR0                0x681C
>> +#define COMPVISALANECR1                0x6820
>> +#define COMPVISACONTROLCR      0x6824
>> +#define COMPBONUS0             0x6830
>> +#define TCOCNTCTRL             0x683C
>> +#define DQANAODTPUCTL          0x6840
>> +#define DQANAODTPDCTL          0x6844
>> +#define DQANADRVPUCTL          0x6848
>> +#define DQANADRVPDCTL          0x684C
>> +#define DQANADLYPUCTL          0x6850
>> +#define DQANADLYPDCTL          0x6854
>> +#define DQANATCOPUCTL          0x6858
>> +#define DQANATCOPDCTL          0x685C
>> +#define CMDANADRVPUCTL         0x6868
>> +#define CMDANADRVPDCTL         0x686C
>> +#define CMDANADLYPUCTL         0x6870
>> +#define CMDANADLYPDCTL         0x6874
>> +#define CLKANAODTPUCTL         0x6880
>> +#define CLKANAODTPDCTL         0x6884
>> +#define CLKANADRVPUCTL         0x6888
>> +#define CLKANADRVPDCTL         0x688C
>> +#define CLKANADLYPUCTL         0x6890
>> +#define CLKANADLYPDCTL         0x6894
>> +#define CLKANATCOPUCTL         0x6898
>> +#define CLKANATCOPDCTL         0x689C
>> +#define DQSANAODTPUCTL         0x68A0
>> +#define DQSANAODTPDCTL         0x68A4
>> +#define DQSANADRVPUCTL         0x68A8
>> +#define DQSANADRVPDCTL         0x68AC
>> +#define DQSANADLYPUCTL         0x68B0
>> +#define DQSANADLYPDCTL         0x68B4
>> +#define DQSANATCOPUCTL         0x68B8
>> +#define DQSANATCOPDCTL         0x68BC
>> +#define CTLANADRVPUCTL         0x68C8
>> +#define CTLANADRVPDCTL         0x68CC
>> +#define CTLANADLYPUCTL         0x68D0
>> +#define CTLANADLYPDCTL         0x68D4
>> +#define CHNLBUFSTATIC          0x68F0
>> +#define COMPOBSCNTRL           0x68F4
>> +#define COMPBUFFDBG0           0x68F8
>> +#define COMPBUFFDBG1           0x68FC
>> +#define CFGMISCCH0             0x6900
>> +#define COMPEN0CH0             0x6904
>> +#define COMPEN1CH0             0x6908
>> +#define COMPEN2CH0             0x690C
>> +#define STATLEGEN0CH0          0x6910
>> +#define STATLEGEN1CH0          0x6914
>> +#define DQVREFCH0              0x6918
>> +#define CMDVREFCH0             0x691C
>> +#define CLKVREFCH0             0x6920
>> +#define DQSVREFCH0             0x6924
>> +#define CTLVREFCH0             0x6928
>> +#define TCOVREFCH0             0x692C
>> +#define DLYSELCH0              0x6930
>> +#define TCODRAMBUFODTCH0       0x6934
>> +#define CCBUFODTCH0            0x6938
>> +#define RXOFFSETCH0            0x693C
>> +#define DQODTPUCTLCH0          0x6940
>> +#define DQODTPDCTLCH0          0x6944
>> +#define DQDRVPUCTLCH0          0x6948
>> +#define DQDRVPDCTLCH0          0x694C
>> +#define DQDLYPUCTLCH0          0x6950
>> +#define DQDLYPDCTLCH0          0x6954
>> +#define DQTCOPUCTLCH0          0x6958
>> +#define DQTCOPDCTLCH0          0x695C
>> +#define CMDDRVPUCTLCH0         0x6968
>> +#define CMDDRVPDCTLCH0         0x696C
>> +#define CMDDLYPUCTLCH0         0x6970
>> +#define CMDDLYPDCTLCH0         0x6974
>> +#define CLKODTPUCTLCH0         0x6980
>> +#define CLKODTPDCTLCH0         0x6984
>> +#define CLKDRVPUCTLCH0         0x6988
>> +#define CLKDRVPDCTLCH0         0x698C
>> +#define CLKDLYPUCTLCH0         0x6990
>> +#define CLKDLYPDCTLCH0         0x6994
>> +#define CLKTCOPUCTLCH0         0x6998
>> +#define CLKTCOPDCTLCH0         0x699C
>> +#define DQSODTPUCTLCH0         0x69A0
>> +#define DQSODTPDCTLCH0         0x69A4
>> +#define DQSDRVPUCTLCH0         0x69A8
>> +#define DQSDRVPDCTLCH0         0x69AC
>> +#define DQSDLYPUCTLCH0         0x69B0
>> +#define DQSDLYPDCTLCH0         0x69B4
>> +#define DQSTCOPUCTLCH0         0x69B8
>> +#define DQSTCOPDCTLCH0         0x69BC
>> +#define CTLDRVPUCTLCH0         0x69C8
>> +#define CTLDRVPDCTLCH0         0x69CC
>> +#define CTLDLYPUCTLCH0         0x69D0
>> +#define CTLDLYPDCTLCH0         0x69D4
>> +#define FNLUPDTCTLCH0          0x69F0
>> +
>> +/* PLL */
>> +#define MPLLCTRL0              0x7800
>> +#define MPLLCTRL1              0x7808
>> +#define MPLLCSR0               0x7810
>> +#define MPLLCSR1               0x7814
>> +#define MPLLCSR2               0x7820
>> +#define MPLLDFT                        0x7828
>> +#define MPLLMON0CTL            0x7830
>> +#define MPLLMON1CTL            0x7838
>> +#define MPLLMON2CTL            0x783C
>> +#define SFRTRIM                        0x7850
>> +#define MPLLDFTOUT0            0x7858
>> +#define MPLLDFTOUT1            0x785C
>> +#define MASTERRSTN             0x7880
>> +#define PLLLOCKDEL             0x7884
>> +#define SFRDEL                 0x7888
>> +#define CRUVISALANECR0         0x78F0
>> +#define CRUVISALANECR1         0x78F4
>> +#define CRUVISACONTROLCR       0x78F8
>> +#define IOSFVISALANECR0                0x78FC
>> +#define IOSFVISALANECR1                0x7900
>> +#define IOSFVISACONTROLCR      0x7904
>> +
>> +/* END DDRIO Registers */
>> +
>> +/* DRAM Specific Message Bus OpCodes */
>> +#define MSG_OP_DRAM_INIT       0x68
>> +#define MSG_OP_DRAM_WAKE       0xCA
>> +
>> +#define SAMPLE_SIZE            6
>> +
>> +/* must be less than this number to enable early deadband */
>> +#define EARLY_DB               0x12
>> +/* must be greater than this number to enable late deadband */
>> +#define LATE_DB                        0x34
>> +
>> +#define CHX_REGS               (11 * 4)
>> +#define FULL_CLK               128
>> +#define HALF_CLK               64
>> +#define QRTR_CLK               32
>> +
>> +#define MCEIL(num, den)                ((uint8_t)((num + den - 1) / den))
>> +#define MMAX(a, b)             ((a) > (b) ? (a) : (b))
>> +#define DEAD_LOOP()            for (;;);
>> +
>> +#define MIN_RDQS_EYE           10      /* in PI Codes */
>> +#define MIN_VREF_EYE           10      /* in VREF Codes */
>> +/* how many RDQS codes to jump while margining */
>> +#define RDQS_STEP              1
>> +/* how many VREF codes to jump while margining */
>> +#define VREF_STEP              1
>> +/* offset into "vref_codes[]" for minimum allowed VREF setting */
>> +#define VREF_MIN               0x00
>> +/* offset into "vref_codes[]" for maximum allowed VREF setting */
>> +#define VREF_MAX               0x3F
>> +#define RDQS_MIN               0x00    /* minimum RDQS delay value */
>> +#define RDQS_MAX               0x3F    /* maximum RDQS delay value */
>> +
>> +/* how many WDQ codes to jump while margining */
>> +#define WDQ_STEP               1
>> +
>> +enum {
>> +       B,      /* BOTTOM VREF */
>> +       T       /* TOP VREF */
>> +};
>> +
>> +enum {
>> +       L,      /* LEFT RDQS */
>> +       R       /* RIGHT RDQS */
>> +};
>> +
>> +/* Memory Options */
>> +
>> +/* enable STATIC timing settings for RCVN (BACKUP_MODE) */
>> +#undef BACKUP_RCVN
>> +/* enable STATIC timing settings for WDQS (BACKUP_MODE) */
>> +#undef BACKUP_WDQS
>> +/* enable STATIC timing settings for RDQS (BACKUP_MODE) */
>> +#undef BACKUP_RDQS
>> +/* enable STATIC timing settings for WDQ (BACKUP_MODE) */
>> +#undef BACKUP_WDQ
>> +/* enable *COMP overrides (BACKUP_MODE) */
>> +#undef BACKUP_COMPS
>> +/* enable the RD_TRAIN eye check */
>> +#undef RX_EYE_CHECK
>> +
>> +/* enable Host to Memory Clock Alignment */
>> +#define HMC_TEST
>> +/* enable multi-rank support via rank2rank sharing */
>> +#define R2R_SHARING
>> +/* disable signals not used in 16bit mode of DDRIO */
>> +#define FORCE_16BIT_DDRIO
>> +
>> +#define PLATFORM_ID            1
>> +
>> +void clear_self_refresh(struct mrc_params *mrc_params);
>> +void prog_ddr_timing_control(struct mrc_params *mrc_params);
>> +void prog_decode_before_jedec(struct mrc_params *mrc_params);
>> +void perform_ddr_reset(struct mrc_params *mrc_params);
>> +void ddrphy_init(struct mrc_params *mrc_params);
>> +void perform_jedec_init(struct mrc_params *mrc_params);
>> +void set_ddr_init_complete(struct mrc_params *mrc_params);
>> +void restore_timings(struct mrc_params *mrc_params);
>> +void default_timings(struct mrc_params *mrc_params);
>> +void rcvn_cal(struct mrc_params *mrc_params);
>> +void wr_level(struct mrc_params *mrc_params);
>> +void prog_page_ctrl(struct mrc_params *mrc_params);
>> +void rd_train(struct mrc_params *mrc_params);
>> +void wr_train(struct mrc_params *mrc_params);
>> +void store_timings(struct mrc_params *mrc_params);
>> +void enable_scrambling(struct mrc_params *mrc_params);
>> +void prog_ddr_control(struct mrc_params *mrc_params);
>> +void prog_dra_drb(struct mrc_params *mrc_params);
>> +void perform_wake(struct mrc_params *mrc_params);
>> +void change_refresh_period(struct mrc_params *mrc_params);
>> +void set_auto_refresh(struct mrc_params *mrc_params);
>> +void ecc_enable(struct mrc_params *mrc_params);
>> +void memory_test(struct mrc_params *mrc_params);
>> +void lock_registers(struct mrc_params *mrc_params);
>
> Function comments should go here.

They are only used internally in MRC.

>> +
>> +#endif /* _SMC_H_ */
>> --
>> 1.8.2.1
>>
>
> Regards,
> Simon

Regards,
Bin

* [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build
  2015-02-05  6:58       ` Bin Meng
@ 2015-02-06  0:18         ` Albert ARIBAUD
  2015-02-06  1:00           ` Bin Meng
  0 siblings, 1 reply; 29+ messages in thread
From: Albert ARIBAUD @ 2015-02-06  0:18 UTC (permalink / raw)
  To: u-boot

Hello Bin,

On Thu, 5 Feb 2015 14:58:35 +0800, Bin Meng <bmeng.cn@gmail.com> wrote:
> Hi Simon,
> 
> On Thu, Feb 5, 2015 at 6:35 AM, Bin Meng <bmeng.cn@gmail.com> wrote:
> > Hi Simon,
> >
> > On Thu, Feb 5, 2015 at 12:25 AM, Simon Glass <sjg@chromium.org> wrote:
> >> Hi Bin,
> >>
> >> On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> >>> Turn on the Memory Reference code build in the quark Makefile.
> >>>
> >>> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
> >>> ---
> >>>
> >>>  arch/x86/cpu/quark/Makefile | 1 +
> >>>  1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/arch/x86/cpu/quark/Makefile b/arch/x86/cpu/quark/Makefile
> >>> index 168c1e6..e87b424 100644
> >>> --- a/arch/x86/cpu/quark/Makefile
> >>> +++ b/arch/x86/cpu/quark/Makefile
> >>> @@ -5,4 +5,5 @@
> >>>  #
> >>>
> >>>  obj-y += car.o dram.o msg_port.o quark.o
> >>> +obj-y += mrc.o mrc_util.o hte.o smc.o
> >>>  obj-$(CONFIG_PCI) += pci.o
> >>
> >> Would prefer that you do this as you add each file (i.e. in the patch
> >> that adds the file).
> >>
> >
> > OK, will squash this one to previous commits.
> 
> Sorry, I replied too fast. It looks like I cannot add each file to the
> Makefile as it is introduced, because nothing will build until the 3rd
> patch is in place to provide all the header files needed.

Can't you reorder the patches so that things build properly at each
addition? If you can't, then there is a cross-dependency, and the
cross-dependent patches (and only these!) should be made into a single
patch.

> Regards,
> Bin

Amicalement,
-- 
Albert.

* [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build
  2015-02-06  0:18         ` Albert ARIBAUD
@ 2015-02-06  1:00           ` Bin Meng
  2015-02-06  5:23             ` Simon Glass
  0 siblings, 1 reply; 29+ messages in thread
From: Bin Meng @ 2015-02-06  1:00 UTC (permalink / raw)
  To: u-boot

Hi Albert,

On Fri, Feb 6, 2015 at 8:18 AM, Albert ARIBAUD
<albert.u.boot@aribaud.net> wrote:
> Hello Bin,
>
> On Thu, 5 Feb 2015 14:58:35 +0800, Bin Meng <bmeng.cn@gmail.com> wrote:
>> Hi Simon,
>>
>> On Thu, Feb 5, 2015 at 6:35 AM, Bin Meng <bmeng.cn@gmail.com> wrote:
>> > Hi Simon,
>> >
>> > On Thu, Feb 5, 2015 at 12:25 AM, Simon Glass <sjg@chromium.org> wrote:
>> >> Hi Bin,
>> >>
>> >> On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
>> >>> Turn on the Memory Reference code build in the quark Makefile.
>> >>>
>> >>> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>> >>> ---
>> >>>
>> >>>  arch/x86/cpu/quark/Makefile | 1 +
>> >>>  1 file changed, 1 insertion(+)
>> >>>
>> >>> diff --git a/arch/x86/cpu/quark/Makefile b/arch/x86/cpu/quark/Makefile
>> >>> index 168c1e6..e87b424 100644
>> >>> --- a/arch/x86/cpu/quark/Makefile
>> >>> +++ b/arch/x86/cpu/quark/Makefile
>> >>> @@ -5,4 +5,5 @@
>> >>>  #
>> >>>
>> >>>  obj-y += car.o dram.o msg_port.o quark.o
>> >>> +obj-y += mrc.o mrc_util.o hte.o smc.o
>> >>>  obj-$(CONFIG_PCI) += pci.o
>> >>
>> >> Would prefer that you do this as you add each file (i.e. in the patch
>> >> that adds the file).
>> >>
>> >
>> > OK, will squash this one to previous commits.
>>
>> Sorry, I replied too fast. It looks like I cannot add each file to the
>> Makefile as it is introduced, because nothing will build until the 3rd
>> patch is in place to provide all the header files needed.
>
> Can't you reorder the patches so that things build properly at each
> addition? If you can't, then there is a cross-dependency, and the
> cross-dependent patches (and only these!) should be made into a single
> patch.
>

I originally wanted to put them all together in a single patch, but
there is a 100KB email limit on this mailing list. So each patch was
created as a <xxx.c> with its header <xxx.h>. Yes, I could put all the
header files into one patch, then add each <xxx.c> in its own patch and
enable its build there. Anyway, as it stands this patch series does not
break bisectability.

Regards,
Bin

* [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build
  2015-02-06  1:00           ` Bin Meng
@ 2015-02-06  5:23             ` Simon Glass
  2015-02-06 20:27               ` Simon Glass
  0 siblings, 1 reply; 29+ messages in thread
From: Simon Glass @ 2015-02-06  5:23 UTC (permalink / raw)
  To: u-boot

On 5 February 2015 at 18:00, Bin Meng <bmeng.cn@gmail.com> wrote:
> Hi Albert,
>
> On Fri, Feb 6, 2015 at 8:18 AM, Albert ARIBAUD
> <albert.u.boot@aribaud.net> wrote:
>> Hello Bin,
>>
>> On Thu, 5 Feb 2015 14:58:35 +0800, Bin Meng <bmeng.cn@gmail.com> wrote:
>>> Hi Simon,
>>>
>>> On Thu, Feb 5, 2015 at 6:35 AM, Bin Meng <bmeng.cn@gmail.com> wrote:
>>> > Hi Simon,
>>> >
>>> > On Thu, Feb 5, 2015 at 12:25 AM, Simon Glass <sjg@chromium.org> wrote:
>>> >> Hi Bin,
>>> >>
>>> >> On 3 February 2015 at 04:45, Bin Meng <bmeng.cn@gmail.com> wrote:
>>> >>> Turn on the Memory Reference code build in the quark Makefile.
>>> >>>
>>> >>> Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
>>> >>> ---
>>> >>>
>>> >>>  arch/x86/cpu/quark/Makefile | 1 +
>>> >>>  1 file changed, 1 insertion(+)
>>> >>>
>>> >>> diff --git a/arch/x86/cpu/quark/Makefile b/arch/x86/cpu/quark/Makefile
>>> >>> index 168c1e6..e87b424 100644
>>> >>> --- a/arch/x86/cpu/quark/Makefile
>>> >>> +++ b/arch/x86/cpu/quark/Makefile
>>> >>> @@ -5,4 +5,5 @@
>>> >>>  #
>>> >>>
>>> >>>  obj-y += car.o dram.o msg_port.o quark.o
>>> >>> +obj-y += mrc.o mrc_util.o hte.o smc.o
>>> >>>  obj-$(CONFIG_PCI) += pci.o
>>> >>
>>> >> Would prefer that you do this as you add each file (i.e. in the patch
>>> >> that adds the file).
>>> >>
>>> >
>>> > OK, will squash this one to previous commits.
>>>
>>> Sorry, I replied too fast. It looks like I cannot add each file to the
>>> Makefile as it is introduced, because nothing will build until the 3rd
>>> patch is in place to provide all the header files needed.
>>
>> Can't you reorder the patches so that things build properly at each
>> addition? If you can't, then there is a cross-dependency, and the
>> cross-dependent patches (and only these!) should be made into a single
>> patch.
>>
>
> I originally wanted to put them all together in a single patch, but
> there is a 100KB email limit on this mailing list. So each patch was
> created as a <xxx.c> with its header <xxx.h>. Yes, I could put all the
> header files into one patch, then add each <xxx.c> in its own patch and
> enable its build there. Anyway, as it stands this patch series does not
> break bisectability.

I'm happy enough with this.

Acked-by: Simon Glass <sjg@chromium.org>

* [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build
  2015-02-06  5:23             ` Simon Glass
@ 2015-02-06 20:27               ` Simon Glass
  0 siblings, 0 replies; 29+ messages in thread
From: Simon Glass @ 2015-02-06 20:27 UTC (permalink / raw)
  To: u-boot

[snip]
> I'm happy enough with this.
>
> Acked-by: Simon Glass <sjg@chromium.org>

Applied to u-boot-x86, thanks!

Thread overview: 29+ messages
2015-02-03 11:45 [U-Boot] [RFC PATCH 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support Bin Meng
2015-02-03 11:45 ` [U-Boot] [RFC PATCH 1/9] x86: Allow overriding TSC_FREQ_IN_MHZ Bin Meng
2015-02-04 16:24   ` Simon Glass
2015-02-03 11:45 ` [U-Boot] [RFC PATCH 2/9] x86: quark: Bypass TSC calibration Bin Meng
2015-02-04 16:24   ` Simon Glass
2015-02-03 11:45 ` [U-Boot] [RFC PATCH 3/9] x86: quark: Add Memory Reference Code (MRC) main routines Bin Meng
2015-02-04 16:24   ` Simon Glass
2015-02-05  8:45     ` Bin Meng
2015-02-03 11:45 ` [U-Boot] [RFC PATCH 4/9] x86: quark: Add utility codes needed for MRC Bin Meng
2015-02-04 16:24   ` Simon Glass
2015-02-05 14:25     ` Bin Meng
2015-02-03 11:45 ` [U-Boot] [RFC PATCH 5/9] x86: quark: Add System Memory Controller support Bin Meng
2015-02-04 16:24   ` Simon Glass
2015-02-05 15:17     ` Bin Meng
2015-02-03 11:45 ` [U-Boot] [RFC PATCH 6/9] x86: quark: Enable the Memory Reference Code build Bin Meng
2015-02-04 16:25   ` Simon Glass
2015-02-04 22:35     ` Bin Meng
2015-02-05  6:58       ` Bin Meng
2015-02-06  0:18         ` Albert ARIBAUD
2015-02-06  1:00           ` Bin Meng
2015-02-06  5:23             ` Simon Glass
2015-02-06 20:27               ` Simon Glass
2015-02-03 11:45 ` [U-Boot] [RFC PATCH 7/9] fdtdec: Add compatible id and string for Intel Quark MRC Bin Meng
2015-02-04 16:25   ` Simon Glass
2015-02-03 11:45 ` [U-Boot] [RFC PATCH 8/9] dt-bindings: Add Intel Quark MRC bindings Bin Meng
2015-02-04 16:25   ` Simon Glass
2015-02-03 11:45 ` [U-Boot] [RFC PATCH 9/9] x86: quark: Call MRC in dram_init() Bin Meng
2015-02-04 16:25   ` Simon Glass
2015-02-04 22:54     ` Bin Meng
