* [PATCH v2] ddr: marvell: a38x: Add support for DDR4 from Marvell mv-ddr-marvell repository
From: Tony Dinh @ 2023-01-17  5:34 UTC
  To: U-Boot Mailing List, Stefan Roese, Pali Rohár,
	Marek Behún, Chris Packham
  Cc: Jaehoon Chung, Mark Kettenis, Simon Glass, Michael Trimarchi,
	Tom Rini, Tony Dinh, Marek Behún

    This syncs drivers/ddr/marvell/a38x/ with the master branch of repository
    https://github.com/MarvellEmbeddedProcessors/mv-ddr-marvell.git

    up to the commit "mv_ddr: a3700: Use the right size for memset to not overflow"
    d5acc10c287e40cc2feeb28710b92e45c93c702c

    This patch was created by the following steps:

    1. Replace all a38x files in the U-Boot tree with the files from the
       upstream Marvell mv-ddr-marvell GitHub repository.

    2. Run the following commands to omit portions not relevant for a38x, ddr3, and ddr4:

        files=drivers/ddr/marvell/a38x/*
        sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
        unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
            -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
            -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
            -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DTRUE $files

    3. Manually change the license to SPDX-License-Identifier
       (the upstream GitHub repository contains long license texts,
       whereas U-Boot uses just SPDX-License-Identifier).

    After applying this patch, the a38x ddr3 and ddr4 code in the upstream
    Marvell GitHub repository and in U-Boot will be fully identical, so the
    above steps can be used to sync the code again in the future.

    The only changes in this patch are:
    - Removal of common board_topology_map code using ifdefs in mv_ddr_brd.c
    - Some fixes with include files.
    - Some basic type defines (originally from ATF headers) in mv_ddr_plat.c

    Reference:
    "ddr: marvell: a38x: Sync code with Marvell mv-ddr-marvell repository"
    https://source.denx.de/u-boot/u-boot/-/commit/107c3391b95bcc2ba09a876da4fa0c31b6c1e460

Signed-off-by: Tony Dinh <mibodhi@gmail.com>
---

Changes in v2:
- Modified the filter script to explicitly include ARMADA_38X code
and exclude ARMADA_39X code; also removed 64BIT code. Reran it on
drivers/ddr/marvell/a38x/
- Updated script:
    files=drivers/ddr/marvell/a38x/*
    sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
    unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
        -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
        -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
        -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DCONFIG_ARMADA_38X -UCONFIG_ARMADA_39X \
        -UCONFIG_64BIT $files
- Removed more dead code files
- Corrected the SPDX license header
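
For reviewers, here is a minimal standalone sketch of the byte expansion
that pattern_table_get_killer_word_4() in ddr3_training_db.c appears to
perform on entries of the new pattern_killer_map[] table. It is an
illustration only and not part of the patch: killer_word_expand() and the
TMPL_* names are made up for this sketch, and the table lookup and index
check done in the patch are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Template marker values as they appear in pattern_killer_map[]. */
#define TMPL_BIT_SET	0x01u	/* becomes: only the selected DQ bit set */
#define TMPL_BIT_CLEAR	0xfeu	/* becomes: all bits set except the selected DQ */

/*
 * Expand one template byte into a 32-bit word for DQ line 'dqs' (0..7).
 * Hypothetical helper name; it mirrors only the switch statement and the
 * final byte replication of the function in the patch.
 */
static uint32_t killer_word_expand(uint8_t tmpl, uint8_t dqs)
{
	/* 32-bit intermediate keeps the shift by 24 below well defined */
	uint32_t byte = tmpl;

	assert(dqs < 8);

	if (tmpl == TMPL_BIT_SET)
		byte = 1u << dqs;
	else if (tmpl == TMPL_BIT_CLEAR)
		byte = 0xffu & ~(1u << dqs);
	/* any other value (0x00, 0xff) is identical for every DQ line */

	/* replicate the byte into all four byte lanes of the word */
	return byte | (byte << 8) | (byte << 16) | (byte << 24);
}
```

With this sketch, killer_word_expand(0x01, 3) gives 0x08080808 and
killer_word_expand(0xfe, 3) gives 0xf7f7f7f7, while 0x00 and 0xff entries
come out the same for every DQ line.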

 drivers/ddr/marvell/a38x/Makefile             |    8 +
 drivers/ddr/marvell/a38x/ddr3_debug.c         |  120 +
 drivers/ddr/marvell/a38x/ddr3_init.c          |   25 +
 drivers/ddr/marvell/a38x/ddr3_init.h          |   14 +
 drivers/ddr/marvell/a38x/ddr3_logging_def.h   |   27 +
 drivers/ddr/marvell/a38x/ddr3_training.c      |  131 +
 drivers/ddr/marvell/a38x/ddr3_training_bist.c |   12 +
 .../a38x/ddr3_training_centralization.c       |    4 +
 drivers/ddr/marvell/a38x/ddr3_training_db.c   |  212 ++
 drivers/ddr/marvell/a38x/ddr3_training_ip.h   |   17 +
 .../ddr/marvell/a38x/ddr3_training_ip_db.h    |   61 +
 .../marvell/a38x/ddr3_training_ip_engine.c    |  145 +
 .../ddr/marvell/a38x/ddr3_training_ip_flow.h  |    5 +
 .../ddr/marvell/a38x/ddr3_training_leveling.c |  135 +
 drivers/ddr/marvell/a38x/dram_if.h            |   13 -
 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c |  674 +++++
 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h |   59 +
 drivers/ddr/marvell/a38x/mv_ddr4_training.c   |  565 ++++
 drivers/ddr/marvell/a38x/mv_ddr4_training.h   |   32 +
 .../a38x/mv_ddr4_training_calibration.c       | 2336 +++++++++++++++++
 .../a38x/mv_ddr4_training_calibration.h       |   26 +
 .../ddr/marvell/a38x/mv_ddr4_training_db.c    |  545 ++++
 .../marvell/a38x/mv_ddr4_training_leveling.c  |  441 ++++
 .../marvell/a38x/mv_ddr4_training_leveling.h  |   11 +
 drivers/ddr/marvell/a38x/mv_ddr_plat.c        |  249 ++
 drivers/ddr/marvell/a38x/mv_ddr_plat.h        |   11 +
 drivers/ddr/marvell/a38x/mv_ddr_regs.h        |   59 +
 drivers/ddr/marvell/a38x/mv_ddr_topology.h    |   72 +
 28 files changed, 5996 insertions(+), 13 deletions(-)
 delete mode 100644 drivers/ddr/marvell/a38x/dram_if.h
 create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c
 create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h
 create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.c
 create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.h
 create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.c
 create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.h
 create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_db.c
 create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.c
 create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.h

diff --git a/drivers/ddr/marvell/a38x/Makefile b/drivers/ddr/marvell/a38x/Makefile
index 8251d6db66..fcfb615686 100644
--- a/drivers/ddr/marvell/a38x/Makefile
+++ b/drivers/ddr/marvell/a38x/Makefile
@@ -17,3 +17,11 @@ obj-$(CONFIG_SPL_BUILD) += mv_ddr_common.o
 obj-$(CONFIG_SPL_BUILD) += mv_ddr_spd.o
 obj-$(CONFIG_SPL_BUILD) += mv_ddr_topology.o
 obj-$(CONFIG_SPL_BUILD) += xor.o
+
+ifdef CONFIG_DDR4
+	obj-$(CONFIG_SPL_BUILD) += mv_ddr4_mpr_pda_if.o
+	obj-$(CONFIG_SPL_BUILD) += mv_ddr4_training.o
+	obj-$(CONFIG_SPL_BUILD) += mv_ddr4_training_calibration.o
+	obj-$(CONFIG_SPL_BUILD) += mv_ddr4_training_db.o
+	obj-$(CONFIG_SPL_BUILD) += mv_ddr4_training_leveling.o
+endif
diff --git a/drivers/ddr/marvell/a38x/ddr3_debug.c b/drivers/ddr/marvell/a38x/ddr3_debug.c
index f5fc964d6f..9e499cfb99 100644
--- a/drivers/ddr/marvell/a38x/ddr3_debug.c
+++ b/drivers/ddr/marvell/a38x/ddr3_debug.c
@@ -30,6 +30,12 @@ u8 debug_training_hw_alg = DEBUG_LEVEL_ERROR;
 u8 debug_training_access = DEBUG_LEVEL_ERROR;
 u8 debug_training_device = DEBUG_LEVEL_ERROR;
 
+#if defined(CONFIG_DDR4)
+u8 debug_tap_tuning = DEBUG_LEVEL_ERROR;
+u8 debug_calibration = DEBUG_LEVEL_ERROR;
+u8 debug_ddr4_centralization = DEBUG_LEVEL_ERROR;
+u8 debug_dm_tuning = DEBUG_LEVEL_ERROR;
+#endif /* CONFIG_DDR4 */
 
 void mv_ddr_user_log_level_set(enum ddr_lib_debug_block block)
 {
@@ -70,6 +76,17 @@ void ddr3_hws_set_log_level(enum ddr_lib_debug_block block, u8 level)
 		else
 			is_reg_dump = 0;
 		break;
+#if defined(CONFIG_DDR4)
+	case DEBUG_TAP_TUNING_ENGINE:
+		debug_tap_tuning = level;
+		break;
+	case DEBUG_BLOCK_CALIBRATION:
+		debug_calibration = level;
+		break;
+	case DEBUG_BLOCK_DDR4_CENTRALIZATION:
+		debug_ddr4_centralization = level;
+		break;
+#endif /* CONFIG_DDR4 */
 	case DEBUG_BLOCK_ALL:
 	default:
 		debug_training_static = level;
@@ -80,6 +97,11 @@ void ddr3_hws_set_log_level(enum ddr_lib_debug_block block, u8 level)
 		debug_training_hw_alg = level;
 		debug_training_access = level;
 		debug_training_device = level;
+#if defined(CONFIG_DDR4)
+		debug_tap_tuning = level;
+		debug_calibration = level;
+		debug_ddr4_centralization = level;
+#endif /* CONFIG_DDR4 */
 	}
 }
 #endif /* SILENT_LIB */
@@ -209,11 +231,13 @@ static char *convert_freq(enum mv_ddr_freq freq)
 	case MV_DDR_FREQ_LOW_FREQ:
 		return "MV_DDR_FREQ_LOW_FREQ";
 
+#if !defined(CONFIG_DDR4)
 	case MV_DDR_FREQ_400:
 		return "400";
 
 	case MV_DDR_FREQ_533:
 		return "533";
+#endif /* CONFIG_DDR4 */
 
 	case MV_DDR_FREQ_667:
 		return "667";
@@ -227,6 +251,7 @@ static char *convert_freq(enum mv_ddr_freq freq)
 	case MV_DDR_FREQ_1066:
 		return "1066";
 
+#if !defined(CONFIG_DDR4)
 	case MV_DDR_FREQ_311:
 		return "311";
 
@@ -247,6 +272,7 @@ static char *convert_freq(enum mv_ddr_freq freq)
 
 	case MV_DDR_FREQ_1000:
 		return "MV_DDR_FREQ_1000";
+#endif /* CONFIG_DDR4 */
 
 	default:
 		return "Unknown Frequency";
@@ -463,6 +489,7 @@ int ddr3_tip_print_log(u32 dev_num, u32 mem_addr)
 					   (training_result[WRITE_LEVELING_TF]
 					    [if_id])));
 		}
+#if !defined(CONFIG_DDR4)
 		if (mask_tune_func & READ_LEVELING_TF_MASK_BIT) {
 			DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
 					  ("\tRL TF: %s\n",
@@ -470,6 +497,7 @@ int ddr3_tip_print_log(u32 dev_num, u32 mem_addr)
 					   (training_result[READ_LEVELING_TF]
 					    [if_id])));
 		}
+#endif /* CONFIG_DDR4 */
 		if (mask_tune_func & WRITE_LEVELING_SUPP_TF_MASK_BIT) {
 			DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
 					  ("\tWL TF Supp: %s\n",
@@ -499,6 +527,43 @@ int ddr3_tip_print_log(u32 dev_num, u32 mem_addr)
 					   (training_result[CENTRALIZATION_TX]
 					    [if_id])));
 		}
+#if defined(CONFIG_DDR4)
+		if (mask_tune_func & SW_READ_LEVELING_MASK_BIT) {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+					  ("\tSW RL TF: %s\n",
+					   ddr3_tip_convert_tune_result
+					   (training_result[SW_READ_LEVELING]
+					    [if_id])));
+		}
+		if (mask_tune_func & RECEIVER_CALIBRATION_MASK_BIT) {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+					  ("\tRX CAL: %s\n",
+					   ddr3_tip_convert_tune_result
+					   (training_result[RECEIVER_CALIBRATION]
+					    [if_id])));
+		}
+		if (mask_tune_func & WL_PHASE_CORRECTION_MASK_BIT) {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+					  ("\tWL PHASE CORRECT: %s\n",
+					   ddr3_tip_convert_tune_result
+					   (training_result[WL_PHASE_CORRECTION]
+					    [if_id])));
+		}
+		if (mask_tune_func & DQ_VREF_CALIBRATION_MASK_BIT) {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+					  ("\tDQ VREF CAL: %s\n",
+					   ddr3_tip_convert_tune_result
+					   (training_result[DQ_VREF_CALIBRATION]
+					    [if_id])));
+		}
+		if (mask_tune_func & DQ_MAPPING_MASK_BIT) {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+					  ("\tDQ MAP: %s\n",
+					   ddr3_tip_convert_tune_result
+					   (training_result[DQ_MAPPING]
+					    [if_id])));
+		}
+#endif /* CONFIG_DDR4 */
 	}
 
 	return MV_OK;
@@ -512,6 +577,9 @@ int ddr3_tip_print_stability_log(u32 dev_num)
 {
 	u8 if_id = 0, csindex = 0, bus_id = 0, idx = 0;
 	u32 reg_data;
+#if defined(CONFIG_DDR4)
+	u32 reg_data1;
+#endif /* CONFIG_DDR4 */
 	u32 read_data[MAX_INTERFACE_NUM];
 	unsigned int max_cs = mv_ddr_cs_num_get();
 	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
@@ -524,7 +592,13 @@ int ddr3_tip_print_stability_log(u32 dev_num)
 			printf("CS%d , ", csindex);
 			printf("\n");
 			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, bus_id);
+#if defined(CONFIG_DDR4)
+			printf("DminTx, AreaTx, DminRx, AreaRx, WL_tot, WL_ADLL, WL_PH, RL_Tot, RL_ADLL, RL_PH, RL_Smp, CenTx, CenRx, Vref, DQVref,");
+			for (idx = 0; idx < 11; idx++)
+				printf("DC-Pad%d,", idx);
+#else /* CONFIG_DDR4 */
 			printf("VWTx, VWRx, WL_tot, WL_ADLL, WL_PH, RL_Tot, RL_ADLL, RL_PH, RL_Smp, Cen_tx, Cen_rx, Vref, DQVref,");
+#endif /* CONFIG_DDR4 */
 			printf("\t\t");
 			for (idx = 0; idx < 11; idx++)
 				printf("PBSTx-Pad%d,", idx);
@@ -565,6 +639,40 @@ int ddr3_tip_print_stability_log(u32 dev_num)
 			for (bus_id = 0; bus_id < MAX_BUS_NUM; bus_id++) {
 				printf("\n");
 				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, bus_id);
+#if defined(CONFIG_DDR4)
+				/* DminTx, areaTX */
+				ddr3_tip_bus_read(dev_num, if_id,
+						  ACCESS_TYPE_UNICAST,
+						  bus_id, DDR_PHY_DATA,
+						  RESULT_PHY_REG +
+						  csindex, &reg_data);
+				ddr3_tip_bus_read(dev_num, if_id,
+						  ACCESS_TYPE_UNICAST,
+						  dmin_phy_reg_table
+						  [csindex * 5 + bus_id][0],
+						  DDR_PHY_CONTROL,
+						  dmin_phy_reg_table
+						  [csindex * 5 + bus_id][1],
+						  &reg_data1);
+				printf("%d,%d,", 2 * (reg_data1 & 0xFF),
+				       reg_data);
+				/* DminRx, areaRX */
+				ddr3_tip_bus_read(dev_num, if_id,
+						  ACCESS_TYPE_UNICAST,
+						  bus_id, DDR_PHY_DATA,
+						  RESULT_PHY_REG +
+						  csindex + 4, &reg_data);
+				ddr3_tip_bus_read(dev_num, if_id,
+						  ACCESS_TYPE_UNICAST,
+						  dmin_phy_reg_table
+						  [csindex * 5 + bus_id][0],
+						  DDR_PHY_CONTROL,
+						  dmin_phy_reg_table
+						  [csindex * 5 + bus_id][1],
+						  &reg_data1);
+				printf("%d,%d,", 2 * (reg_data1 >> 8),
+				       reg_data);
+#else /* CONFIG_DDR4 */
 				ddr3_tip_bus_read(dev_num, if_id,
 						  ACCESS_TYPE_UNICAST,
 						  bus_id, DDR_PHY_DATA,
@@ -572,6 +680,7 @@ int ddr3_tip_print_stability_log(u32 dev_num)
 						  csindex, &reg_data);
 				printf("%d,%d,", (reg_data & 0x1f),
 				       ((reg_data & 0x3e0) >> 5));
+#endif /* CONFIG_DDR4 */
 				/* WL */
 				ddr3_tip_bus_read(dev_num, if_id,
 						  ACCESS_TYPE_UNICAST,
@@ -628,6 +737,17 @@ int ddr3_tip_print_stability_log(u32 dev_num)
 				/* DQVref */
 				/* Need to add the Read Function from device */
 				printf("%d,", 0);
+#if defined(CONFIG_DDR4)
+				printf("\t\t");
+				for (idx = 0; idx < 11; idx++) {
+					ddr3_tip_bus_read(dev_num, if_id,
+							  ACCESS_TYPE_UNICAST,
+							  bus_id, DDR_PHY_DATA,
+							  0xd0 + 12 * csindex +
+							  idx, &reg_data);
+					printf("%d,", (reg_data & 0x3f));
+				}
+#endif /* CONFIG_DDR4 */
 				printf("\t\t");
 				for (idx = 0; idx < 11; idx++) {
 					ddr3_tip_bus_read(dev_num, if_id,
diff --git a/drivers/ddr/marvell/a38x/ddr3_init.c b/drivers/ddr/marvell/a38x/ddr3_init.c
index f878b4512b..27eb3ac173 100644
--- a/drivers/ddr/marvell/a38x/ddr3_init.c
+++ b/drivers/ddr/marvell/a38x/ddr3_init.c
@@ -6,7 +6,11 @@
 #include "ddr3_init.h"
 #include "mv_ddr_common.h"
 
+#if defined(CONFIG_DDR4)
+static char *ddr_type = "DDR4";
+#else /* CONFIG_DDR4 */
 static char *ddr_type = "DDR3";
+#endif /* CONFIG_DDR4 */
 
 /*
  * generic_init_controller controls D-unit configuration:
@@ -61,6 +65,13 @@ int ddr3_init(void)
 	mv_ddr_mc_init();
 
 	if (!is_manual_cal_done) {
+#if defined(CONFIG_DDR4)
+		status = mv_ddr4_calibration_adjust(0, 1, 0);
+		if (status != MV_OK) {
+			printf("%s: failed (0x%x)\n", __func__, status);
+			return status;
+		}
+#endif
 	}
 
 
@@ -120,6 +131,19 @@ static int mv_ddr_training_params_set(u8 dev_num)
 	params.g_zpodt_ctrl = TUNE_TRAINING_PARAMS_P_ODT_CTRL;
 	params.g_znodt_ctrl = TUNE_TRAINING_PARAMS_N_ODT_CTRL;
 
+#if defined(CONFIG_DDR4)
+	params.g_zpodt_data = TUNE_TRAINING_PARAMS_P_ODT_DATA_DDR4;
+	params.g_odt_config = TUNE_TRAINING_PARAMS_ODT_CONFIG_DDR4;
+	params.g_rtt_nom = TUNE_TRAINING_PARAMS_RTT_NOM_DDR4;
+	params.g_dic = TUNE_TRAINING_PARAMS_DIC_DDR4;
+	if (cs_num == 1) {
+		params.g_rtt_wr =  TUNE_TRAINING_PARAMS_RTT_WR_1CS;
+		params.g_rtt_park = TUNE_TRAINING_PARAMS_RTT_PARK_1CS;
+	} else {
+		params.g_rtt_wr =  TUNE_TRAINING_PARAMS_RTT_WR_2CS;
+		params.g_rtt_park = TUNE_TRAINING_PARAMS_RTT_PARK_2CS;
+	}
+#else /* CONFIG_DDR4 */
 	params.g_zpodt_data = TUNE_TRAINING_PARAMS_P_ODT_DATA;
 	params.g_dic = TUNE_TRAINING_PARAMS_DIC;
 	params.g_rtt_nom = TUNE_TRAINING_PARAMS_RTT_NOM;
@@ -130,6 +154,7 @@ static int mv_ddr_training_params_set(u8 dev_num)
 		params.g_rtt_wr = TUNE_TRAINING_PARAMS_RTT_WR_2CS;
 		params.g_odt_config = TUNE_TRAINING_PARAMS_ODT_CONFIG_2CS;
 	}
+#endif /* CONFIG_DDR4 */
 
 	if (ck_delay > 0)
 		params.ck_delay = ck_delay;
diff --git a/drivers/ddr/marvell/a38x/ddr3_init.h b/drivers/ddr/marvell/a38x/ddr3_init.h
index 055516b67e..ba9f7881d5 100644
--- a/drivers/ddr/marvell/a38x/ddr3_init.h
+++ b/drivers/ddr/marvell/a38x/ddr3_init.h
@@ -137,6 +137,10 @@ extern u32 dfs_low_freq;
 extern u32 nominal_avs;
 extern u32 extension_avs;
 
+#if defined(CONFIG_DDR4)
+/* if 1, SSTL & POD have same Vref and workaround is required */
+extern u8 vref_calibration_wa;
+#endif /* CONFIG_DDR4 */
 
 /* Prototypes */
 int ddr3_init(void);
@@ -152,6 +156,13 @@ void ddr3_new_tip_ecc_scrub(void);
 int ddr3_tip_reg_write(u32 dev_num, u32 reg_addr, u32 data);
 int ddr3_tip_reg_read(u32 dev_num, u32 reg_addr, u32 *data, u32 reg_mask);
 int ddr3_silicon_get_ddr_target_freq(u32 *ddr_freq);
+#if defined(CONFIG_DDR4)
+int mv_ddr4_mode_regs_init(u8 dev_num);
+int mv_ddr4_sdram_config(u32 dev_num);
+int mv_ddr4_phy_config(u32 dev_num);
+int mv_ddr4_calibration_adjust(u32 dev_num, u8 vref_en, u8 pod_only);
+int mv_ddr4_training_main_flow(u32 dev_num);
+#endif /* CONFIG_DDR4 */
 
 int print_adll(u32 dev_num, u32 adll[MAX_INTERFACE_NUM * MAX_BUS_NUM]);
 int print_ph(u32 dev_num, u32 adll[MAX_INTERFACE_NUM * MAX_BUS_NUM]);
@@ -188,5 +199,8 @@ unsigned int mv_ddr_misl_phy_drv_ctrl_p_get(void);
 unsigned int mv_ddr_misl_phy_drv_ctrl_n_get(void);
 unsigned int mv_ddr_misl_phy_odt_p_get(void);
 unsigned int mv_ddr_misl_phy_odt_n_get(void);
+#if defined(CONFIG_DDR4)
+void refresh(void);
+#endif
 
 #endif /* _DDR3_INIT_H */
diff --git a/drivers/ddr/marvell/a38x/ddr3_logging_def.h b/drivers/ddr/marvell/a38x/ddr3_logging_def.h
index ad9da1cfff..a809213926 100644
--- a/drivers/ddr/marvell/a38x/ddr3_logging_def.h
+++ b/drivers/ddr/marvell/a38x/ddr3_logging_def.h
@@ -73,6 +73,27 @@
 #endif
 #endif
 
+#ifdef CONFIG_DDR4
+#ifdef SILENT_LIB
+#define DEBUG_TAP_TUNING_ENGINE(level, s)
+#define DEBUG_CALIBRATION(level, s)
+#define DEBUG_DDR4_CENTRALIZATION(level, s)
+#define DEBUG_DM_TUNING(level, s)
+#else /* SILENT_LIB */
+#define DEBUG_TAP_TUNING_ENGINE(level, s)	\
+	if (level >= debug_tap_tuning)		\
+		printf s
+#define DEBUG_CALIBRATION(level, s)		\
+	if (level >= debug_calibration)		\
+		printf s
+#define DEBUG_DDR4_CENTRALIZATION(level, s)	\
+	if (level >= debug_ddr4_centralization)	\
+		printf s
+#define DEBUG_DM_TUNING(level, s)		\
+	if (level >= debug_dm_tuning)		\
+		printf s
+#endif /* SILENT_LIB */
+#endif /* CONFIG_DDR4 */
 
 /* Logging defines */
 enum mv_ddr_debug_level {
@@ -94,6 +115,12 @@ enum ddr_lib_debug_block {
 	DEBUG_BLOCK_DEVICE,
 	DEBUG_BLOCK_ACCESS,
 	DEBUG_STAGES_REG_DUMP,
+#if defined(CONFIG_DDR4)
+	DEBUG_TAP_TUNING_ENGINE,
+	DEBUG_BLOCK_CALIBRATION,
+	DEBUG_BLOCK_DDR4_CENTRALIZATION,
+	DEBUG_DM_TUNING,
+#endif /* CONFIG_DDR4 */
 	/* All excluding IP and REG_DUMP, should be enabled separatelly */
 	DEBUG_BLOCK_ALL
 };
diff --git a/drivers/ddr/marvell/a38x/ddr3_training.c b/drivers/ddr/marvell/a38x/ddr3_training.c
index 0ddd5aea75..790b01d031 100644
--- a/drivers/ddr/marvell/a38x/ddr3_training.c
+++ b/drivers/ddr/marvell/a38x/ddr3_training.c
@@ -25,7 +25,11 @@ u32 *dq_map_table = NULL;
 /* in case of ddr4 do not run ddr3_tip_write_additional_odt_setting function - mc odt always 'on'
  * in ddr4 case the terminations are rttWR and rttPARK and the odt must be always 'on' 0x1498 = 0xf
  */
+#if defined(CONFIG_DDR4)
+u32 odt_config = 0;
+#else
 u32 odt_config = 1;
+#endif
 
 u32 nominal_avs;
 u32 extension_avs;
@@ -85,7 +89,11 @@ u32 mask_tune_func = (SET_MEDIUM_FREQ_MASK_BIT |
 		      READ_LEVELING_MASK_BIT |
 		      SET_TARGET_FREQ_MASK_BIT |
 		      WRITE_LEVELING_TF_MASK_BIT |
+#if defined(CONFIG_DDR4)
+		      SW_READ_LEVELING_MASK_BIT |
+#else /* CONFIG_DDR4 */
 		      READ_LEVELING_TF_MASK_BIT |
+#endif /* CONFIG_DDR4 */
 		      CENTRALIZATION_RX_MASK_BIT |
 		      CENTRALIZATION_TX_MASK_BIT);
 
@@ -102,6 +110,10 @@ int adll_calibration(u32 dev_num, enum hws_access_type access_type,
 		     u32 if_id, enum mv_ddr_freq frequency);
 static int ddr3_tip_set_timing(u32 dev_num, enum hws_access_type access_type,
 			       u32 if_id, enum mv_ddr_freq frequency);
+#if defined(CONFIG_DDR4)
+static int ddr4_tip_set_timing(u32 dev_num, enum hws_access_type access_type,
+			       u32 if_id, enum mv_ddr_freq frequency);
+#endif /* CONFIG_DDR4 */
 
 static u8 mem_size_config[MV_DDR_DIE_CAP_LAST] = {
 	0x2,			/* 512Mbit  */
@@ -173,12 +185,24 @@ static struct reg_data odpg_default_value[] = {
 };
 
 /* MR cmd and addr definitions */
+#if defined(CONFIG_DDR4)
+struct mv_ddr_mr_data mr_data[] = {
+	{MRS0_CMD, DDR4_MR0_REG},
+	{MRS1_CMD, DDR4_MR1_REG},
+	{MRS2_CMD, DDR4_MR2_REG},
+	{MRS3_CMD, DDR4_MR3_REG},
+	{MRS4_CMD, DDR4_MR4_REG},
+	{MRS5_CMD, DDR4_MR5_REG},
+	{MRS6_CMD, DDR4_MR6_REG}
+};
+#else
 struct mv_ddr_mr_data mr_data[] = {
 	{MRS0_CMD, MR0_REG},
 	{MRS1_CMD, MR1_REG},
 	{MRS2_CMD, MR2_REG},
 	{MRS3_CMD, MR3_REG}
 };
+#endif
 
 /* inverse pads */
 static int ddr3_tip_pad_inv(void)
@@ -664,6 +688,11 @@ int hws_ddr3_tip_init_controller(u32 dev_num, struct init_cntr_param *init_cntr_
 			      calibration_update_control << 3, 0x3 << 3));
 	}
 
+#if defined(CONFIG_DDR4)
+	/* dev_num, vref_en, pod_only */
+	CHECK_STATUS(mv_ddr4_mode_regs_init(dev_num));
+	CHECK_STATUS(mv_ddr4_sdram_config(dev_num));
+#endif /* CONFIG_DDR4 */
 
 	if (delay_enable != 0) {
 		adll_tap = MEGA / (mv_ddr_freq_get(freq) * 64);
@@ -1325,6 +1354,20 @@ int ddr3_tip_freq_set(u32 dev_num, enum hws_access_type access_type,
 
 		/* disable ODT in case of dll off */
 		if (is_dll_off == 1) {
+#if defined(CONFIG_DDR4)
+			CHECK_STATUS(ddr3_tip_if_read
+				     (dev_num, access_type, PARAM_NOT_CARE,
+				      0x1974, &g_rtt_nom_cs0, MASK_ALL_BITS));
+			CHECK_STATUS(ddr3_tip_if_write
+				     (dev_num, access_type, if_id,
+				      0x1974, 0, (0x7 << 8)));
+			CHECK_STATUS(ddr3_tip_if_read
+				     (dev_num, access_type, PARAM_NOT_CARE,
+				      0x1A74, &g_rtt_nom_cs1, MASK_ALL_BITS));
+			CHECK_STATUS(ddr3_tip_if_write
+				     (dev_num, access_type, if_id,
+				      0x1A74, 0, (0x7 << 8)));
+#else /* CONFIG_DDR4 */
 			CHECK_STATUS(ddr3_tip_if_write
 				     (dev_num, access_type, if_id,
 				      0x1874, 0, 0x244));
@@ -1337,6 +1380,7 @@ int ddr3_tip_freq_set(u32 dev_num, enum hws_access_type access_type,
 			CHECK_STATUS(ddr3_tip_if_write
 				     (dev_num, access_type, if_id,
 				      0x18a4, 0, 0x244));
+#endif /* CONFIG_DDR4 */
 		}
 
 		/* DFS  - Enter Self-Refresh */
@@ -1404,6 +1448,16 @@ int ddr3_tip_freq_set(u32 dev_num, enum hws_access_type access_type,
 
 		/* Restore original RTT values if returning from DLL OFF mode */
 		if (is_dll_off == 1) {
+#if defined(CONFIG_DDR4)
+			CHECK_STATUS(ddr3_tip_if_write
+				     (dev_num, access_type, if_id,
+				      0x1974, g_rtt_nom_cs0, (0x7 << 8)));
+			CHECK_STATUS(ddr3_tip_if_write
+				     (dev_num, access_type, if_id,
+				      0x1A74, g_rtt_nom_cs1, (0x7 << 8)));
+
+			mv_ddr4_mode_regs_init(dev_num);
+#else /* CONFIG_DDR4 */
 			CHECK_STATUS(ddr3_tip_if_write
 				     (dev_num, access_type, if_id, 0x1874,
 				      g_dic | g_rtt_nom, 0x266));
@@ -1416,6 +1470,7 @@ int ddr3_tip_freq_set(u32 dev_num, enum hws_access_type access_type,
 			CHECK_STATUS(ddr3_tip_if_write
 				     (dev_num, access_type, if_id, 0x18a4,
 				      g_dic | g_rtt_nom, 0x266));
+#endif /* CONFIG_DDR4 */
 		}
 
 		/* Reset divider_b assert -> de-assert */
@@ -1669,8 +1724,13 @@ static int ddr3_tip_set_timing(u32 dev_num, enum hws_access_type access_type,
 	t_rtp =	GET_MAX_VALUE(t_ckclk * 4, mv_ddr_speed_bin_timing_get(speed_bin_index,
 							   SPEED_BIN_TRTP));
 	t_mod = GET_MAX_VALUE(t_ckclk * 12, 15000);
+#if defined(CONFIG_DDR4)
+	t_wtr = GET_MAX_VALUE(t_ckclk * 2, mv_ddr_speed_bin_timing_get(speed_bin_index,
+							   SPEED_BIN_TWTR));
+#else /* CONFIG_DDR4 */
 	t_wtr = GET_MAX_VALUE(t_ckclk * 4, mv_ddr_speed_bin_timing_get(speed_bin_index,
 							   SPEED_BIN_TWTR));
+#endif /* CONFIG_DDR4 */
 	t_ras = time_to_nclk(mv_ddr_speed_bin_timing_get(speed_bin_index,
 						    SPEED_BIN_TRAS),
 				    t_ckclk);
@@ -1758,10 +1818,70 @@ static int ddr3_tip_set_timing(u32 dev_num, enum hws_access_type access_type,
 				       DDR_TIMING_TPD_MASK << DDR_TIMING_TPD_OFFS |
 				       DDR_TIMING_TXPDLL_MASK << DDR_TIMING_TXPDLL_OFFS));
 
+#if defined(CONFIG_DDR4)
+	ddr4_tip_set_timing(dev_num, access_type, if_id, frequency);
+#endif /* CONFIG_DDR4 */
 
 	return MV_OK;
 }
 
+#if defined(CONFIG_DDR4)
+static int ddr4_tip_set_timing(u32 dev_num, enum hws_access_type access_type,
+			       u32 if_id, enum mv_ddr_freq frequency)
+{
+	u32 t_rrd_l = 0, t_wtr_l = 0, t_ckclk = 0, t_mod = 0, t_ccd = 0;
+	u32 page_size = 0, val = 0, mask = 0;
+	enum mv_ddr_speed_bin speed_bin_index;
+	enum mv_ddr_die_capacity memory_size;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	u32 freq = mv_ddr_freq_get(frequency);
+
+	speed_bin_index = tm->interface_params[if_id].speed_bin_index;
+	memory_size = tm->interface_params[if_id].memory_size;
+	page_size = mv_ddr_page_size_get(tm->interface_params[if_id].bus_width, memory_size);
+
+	t_ckclk = (MEGA / freq);
+
+	t_rrd_l = (page_size == 1) ? mv_ddr_speed_bin_timing_get(speed_bin_index, SPEED_BIN_TRRDL1K) :
+			mv_ddr_speed_bin_timing_get(speed_bin_index, SPEED_BIN_TRRDL2K);
+	t_rrd_l = GET_MAX_VALUE(t_ckclk * 4, t_rrd_l);
+
+	t_wtr_l = mv_ddr_speed_bin_timing_get(speed_bin_index, SPEED_BIN_TWTRL);
+	t_wtr_l = GET_MAX_VALUE(t_ckclk * 4, t_wtr_l);
+
+	t_rrd_l = time_to_nclk(t_rrd_l, t_ckclk);
+	t_wtr_l = time_to_nclk(t_wtr_l, t_ckclk);
+
+	val = (((t_rrd_l - 1) & DDR4_TRRD_L_MASK) << DDR4_TRRD_L_OFFS) |
+	      (((t_wtr_l - 1) & DDR4_TWTR_L_MASK) << DDR4_TWTR_L_OFFS);
+	mask = (DDR4_TRRD_L_MASK << DDR4_TRRD_L_OFFS) |
+	       (DDR4_TWTR_L_MASK << DDR4_TWTR_L_OFFS);
+	CHECK_STATUS(ddr3_tip_if_write(dev_num, access_type, if_id,
+				       DRAM_LONG_TIMING_REG, val, mask));
+
+	val = 0;
+	mask = 0;
+	t_mod = mv_ddr_speed_bin_timing_get(speed_bin_index, SPEED_BIN_TMOD);
+	t_mod = GET_MAX_VALUE(t_ckclk * 24, t_mod);
+	t_mod = time_to_nclk(t_mod, t_ckclk);
+
+	val = (((t_mod - 1) & SDRAM_TIMING_HIGH_TMOD_MASK) << SDRAM_TIMING_HIGH_TMOD_OFFS) |
+	      ((((t_mod - 1) >> 4) & SDRAM_TIMING_HIGH_TMOD_HIGH_MASK) << SDRAM_TIMING_HIGH_TMOD_HIGH_OFFS);
+	mask = (SDRAM_TIMING_HIGH_TMOD_MASK << SDRAM_TIMING_HIGH_TMOD_OFFS) |
+	       (SDRAM_TIMING_HIGH_TMOD_HIGH_MASK << SDRAM_TIMING_HIGH_TMOD_HIGH_OFFS);
+	CHECK_STATUS(ddr3_tip_if_write(dev_num, access_type, if_id,
+				       SDRAM_TIMING_HIGH_REG, val, mask));
+
+	t_ccd = 6;
+
+	CHECK_STATUS(ddr3_tip_if_write(dev_num, access_type, if_id,
+				       DDR_TIMING_REG,
+				       ((t_ccd - 1) & DDR_TIMING_TCCD_MASK) << DDR_TIMING_TCCD_OFFS,
+				       DDR_TIMING_TCCD_MASK << DDR_TIMING_TCCD_OFFS));
+
+	return MV_OK;
+}
+#endif /* CONFIG_DDR4 */
 
 /*
  * Write CS Result
@@ -2245,6 +2365,7 @@ static int ddr3_tip_ddr3_training_main_flow(u32 dev_num)
 		}
 	}
 
+#if !defined(CONFIG_DDR4)
 	for (effective_cs = 0; effective_cs < max_cs; effective_cs++) {
 		if (mask_tune_func & PBS_RX_MASK_BIT) {
 			training_stage = PBS_RX;
@@ -2284,6 +2405,7 @@ static int ddr3_tip_ddr3_training_main_flow(u32 dev_num)
 	}
 	/* Set to 0 after each loop to avoid illegal value may be used */
 	effective_cs = 0;
+#endif /* CONFIG_DDR4 */
 
 	if (mask_tune_func & SET_TARGET_FREQ_MASK_BIT) {
 		training_stage = SET_TARGET_FREQ;
@@ -2367,6 +2489,7 @@ static int ddr3_tip_ddr3_training_main_flow(u32 dev_num)
 		}
 	}
 
+#if !defined(CONFIG_DDR4)
 	if (mask_tune_func & DM_PBS_TX_MASK_BIT) {
 		DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("DM_PBS_TX_MASK_BIT\n"));
 	}
@@ -2412,6 +2535,7 @@ static int ddr3_tip_ddr3_training_main_flow(u32 dev_num)
 	}
 	/* Set to 0 after each loop to avoid illegal value may be used */
 	effective_cs = 0;
+#endif /* CONFIG_DDR4 */
 
 	for (effective_cs = 0; effective_cs < max_cs; effective_cs++) {
 		if (mask_tune_func & WRITE_LEVELING_SUPP_TF_MASK_BIT) {
@@ -2434,7 +2558,12 @@ static int ddr3_tip_ddr3_training_main_flow(u32 dev_num)
 	/* Set to 0 after each loop to avoid illegal value may be used */
 	effective_cs = 0;
 
+#if defined(CONFIG_DDR4)
+	for (effective_cs = 0; effective_cs < max_cs; effective_cs++)
+		CHECK_STATUS(mv_ddr4_training_main_flow(dev_num));
+#endif /* CONFIG_DDR4 */
 
+#if !defined(CONFIG_DDR4)
 	for (effective_cs = 0; effective_cs < max_cs; effective_cs++) {
 		if (mask_tune_func & CENTRALIZATION_TX_MASK_BIT) {
 			training_stage = CENTRALIZATION_TX;
@@ -2455,6 +2584,7 @@ static int ddr3_tip_ddr3_training_main_flow(u32 dev_num)
 	}
 	/* Set to 0 after each loop to avoid illegal value may be used */
 	effective_cs = 0;
+#endif /* CONFIG_DDR4 */
 
 	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("restore registers to default\n"));
 	/* restore register values */
@@ -2895,3 +3025,4 @@ unsigned int mv_ddr_misl_phy_odt_n_get(void)
 
 	return odt_n;
 }
+
diff --git a/drivers/ddr/marvell/a38x/ddr3_training_bist.c b/drivers/ddr/marvell/a38x/ddr3_training_bist.c
index d388a17291..3f072eb037 100644
--- a/drivers/ddr/marvell/a38x/ddr3_training_bist.c
+++ b/drivers/ddr/marvell/a38x/ddr3_training_bist.c
@@ -459,7 +459,11 @@ static int mv_ddr_odpg_bist_prepare(enum hws_pattern pattern, enum hws_access_ty
 					       (ODPG_WRBUF_RD_CTRL_DIS << ODPG_WRBUF_RD_CTRL_OFFS),
 			  (ODPG_WRBUF_RD_CTRL_MASK << ODPG_WRBUF_RD_CTRL_OFFS));
 
+#if defined(CONFIG_DDR4)
+	if (pattern == PATTERN_ZERO || pattern == PATTERN_ONE)
+#else
 	if (pattern == PATTERN_00 || pattern == PATTERN_FF)
+#endif
 		ddr3_tip_load_pattern_to_odpg(0, access_type, 0, pattern, offset);
 	else
 		mv_ddr_load_dm_pattern_to_odpg(access_type, pattern, dm_dir);
@@ -507,7 +511,11 @@ int mv_ddr_dm_vw_get(enum hws_pattern pattern, u32 cs, u8 *vw_vector)
 	ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, ODPG_DATA_CTRL_REG, 0, MASK_ALL_BITS);
 	mv_ddr_odpg_bist_prepare(pattern, ACCESS_TYPE_UNICAST, OPER_WRITE, STRESS_NONE, DURATION_SINGLE,
 				 bist_offset, cs, pattern_table[pattern].num_of_phases_tx,
+#if defined(CONFIG_DDR4)
+				 (pattern == PATTERN_ZERO) ? DM_DIR_DIRECT : DM_DIR_INVERSE);
+#else
 				 (pattern == PATTERN_00) ? DM_DIR_DIRECT : DM_DIR_INVERSE);
+#endif
 
 	for (adll_tap = 0; adll_tap < ADLL_TAPS_PER_PERIOD; adll_tap++) {
 		/* change target odpg address */
@@ -539,7 +547,11 @@ int mv_ddr_dm_vw_get(enum hws_pattern pattern, u32 cs, u8 *vw_vector)
 	/* fill memory with vref pattern to increment addr using odpg bist */
 	mv_ddr_odpg_bist_prepare(PATTERN_VREF, ACCESS_TYPE_UNICAST, OPER_WRITE, STRESS_NONE, DURATION_SINGLE,
 				 bist_offset, cs, pattern_table[pattern].num_of_phases_tx,
+#if defined(CONFIG_DDR4)
+				 (pattern == PATTERN_ZERO) ? DM_DIR_DIRECT : DM_DIR_INVERSE);
+#else
 				 (pattern == PATTERN_00) ? DM_DIR_DIRECT : DM_DIR_INVERSE);
+#endif
 
 	for (adll_tap = 0; adll_tap < ADLL_TAPS_PER_PERIOD; adll_tap++) {
 		ddr3_tip_bus_write(0, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_MULTICAST, 0,
diff --git a/drivers/ddr/marvell/a38x/ddr3_training_centralization.c b/drivers/ddr/marvell/a38x/ddr3_training_centralization.c
index be9f985f22..e75e45c169 100644
--- a/drivers/ddr/marvell/a38x/ddr3_training_centralization.c
+++ b/drivers/ddr/marvell/a38x/ddr3_training_centralization.c
@@ -16,7 +16,11 @@
 #define CENTRAL_RX		1
 #define NUM_OF_CENTRAL_TYPES	2
 
+#if   defined(CONFIG_DDR4) /* DDR4 16/32-bit */
+u32 start_pattern = PATTERN_KILLER_DQ0, end_pattern = PATTERN_KILLER_DQ7_INV;
+#else /* DDR3 16/32-bit */
 u32 start_pattern = PATTERN_KILLER_DQ0, end_pattern = PATTERN_KILLER_DQ7;
+#endif /* CONFIG_64BIT */
 
 u32 start_if = 0, end_if = (MAX_INTERFACE_NUM - 1);
 u8 bus_end_window[NUM_OF_CENTRAL_TYPES][MAX_INTERFACE_NUM][MAX_BUS_NUM];
diff --git a/drivers/ddr/marvell/a38x/ddr3_training_db.c b/drivers/ddr/marvell/a38x/ddr3_training_db.c
index 6aa7b6069e..47ba911e0d 100644
--- a/drivers/ddr/marvell/a38x/ddr3_training_db.c
+++ b/drivers/ddr/marvell/a38x/ddr3_training_db.c
@@ -25,6 +25,98 @@ static inline u32 pattern_table_get_sso_xtalk_free_word16(u8 bit, u8 index);
 static inline u32 pattern_table_get_isi_word(u8 index);
 static inline u32 pattern_table_get_isi_word16(u8 index);
 
+#if defined(CONFIG_DDR4)
+u8 pattern_killer_map[KILLER_PATTERN_LENGTH * 2] = {
+	0x01,
+	0x00,
+	0x01,
+	0xff,
+	0xfe,
+	0xfe,
+	0x01,
+	0xfe,
+	0x01,
+	0xfe,
+	0x01,
+	0x01,
+	0xfe,
+	0x01,
+	0xfe,
+	0x00,
+	0xff,
+	0x00,
+	0xff,
+	0x00,
+	0xff,
+	0x00,
+	0xff,
+	0x01,
+	0x00,
+	0xff,
+	0x00,
+	0xff,
+	0x00,
+	0x00,
+	0x00,
+	0xfe,
+	0xfe,
+	0xff,
+	0x00,
+	0x00,
+	0xff,
+	0xff,
+	0x00,
+	0xff,
+	0x00,
+	0xff,
+	0xff,
+	0x00,
+	0x00,
+	0xff,
+	0x00,
+	0xff,
+	0xfe,
+	0x00,
+	0xfe,
+	0xfe,
+	0x00,
+	0xff,
+	0xff,
+	0x01,
+	0x01,
+	0xff,
+	0xff,
+	0x00,
+	0x00,
+	0x00,
+	0x00,
+	0xff
+};
+static inline u32 pattern_table_get_killer_word_4(u8 dqs, u8 index)
+{
+	u8 byte;
+
+	if (index >= (KILLER_PATTERN_LENGTH * 2)) {
+		printf("error: %s: invalid index [%u] found\n", __func__, index);
+		return 0;
+	}
+
+	byte = pattern_killer_map[index];
+
+	switch (byte) {
+	case 0x01:
+		byte = 1 << dqs;
+		break;
+	case 0xfe:
+		byte = 0xff & ~(1 << dqs);
+		break;
+	default:
+		break;
+	}
+
+	return byte | (byte << 8) | (byte << 16) | (byte << 24);
+}
+#else /* !CONFIG_DDR4 */
 /* List of allowed frequency listed in order of enum mv_ddr_freq */
 static unsigned int freq_val[MV_DDR_FREQ_LAST] = {
 	0,			/*MV_DDR_FREQ_LOW_FREQ */
@@ -302,6 +394,7 @@ u32 speed_bin_table_t_rcd_t_rp[] = {
 	12155,
 	13090,
 };
+#endif /* CONFIG_DDR4 */
 
 enum {
 	PATTERN_KILLER_PATTERN_TABLE_MAP_ROLE_AGGRESSOR = 0,
@@ -388,6 +481,7 @@ static u8 pattern_vref_pattern_table_map[] = {
 	0xfe
 };
 
+#if !defined(CONFIG_DDR4)
 static struct mv_ddr_page_element page_tbl[] = {
 	/* 8-bit, 16-bit page size */
 	{MV_DDR_PAGE_SIZE_1K, MV_DDR_PAGE_SIZE_2K}, /* 512M */
@@ -521,6 +615,7 @@ static inline u32 pattern_table_get_killer_word(u8 dqs, u8 index)
 
 	return byte | (byte << 8) | (byte << 16) | (byte << 24);
 }
+#endif /* !CONFIG_DDR4 */
 
 static inline u32 pattern_table_get_killer_word16(u8 dqs, u8 index)
 {
@@ -651,6 +746,7 @@ static inline u32 pattern_table_get_vref_word16(u8 index)
 		return 0xffffffff;
 }
 
+#if !defined(CONFIG_DDR4)
 static inline u32 pattern_table_get_static_pbs_word(u8 index)
 {
 	u16 temp;
@@ -659,6 +755,7 @@ static inline u32 pattern_table_get_static_pbs_word(u8 index)
 
 	return temp | (temp << 8) | (temp << 16) | (temp << 24);
 }
+#endif /* !CONFIG_DDR4 */
 
 u32 pattern_table_get_word(u32 dev_num, enum hws_pattern type, u8 index)
 {
@@ -670,26 +767,36 @@ u32 pattern_table_get_word(u32 dev_num, enum hws_pattern type, u8 index)
 		switch (type) {
 		case PATTERN_PBS1:
 		case PATTERN_PBS2:
+#if !defined(CONFIG_DDR4)
 			if (index == 0 || index == 2 || index == 5 ||
 			    index == 7)
 				pattern = PATTERN_55;
 			else
 				pattern = PATTERN_AA;
 			break;
+#endif /* !CONFIG_DDR4 */
 		case PATTERN_PBS3:
+#if !defined(CONFIG_DDR4)
 			if (0 == (index & 1))
 				pattern = PATTERN_55;
 			else
 				pattern = PATTERN_AA;
+#endif /* !CONFIG_DDR4 */
 			break;
 		case PATTERN_RL:
+#if !defined(CONFIG_DDR4)
 			if (index < 6)
 				pattern = PATTERN_00;
 			else
 				pattern = PATTERN_80;
+#else /* CONFIG_DDR4 */
+			pattern = PATTERN_00;
+#endif /* !CONFIG_DDR4 */
 			break;
 		case PATTERN_STATIC_PBS:
+#if !defined(CONFIG_DDR4)
 			pattern = pattern_table_get_static_pbs_word(index);
+#endif /* !CONFIG_DDR4 */
 			break;
 		case PATTERN_KILLER_DQ0:
 		case PATTERN_KILLER_DQ1:
@@ -699,14 +806,22 @@ u32 pattern_table_get_word(u32 dev_num, enum hws_pattern type, u8 index)
 		case PATTERN_KILLER_DQ5:
 		case PATTERN_KILLER_DQ6:
 		case PATTERN_KILLER_DQ7:
+#if !defined(CONFIG_DDR4)
 			pattern = pattern_table_get_killer_word(
+#else /* CONFIG_DDR4 */
+			pattern = pattern_table_get_killer_word_4(
+#endif /* !CONFIG_DDR4 */
 				(u8)(type - PATTERN_KILLER_DQ0), index);
 			break;
 		case PATTERN_RL2:
+#if !defined(CONFIG_DDR4)
 			if (index < 6)
 				pattern = PATTERN_00;
 			else
 				pattern = PATTERN_01;
+#else /* CONFIG_DDR4 */
+			pattern = PATTERN_FF;
+#endif /* !CONFIG_DDR4 */
 			break;
 		case PATTERN_TEST:
 			if (index > 1 && index < 6)
@@ -749,6 +864,46 @@ u32 pattern_table_get_word(u32 dev_num, enum hws_pattern type, u8 index)
 		case PATTERN_ISI_XTALK_FREE:
 			pattern = pattern_table_get_isi_word(index);
 			break;
+#if defined(CONFIG_DDR4)
+		case PATTERN_KILLER_DQ0_INV:
+		case PATTERN_KILLER_DQ1_INV:
+		case PATTERN_KILLER_DQ2_INV:
+		case PATTERN_KILLER_DQ3_INV:
+		case PATTERN_KILLER_DQ4_INV:
+		case PATTERN_KILLER_DQ5_INV:
+		case PATTERN_KILLER_DQ6_INV:
+		case PATTERN_KILLER_DQ7_INV:
+			pattern = ~pattern_table_get_killer_word_4(
+				(u8)(type - PATTERN_KILLER_DQ0_INV), index);
+			break;
+		case PATTERN_RESONANCE_1T:
+		case PATTERN_RESONANCE_2T:
+		case PATTERN_RESONANCE_3T:
+		case PATTERN_RESONANCE_4T:
+		case PATTERN_RESONANCE_5T:
+		case PATTERN_RESONANCE_6T:
+		case PATTERN_RESONANCE_7T:
+		case PATTERN_RESONANCE_8T:
+		case PATTERN_RESONANCE_9T:
+			{
+				u8 t_num = (u8)(type - PATTERN_RESONANCE_1T);
+				u8 t_end = (59 / t_num) * t_num;
+				if (index < t_end)
+					pattern = ((index % (t_num * 2)) >= t_num) ? 0xffffffff : 0x00000000;
+				else
+					pattern = ((index % 2) == 0) ? 0xffffffff : 0x00000000;
+			}
+			break;
+		case PATTERN_ZERO:
+			pattern = PATTERN_00;
+			break;
+		case PATTERN_ONE:
+			pattern = PATTERN_FF;
+			break;
+		case PATTERN_VREF_INV:
+			pattern = ~pattern_table_get_vref_word(index);
+			break;
+#endif /* CONFIG_DDR4 */
 		default:
 			printf("error: %s: unsupported pattern type [%d] found\n",
 			       __func__, (int)type);
@@ -761,16 +916,24 @@ u32 pattern_table_get_word(u32 dev_num, enum hws_pattern type, u8 index)
 		case PATTERN_PBS1:
 		case PATTERN_PBS2:
 		case PATTERN_PBS3:
+#if !defined(CONFIG_DDR4)
 			pattern = PATTERN_55AA;
+#endif /* !CONFIG_DDR4 */
 			break;
 		case PATTERN_RL:
+#if !defined(CONFIG_DDR4)
 			if (index < 3)
 				pattern = PATTERN_00;
 			else
 				pattern = PATTERN_80;
+#else /* CONFIG_DDR4 */
+			pattern = PATTERN_00;
+#endif /* !CONFIG_DDR4 */
 			break;
 		case PATTERN_STATIC_PBS:
+#if !defined(CONFIG_DDR4)
 			pattern = PATTERN_00FF;
+#endif /* !CONFIG_DDR4 */
 			break;
 		case PATTERN_KILLER_DQ0:
 		case PATTERN_KILLER_DQ1:
@@ -784,25 +947,40 @@ u32 pattern_table_get_word(u32 dev_num, enum hws_pattern type, u8 index)
 				(u8)(type - PATTERN_KILLER_DQ0), index);
 			break;
 		case PATTERN_RL2:
+#if !defined(CONFIG_DDR4)
 			if (index < 3)
 				pattern = PATTERN_00;
 			else
 				pattern = PATTERN_01;
+#endif /* !CONFIG_DDR4 */
 			break;
 		case PATTERN_TEST:
+#if !defined(CONFIG_DDR4)
 			if ((index == 0) || (index == 3))
 				pattern = 0x00000000;
 			else
 				pattern = 0xFFFFFFFF;
+#else /* CONFIG_DDR4 */
+			if ((index > 1) && (index < 6))
+				pattern = PATTERN_20;
+			else
+				pattern = PATTERN_00;
+#endif /* !CONFIG_DDR4 */
 			break;
 		case PATTERN_FULL_SSO0:
+#if !defined(CONFIG_DDR4)
 			pattern = 0x0000ffff;
 			break;
+#endif /* !CONFIG_DDR4 */
 		case PATTERN_FULL_SSO1:
 		case PATTERN_FULL_SSO2:
 		case PATTERN_FULL_SSO3:
 			pattern = pattern_table_get_sso_word(
+#if !defined(CONFIG_DDR4)
 				(u8)(type - PATTERN_FULL_SSO1), index);
+#else /* CONFIG_DDR4 */
+				(u8)(type - PATTERN_FULL_SSO0), index);
+#endif /* !CONFIG_DDR4 */
 			break;
 		case PATTERN_VREF:
 			pattern = pattern_table_get_vref_word16(index);
@@ -832,6 +1010,40 @@ u32 pattern_table_get_word(u32 dev_num, enum hws_pattern type, u8 index)
 		case PATTERN_ISI_XTALK_FREE:
 			pattern = pattern_table_get_isi_word16(index);
 			break;
+#if defined(CONFIG_DDR4)
+		case PATTERN_KILLER_DQ0_INV:
+		case PATTERN_KILLER_DQ1_INV:
+		case PATTERN_KILLER_DQ2_INV:
+		case PATTERN_KILLER_DQ3_INV:
+		case PATTERN_KILLER_DQ4_INV:
+		case PATTERN_KILLER_DQ5_INV:
+		case PATTERN_KILLER_DQ6_INV:
+		case PATTERN_KILLER_DQ7_INV:
+			pattern = ~pattern_table_get_killer_word16(
+				(u8)(type - PATTERN_KILLER_DQ0_INV), index);
+			break;
+		case PATTERN_RESONANCE_1T:
+		case PATTERN_RESONANCE_2T:
+		case PATTERN_RESONANCE_3T:
+		case PATTERN_RESONANCE_4T:
+		case PATTERN_RESONANCE_5T:
+		case PATTERN_RESONANCE_6T:
+		case PATTERN_RESONANCE_7T:
+		case PATTERN_RESONANCE_8T:
+		case PATTERN_RESONANCE_9T:
+			{
+				u8 t_num = (u8)(type - PATTERN_RESONANCE_1T);
+				u8 t_end = (59 / t_num) * t_num;
+				if (index < t_end)
+					pattern = ((index % (t_num * 2)) >= t_num) ? 0xffffffff : 0x00000000;
+				else
+					pattern = ((index % 2) == 0) ? 0xffffffff : 0x00000000;
+			}
+			break;
+		case PATTERN_VREF_INV:
+			pattern = ~pattern_table_get_vref_word16(index);
+			break;
+#endif /* CONFIG_DDR4 */
 		default:
 			if (((int)type == 29) || ((int)type == 30))
 				break;
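
For readers following the DDR4 killer patterns in ddr3_training_db.c, here is a minimal standalone sketch of how one pattern_killer_map entry becomes a 32-bit word. It only mirrors the logic of pattern_table_get_killer_word_4() shown in the diff. The name killer_word() and the use of stdint types are assumptions made for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: expand one killer-map
 * entry into a 32-bit word for DQ lane `dqs` (0..7).
 *   0x01 -> only the lane under test is high
 *   0xfe -> only the lane under test is low
 *   else -> the entry is used unchanged (0x00 or 0xff in the map)
 */
static uint32_t killer_word(uint8_t map_entry, uint8_t dqs)
{
	uint8_t byte = map_entry;

	assert(dqs < 8);

	if (map_entry == 0x01)
		byte = (uint8_t)(1u << dqs);
	else if (map_entry == 0xfe)
		byte = (uint8_t)(0xffu & ~(1u << dqs));

	/* replicate the byte into all four byte positions of the word */
	return (uint32_t)byte * 0x01010101u;
}
```

For lane 3, an entry of 0x01 expands to 0x08080808 and an entry of 0xfe expands to 0xf7f7f7f7. The PATTERN_KILLER_DQn_INV cases in the diff apply a bitwise NOT to the result of the killer-word helper.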
diff --git a/drivers/ddr/marvell/a38x/ddr3_training_ip.h b/drivers/ddr/marvell/a38x/ddr3_training_ip.h
index 056c21497c..37d21f2b2b 100644
--- a/drivers/ddr/marvell/a38x/ddr3_training_ip.h
+++ b/drivers/ddr/marvell/a38x/ddr3_training_ip.h
@@ -42,6 +42,13 @@
 #define WRITE_LEVELING_LF_MASK_BIT	0x02000000
 
 /* DDR4 Specific Training Mask bits */
+#if defined(CONFIG_DDR4)
+#define RECEIVER_CALIBRATION_MASK_BIT	0x04000000
+#define WL_PHASE_CORRECTION_MASK_BIT	0x08000000
+#define DQ_VREF_CALIBRATION_MASK_BIT	0x10000000
+#define DQ_MAPPING_MASK_BIT		0x20000000
+#define DM_TUNING_MASK_BIT		0x40000000
+#endif /* CONFIG_DDR4 */
 
 enum hws_result {
 	TEST_FAILED = 0,
@@ -63,6 +70,9 @@ enum auto_tune_stage {
 	WRITE_LEVELING,
 	LOAD_PATTERN_2,
 	READ_LEVELING,
+#if defined(CONFIG_DDR4)
+	SW_READ_LEVELING,
+#endif /* CONFIG_DDR4 */
 	WRITE_LEVELING_SUPP,
 	PBS_RX,
 	PBS_TX,
@@ -78,6 +88,13 @@ enum auto_tune_stage {
 	TX_EMPHASIS,
 	LOAD_PATTERN_HIGH,
 	PER_BIT_READ_LEVELING_TF,
+#if defined(CONFIG_DDR4)
+	RECEIVER_CALIBRATION,
+	WL_PHASE_CORRECTION,
+	DQ_VREF_CALIBRATION,
+	DM_TUNING,
+	DQ_MAPPING,
+#endif /* CONFIG_DDR4 */
 	WRITE_LEVELING_LF,
 	MAX_STAGE_LIMIT
 };
diff --git a/drivers/ddr/marvell/a38x/ddr3_training_ip_db.h b/drivers/ddr/marvell/a38x/ddr3_training_ip_db.h
index e28b7ecee1..7b24c1f1a8 100644
--- a/drivers/ddr/marvell/a38x/ddr3_training_ip_db.h
+++ b/drivers/ddr/marvell/a38x/ddr3_training_ip_db.h
@@ -7,6 +7,66 @@
 #define _DDR3_TRAINING_IP_DB_H_
 
 enum hws_pattern {
+#if defined(CONFIG_DDR4) /* DDR4 16/32-bit */
+	PATTERN_PBS1,/*0*/
+	PATTERN_PBS2,
+	PATTERN_PBS3,
+	PATTERN_TEST,
+	PATTERN_RL,
+	PATTERN_RL2,
+	PATTERN_STATIC_PBS,
+	PATTERN_KILLER_DQ0,
+	PATTERN_KILLER_DQ1,
+	PATTERN_KILLER_DQ2,
+	PATTERN_KILLER_DQ3,/*10*/
+	PATTERN_KILLER_DQ4,
+	PATTERN_KILLER_DQ5,
+	PATTERN_KILLER_DQ6,
+	PATTERN_KILLER_DQ7,
+	PATTERN_KILLER_DQ0_INV,
+	PATTERN_KILLER_DQ1_INV,
+	PATTERN_KILLER_DQ2_INV,
+	PATTERN_KILLER_DQ3_INV,
+	PATTERN_KILLER_DQ4_INV,
+	PATTERN_KILLER_DQ5_INV,/*20*/
+	PATTERN_KILLER_DQ6_INV,
+	PATTERN_KILLER_DQ7_INV,
+	PATTERN_VREF,
+	PATTERN_VREF_INV,
+	PATTERN_FULL_SSO0,
+	PATTERN_FULL_SSO1,
+	PATTERN_FULL_SSO2,
+	PATTERN_FULL_SSO3,
+	PATTERN_ZERO,
+	PATTERN_ONE,
+	PATTERN_LAST,
+	PATTERN_SSO_FULL_XTALK_DQ0,
+	PATTERN_SSO_FULL_XTALK_DQ1,/*30*/
+	PATTERN_SSO_FULL_XTALK_DQ2,
+	PATTERN_SSO_FULL_XTALK_DQ3,
+	PATTERN_SSO_FULL_XTALK_DQ4,
+	PATTERN_SSO_FULL_XTALK_DQ5,
+	PATTERN_SSO_FULL_XTALK_DQ6,
+	PATTERN_SSO_FULL_XTALK_DQ7,
+	PATTERN_SSO_XTALK_FREE_DQ0,
+	PATTERN_SSO_XTALK_FREE_DQ1,
+	PATTERN_SSO_XTALK_FREE_DQ2,
+	PATTERN_SSO_XTALK_FREE_DQ3,/*40*/
+	PATTERN_SSO_XTALK_FREE_DQ4,
+	PATTERN_SSO_XTALK_FREE_DQ5,
+	PATTERN_SSO_XTALK_FREE_DQ6,
+	PATTERN_SSO_XTALK_FREE_DQ7,
+	PATTERN_ISI_XTALK_FREE,
+	PATTERN_RESONANCE_1T,
+	PATTERN_RESONANCE_2T,
+	PATTERN_RESONANCE_3T,
+	PATTERN_RESONANCE_4T,
+	PATTERN_RESONANCE_5T,/*50*/
+	PATTERN_RESONANCE_6T,
+	PATTERN_RESONANCE_7T,
+	PATTERN_RESONANCE_8T,
+	PATTERN_RESONANCE_9T
+#else /* DDR3 16/32-bit */
 	PATTERN_PBS1,
 	PATTERN_PBS2,
 	PATTERN_PBS3,
@@ -45,6 +105,7 @@ enum hws_pattern {
 	PATTERN_SSO_XTALK_FREE_DQ6,
 	PATTERN_SSO_XTALK_FREE_DQ7,
 	PATTERN_ISI_XTALK_FREE
+#endif /* CONFIG_DDR4 */
 };
 
 enum mv_wl_supp_mode {
diff --git a/drivers/ddr/marvell/a38x/ddr3_training_ip_engine.c b/drivers/ddr/marvell/a38x/ddr3_training_ip_engine.c
index 102f9bd633..2567dc3b3f 100644
--- a/drivers/ddr/marvell/a38x/ddr3_training_ip_engine.c
+++ b/drivers/ddr/marvell/a38x/ddr3_training_ip_engine.c
@@ -205,6 +205,137 @@ struct pattern_info pattern_table_64[] = {
 	/* Note: actual start_address is "<< 3" of defined address */
 };
 
+#if defined(CONFIG_DDR4)
+struct pattern_info pattern_table_16[] = {
+	/*
+	 * num tx phases, tx burst, delay between, rx pattern,
+	 * start_address, pattern_len
+	 */
+	{0x1, 0x1, 2, 0x1, 0x0000, 2},	/* PATTERN_PBS1*/
+	{0x1, 0x1, 2, 0x1, 0x0080, 2},	/* PATTERN_PBS2*/
+	{0x1, 0x1, 2, 0x1, 0x0100, 2},	/* PATTERN_PBS3*/
+	{0x1, 0x1, 2, 0x1, 0x0180, 2},	/* PATTERN_TEST*/
+	{0x1, 0x1, 2, 0x1, 0x0200, 2},	/* PATTERN_RL*/
+	{0x1, 0x1, 2, 0x1, 0x0280, 2},	/* PATTERN_RL2*/
+	{0xf, 0x7, 2, 0x7, 0x0680, 16},	/* PATTERN_STATIC_PBS*/
+	{0xf, 0x7, 2, 0x7, 0x0A80, 16},	/* PATTERN_KILLER_DQ0*/
+	{0xf, 0x7, 2, 0x7, 0x0E80, 16},	/* PATTERN_KILLER_DQ1*/
+	{0xf, 0x7, 2, 0x7, 0x1280, 16},	/* PATTERN_KILLER_DQ2*/
+	{0xf, 0x7, 2, 0x7, 0x1680, 16},	/* PATTERN_KILLER_DQ3*/
+	{0xf, 0x7, 2, 0x7, 0x1A80, 16},	/* PATTERN_KILLER_DQ4*/
+	{0xf, 0x7, 2, 0x7, 0x1E80, 16},	/* PATTERN_KILLER_DQ5*/
+	{0xf, 0x7, 2, 0x7, 0x2280, 16},	/* PATTERN_KILLER_DQ6*/
+	{0xf, 0x7, 2, 0x7, 0x2680, 16},	/* PATTERN_KILLER_DQ7*/
+	{0xf, 0x7, 2, 0x7, 0x2A80, 16},	/* PATTERN_KILLER_DQ0_INV*/
+	{0xf, 0x7, 2, 0x7, 0x2E80, 16},	/* PATTERN_KILLER_DQ1_INV*/
+	{0xf, 0x7, 2, 0x7, 0x3280, 16},	/* PATTERN_KILLER_DQ2_INV*/
+	{0xf, 0x7, 2, 0x7, 0x3680, 16},	/* PATTERN_KILLER_DQ3_INV*/
+	{0xf, 0x7, 2, 0x7, 0x3A80, 16},	/* PATTERN_KILLER_DQ4_INV*/
+	{0xf, 0x7, 2, 0x7, 0x3E80, 16},	/* PATTERN_KILLER_DQ5_INV*/
+	{0xf, 0x7, 2, 0x7, 0x4280, 16},	/* PATTERN_KILLER_DQ6_INV*/
+	{0xf, 0x7, 2, 0x7, 0x4680, 16},	/* PATTERN_KILLER_DQ7_INV*/
+	{0xf, 0x7, 2, 0x7, 0x4A80, 16},	/* PATTERN_VREF*/
+	{0xf, 0x7, 2, 0x7, 0x4E80, 16},	/* PATTERN_VREF_INV*/
+	{0xf, 0x7, 2, 0x7, 0x5280, 16},	/* PATTERN_FULL_SSO_0T*/
+	{0xf, 0x7, 2, 0x7, 0x5680, 16},	/* PATTERN_FULL_SSO_1T*/
+	{0xf, 0x7, 2, 0x7, 0x5A80, 16},	/* PATTERN_FULL_SSO_2T*/
+	{0xf, 0x7, 2, 0x7, 0x5E80, 16},	/* PATTERN_FULL_SSO_3T*/
+	{0xf, 0x7, 2, 0x7, 0x6280, 16},	/* PATTERN_ZERO */
+	{0xf, 0x7, 2, 0x7, 0x6680, 16},	/* PATTERN_ONE */
+	{0xf, 0x7, 2, 0x7, 0x6A80, 16},	/* PATTERN_SSO_FULL_XTALK_DQ0*/
+	{0xf, 0x7, 2, 0x7, 0x6E80, 16},	/* PATTERN_SSO_FULL_XTALK_DQ1*/
+	{0xf, 0x7, 2, 0x7, 0x7280, 16},	/* PATTERN_SSO_FULL_XTALK_DQ2*/
+	{0xf, 0x7, 2, 0x7, 0x7680, 16},	/* PATTERN_SSO_FULL_XTALK_DQ3*/
+	{0xf, 0x7, 2, 0x7, 0x7A80, 16},	/* PATTERN_SSO_FULL_XTALK_DQ4*/
+	{0xf, 0x7, 2, 0x7, 0x7E80, 16},	/* PATTERN_SSO_FULL_XTALK_DQ5*/
+	{0xf, 0x7, 2, 0x7, 0x8280, 16},	/* PATTERN_SSO_FULL_XTALK_DQ6*/
+	{0xf, 0x7, 2, 0x7, 0x8680, 16}, /* PATTERN_SSO_FULL_XTALK_DQ7*/
+	{0xf, 0x7, 2, 0x7, 0x8A80, 16},	/* PATTERN_SSO_XTALK_FREE_DQ0*/
+	{0xf, 0x7, 2, 0x7, 0x8E80, 16},	/* PATTERN_SSO_XTALK_FREE_DQ1*/
+	{0xf, 0x7, 2, 0x7, 0x9280, 16},	/* PATTERN_SSO_XTALK_FREE_DQ2*/
+	{0xf, 0x7, 2, 0x7, 0x9680, 16},	/* PATTERN_SSO_XTALK_FREE_DQ3*/
+	{0xf, 0x7, 2, 0x7, 0x9A80, 16},	/* PATTERN_SSO_XTALK_FREE_DQ4*/
+	{0xf, 0x7, 2, 0x7, 0x9E80, 16},	/* PATTERN_SSO_XTALK_FREE_DQ5*/
+	{0xf, 0x7, 2, 0x7, 0xA280, 16},	/* PATTERN_SSO_XTALK_FREE_DQ6*/
+	{0xf, 0x7, 2, 0x7, 0xA680, 16},	/* PATTERN_SSO_XTALK_FREE_DQ7*/
+	{0xf, 0x7, 2, 0x7, 0xAA80, 16},	/* PATTERN_ISI_XTALK_FREE*/
+	{0xf, 0x7, 2, 0x7, 0xAE80, 16},	/* PATTERN_RESONANCE_1T*/
+	{0xf, 0x7, 2, 0x7, 0xB280, 16},	/* PATTERN_RESONANCE_2T*/
+	{0xf, 0x7, 2, 0x7, 0xB680, 16},	/* PATTERN_RESONANCE_3T*/
+	{0xf, 0x7, 2, 0x7, 0xBA80, 16},	/* PATTERN_RESONANCE_4T*/
+	{0xf, 0x7, 2, 0x7, 0xBE80, 16},	/* PATTERN_RESONANCE_5T*/
+	{0xf, 0x7, 2, 0x7, 0xC280, 16},	/* PATTERN_RESONANCE_6T*/
+	{0xf, 0x7, 2, 0x7, 0xC680, 16},	/* PATTERN_RESONANCE_7T*/
+	{0xf, 0x7, 2, 0x7, 0xca80, 16},	/* PATTERN_RESONANCE_8T*/
+	{0xf, 0x7, 2, 0x7, 0xce80, 16}	/* PATTERN_RESONANCE_9T*/
+	/* Note: actual start_address is "<< 3" of defined address */
+};
+
+struct pattern_info pattern_table_32[] = {
+	/*
+	 * num tx phases, tx burst, delay between, rx pattern,
+	 * start_address, pattern_len
+	 */
+	{0x3, 0x3, 2, 0x3, 0x0000, 4},		/* PATTERN_PBS1*/
+	{0x3, 0x3, 2, 0x3, 0x0020, 4},		/* PATTERN_PBS2*/
+	{0x3, 0x3, 2, 0x3, 0x0040, 4},		/* PATTERN_PBS3*/
+	{0x3, 0x3, 2, 0x3, 0x0060, 4},		/* PATTERN_TEST*/
+	{0x3, 0x3, 2, 0x3, 0x0080, 4},		/* PATTERN_RL*/
+	{0x3, 0x3, 2, 0x3, 0x00a0, 4},		/* PATTERN_RL2*/
+	{0x1f, 0xf, 2, 0xf, 0x00c0, 32},	/* PATTERN_STATIC_PBS*/
+	{0x1f, 0xf, 2, 0xf, 0x00e0, 32},	/* PATTERN_KILLER_DQ0*/
+	{0x1f, 0xf, 2, 0xf, 0x0100, 32},	/* PATTERN_KILLER_DQ1*/
+	{0x1f, 0xf, 2, 0xf, 0x0120, 32},	/* PATTERN_KILLER_DQ2*/
+	{0x1f, 0xf, 2, 0xf, 0x0140, 32},	/* PATTERN_KILLER_DQ3*/
+	{0x1f, 0xf, 2, 0xf, 0x0160, 32},	/* PATTERN_KILLER_DQ4*/
+	{0x1f, 0xf, 2, 0xf, 0x0180, 32},	/* PATTERN_KILLER_DQ5*/
+	{0x1f, 0xf, 2, 0xf, 0x01a0, 32},	/* PATTERN_KILLER_DQ6*/
+	{0x1f, 0xf, 2, 0xf, 0x01c0, 32},	/* PATTERN_KILLER_DQ7*/
+	{0x1f, 0xf, 2, 0xf, 0x01e0, 32},	/* PATTERN_KILLER_DQ0_INV*/
+	{0x1f, 0xf, 2, 0xf, 0x0200, 32},	/* PATTERN_KILLER_DQ1_INV*/
+	{0x1f, 0xf, 2, 0xf, 0x0220, 32},	/* PATTERN_KILLER_DQ2_INV*/
+	{0x1f, 0xf, 2, 0xf, 0x0240, 32},	/* PATTERN_KILLER_DQ3_INV*/
+	{0x1f, 0xf, 2, 0xf, 0x0260, 32},	/* PATTERN_KILLER_DQ4_INV*/
+	{0x1f, 0xf, 2, 0xf, 0x0280, 32},	/* PATTERN_KILLER_DQ5_INV*/
+	{0x1f, 0xf, 2, 0xf, 0x02a0, 32},	/* PATTERN_KILLER_DQ6_INV*/
+	{0x1f, 0xf, 2, 0xf, 0x02c0, 32},	/* PATTERN_KILLER_DQ7_INV*/
+	{0x1f, 0xf, 2, 0xf, 0x02e0, 32},	/* PATTERN_VREF*/
+	{0x1f, 0xf, 2, 0xf, 0x0300, 32},	/* PATTERN_VREF_INV*/
+	{0x1f, 0xf, 2, 0xf, 0x0320, 32},	/* PATTERN_FULL_SSO_0T*/
+	{0x1f, 0xf, 2, 0xf, 0x0340, 32},	/* PATTERN_FULL_SSO_1T*/
+	{0x1f, 0xf, 2, 0xf, 0x0360, 32},	/* PATTERN_FULL_SSO_2T*/
+	{0x1f, 0xf, 2, 0xf, 0x0380, 32},	/* PATTERN_FULL_SSO_3T*/
+	{0x1f, 0xf, 2, 0xf, 0x6280, 32},	/* PATTERN_ZERO */
+	{0x1f, 0xf, 2, 0xf, 0x6680, 32},	/* PATTERN_ONE */
+	{0x1f, 0xf, 2, 0xf, 0x6A80, 32},	/* PATTERN_SSO_FULL_XTALK_DQ0*/
+	{0x1f, 0xf, 2, 0xf, 0x6E80, 32},	/* PATTERN_SSO_FULL_XTALK_DQ1*/
+	{0x1f, 0xf, 2, 0xf, 0x7280, 32},	/* PATTERN_SSO_FULL_XTALK_DQ2*/
+	{0x1f, 0xf, 2, 0xf, 0x7680, 32},	/* PATTERN_SSO_FULL_XTALK_DQ3*/
+	{0x1f, 0xf, 2, 0xf, 0x7A80, 32},	/* PATTERN_SSO_FULL_XTALK_DQ4*/
+	{0x1f, 0xf, 2, 0xf, 0x7E80, 32},	/* PATTERN_SSO_FULL_XTALK_DQ5*/
+	{0x1f, 0xf, 2, 0xf, 0x8280, 32},	/* PATTERN_SSO_FULL_XTALK_DQ6*/
+	{0x1f, 0xf, 2, 0xf, 0x8680, 32},	/* PATTERN_SSO_FULL_XTALK_DQ7*/
+	{0x1f, 0xf, 2, 0xf, 0x8A80, 32},	/* PATTERN_SSO_XTALK_FREE_DQ0*/
+	{0x1f, 0xf, 2, 0xf, 0x8E80, 32},	/* PATTERN_SSO_XTALK_FREE_DQ1*/
+	{0x1f, 0xf, 2, 0xf, 0x9280, 32},	/* PATTERN_SSO_XTALK_FREE_DQ2*/
+	{0x1f, 0xf, 2, 0xf, 0x9680, 32},	/* PATTERN_SSO_XTALK_FREE_DQ3*/
+	{0x1f, 0xf, 2, 0xf, 0x9A80, 32},	/* PATTERN_SSO_XTALK_FREE_DQ4*/
+	{0x1f, 0xf, 2, 0xf, 0x9E80, 32},	/* PATTERN_SSO_XTALK_FREE_DQ5*/
+	{0x1f, 0xf, 2, 0xf, 0xA280, 32},	/* PATTERN_SSO_XTALK_FREE_DQ6*/
+	{0x1f, 0xf, 2, 0xf, 0xA680, 32},	/* PATTERN_SSO_XTALK_FREE_DQ7*/
+	{0x1f, 0xf, 2, 0xf, 0xAA80, 32},	/* PATTERN_ISI_XTALK_FREE*/
+	{0x1f, 0xf, 2, 0xf, 0xAE80, 32},	/* PATTERN_RESONANCE_1T*/
+	{0x1f, 0xf, 2, 0xf, 0xB280, 32},	/* PATTERN_RESONANCE_2T*/
+	{0x1f, 0xf, 2, 0xf, 0xB680, 32},	/* PATTERN_RESONANCE_3T*/
+	{0x1f, 0xf, 2, 0xf, 0xBA80, 32},	/* PATTERN_RESONANCE_4T*/
+	{0x1f, 0xf, 2, 0xf, 0xBE80, 32},	/* PATTERN_RESONANCE_5T*/
+	{0x1f, 0xf, 2, 0xf, 0xC280, 32},	/* PATTERN_RESONANCE_6T*/
+	{0x1f, 0xf, 2, 0xf, 0xC680, 32},	/* PATTERN_RESONANCE_7T*/
+	{0x1f, 0xf, 2, 0xf, 0xca80, 32},	/* PATTERN_RESONANCE_8T*/
+	{0x1f, 0xf, 2, 0xf, 0xce80, 32}		/* PATTERN_RESONANCE_9T*/
+	/* Note: actual start_address is "<< 3" of defined address */
+};
+#else /* !CONFIG_DDR4 */
 struct pattern_info pattern_table_16[] = {
 	/*
 	 * num tx phases, tx burst, delay between, rx pattern,
@@ -294,6 +425,7 @@ struct pattern_info pattern_table_32[] = {
 	{0x1f, 0xF, 2, 0xf, 0xA280, 32}		/* PATTERN_ISI_XTALK_FREE */
 	/* Note: actual start_address is "<< 3" of defined address */
 };
+#endif /* CONFIG_DDR4 */
 
 u32 train_dev_num;
 enum hws_ddr_cs traintrain_cs_type;
@@ -309,7 +441,12 @@ enum hws_pattern train_pattern;
 enum hws_edge_compare train_edge_compare;
 u32 train_cs_num;
 u32 train_if_acess, train_if_id, train_pup_access;
+#if defined(CONFIG_DDR4)
+/* The counter was increased for DDR4 because of A390 DB-GP DDR4 failure */
+u32 max_polling_for_done = 100000000;
+#else /* !CONFIG_DDR4 */
 u32 max_polling_for_done = 1000000;
+#endif /* CONFIG_DDR4 */
 
 u32 *ddr3_tip_get_buf_ptr(u32 dev_num, enum hws_search_dir search,
 			  enum hws_training_result result_type,
@@ -561,6 +698,10 @@ int ddr3_tip_ip_training(u32 dev_num, enum hws_access_type access_type,
 
 	ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
 			  ODPG_DATA_CTRL_REG, 0, MASK_ALL_BITS);
+#if defined(CONFIG_DDR4)
+	if (tm->debug_level != DEBUG_LEVEL_ERROR)
+		refresh();
+#endif
 
 	return MV_OK;
 }
@@ -837,6 +978,10 @@ int ddr3_tip_read_training_result(u32 dev_num, u32 if_id,
 			}
 		}
 	}
+#if defined(CONFIG_DDR4)
+	if (tm->debug_level != DEBUG_LEVEL_ERROR)
+		refresh();
+#endif
 
 	return MV_OK;
 }
diff --git a/drivers/ddr/marvell/a38x/ddr3_training_ip_flow.h b/drivers/ddr/marvell/a38x/ddr3_training_ip_flow.h
index 55832a5540..16a59512ee 100644
--- a/drivers/ddr/marvell/a38x/ddr3_training_ip_flow.h
+++ b/drivers/ddr/marvell/a38x/ddr3_training_ip_flow.h
@@ -46,6 +46,11 @@ enum mr_number {
 	MR_CMD1,
 	MR_CMD2,
 	MR_CMD3,
+#if defined(CONFIG_DDR4)
+	MR_CMD4,
+	MR_CMD5,
+	MR_CMD6,
+#endif
 	MR_LAST
 };
 
diff --git a/drivers/ddr/marvell/a38x/ddr3_training_leveling.c b/drivers/ddr/marvell/a38x/ddr3_training_leveling.c
index 6523281f2b..55abbad5a7 100644
--- a/drivers/ddr/marvell/a38x/ddr3_training_leveling.c
+++ b/drivers/ddr/marvell/a38x/ddr3_training_leveling.c
@@ -1667,6 +1667,59 @@ enum rl_dqs_burst_state {
 	RL_BEHIND
 };
 
+#if defined(CONFIG_DDR4)
+static int mpr_rd_frmt_config(
+	enum mv_ddr_mpr_ps ps,
+	enum mv_ddr_mpr_op op,
+	enum mv_ddr_mpr_rd_frmt rd_frmt,
+	u8 cs_bitmask, u8 dis_auto_refresh)
+{
+	u32 val, mask;
+	u8 cs_bitmask_inv;
+
+
+	if (dis_auto_refresh == 1) {
+		ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, ODPG_CTRL_CTRL_REG,
+			  ODPG_CTRL_AUTO_REFRESH_DIS << ODPG_CTRL_AUTO_REFRESH_OFFS,
+			  ODPG_CTRL_AUTO_REFRESH_MASK << ODPG_CTRL_AUTO_REFRESH_OFFS);
+	} else {
+		ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, ODPG_CTRL_CTRL_REG,
+			  ODPG_CTRL_AUTO_REFRESH_ENA << ODPG_CTRL_AUTO_REFRESH_OFFS,
+			  ODPG_CTRL_AUTO_REFRESH_MASK << ODPG_CTRL_AUTO_REFRESH_OFFS);
+	}
+
+	/* configure MPR Location for MPR write and read accesses within the selected page */
+	ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, DDR4_MPR_WR_REG,
+			  DDR4_MPR_LOC3 << DDR4_MPR_LOC_OFFS,
+			  DDR4_MPR_LOC_MASK << DDR4_MPR_LOC_OFFS);
+
+	/* configure MPR page selection, operation and read format */
+	val = ps << DDR4_MPR_PS_OFFS |
+	      op << DDR4_MPR_OP_OFFS |
+	      rd_frmt << DDR4_MPR_RF_OFFS;
+	mask = DDR4_MPR_PS_MASK << DDR4_MPR_PS_OFFS |
+	       DDR4_MPR_OP_MASK << DDR4_MPR_OP_OFFS |
+	       DDR4_MPR_RF_MASK << DDR4_MPR_RF_OFFS;
+	ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, DDR4_MR3_REG, val, mask);
+
+	/* prepare cs bitmask in active low format */
+	cs_bitmask_inv = ~cs_bitmask & SDRAM_OP_CMD_ALL_CS_MASK;
+	ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, SDRAM_OP_REG,
+			  CMD_DDR3_DDR4_MR3 << SDRAM_OP_CMD_OFFS |
+			  cs_bitmask_inv << SDRAM_OP_CMD_CS_OFFS(0),
+			  SDRAM_OP_CMD_MASK << SDRAM_OP_CMD_OFFS |
+			  SDRAM_OP_CMD_ALL_CS_MASK << SDRAM_OP_CMD_CS_OFFS(0));
+
+	if (ddr3_tip_if_polling(0, ACCESS_TYPE_UNICAST, 0,
+				CMD_NORMAL, SDRAM_OP_CMD_MASK, SDRAM_OP_REG,
+				MAX_POLLING_ITERATIONS)) {
+		printf("error: %s failed\n", __func__);
+		return -1;
+	}
+
+	return 0;
+}
+#endif /* CONFIG_DDR4 */
 
 int mv_ddr_rl_dqs_burst(u32 dev_num, u32 if_id, u32 freq)
 {
@@ -1690,6 +1743,19 @@ int mv_ddr_rl_dqs_burst(u32 dev_num, u32 if_id, u32 freq)
 	u32 reg_val, reg_mask;
 	uintptr_t test_addr = TEST_ADDR;
 
+#if defined(CONFIG_DDR4)
+	int status;
+	u8 cs_bitmask = tm->interface_params[0].as_bus_params[0].cs_bitmask;
+	u8 curr_cs_bitmask_inv;
+
+	/* enable MPR for all existing chip-selects */
+	status = mpr_rd_frmt_config(DDR4_MPR_PAGE0,
+				    DDR4_MPR_OP_ENA,
+				    DDR4_MPR_RF_SERIAL,
+				    cs_bitmask, 1);
+	if (status)
+		return status;
+#endif /* CONFIG_DDR4 */
 
 	/* initialization */
 	if (mv_ddr_is_ecc_ena()) {
@@ -1728,6 +1794,48 @@ int mv_ddr_rl_dqs_burst(u32 dev_num, u32 if_id, u32 freq)
 	/* search for dqs edges per subphy */
 	if_id = 0;
 	for (effective_cs = 0; effective_cs < max_cs; effective_cs++) {
+#if defined(CONFIG_DDR4)
+		/* enable read preamble training mode for chip-select under test */
+		ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+				  DDR4_MR4_REG,
+				  DDR4_RPT_ENA << DDR4_RPT_OFFS,
+				  DDR4_RPT_MASK << DDR4_RPT_OFFS);
+		/* prepare current cs bitmask in active low format */
+		curr_cs_bitmask_inv = ~(1 << effective_cs) & SDRAM_OP_CMD_ALL_CS_MASK;
+		reg_val = curr_cs_bitmask_inv << SDRAM_OP_CMD_CS_OFFS(0) |
+			  CMD_DDR4_MR4 << SDRAM_OP_CMD_OFFS;
+		reg_mask = SDRAM_OP_CMD_ALL_CS_MASK << SDRAM_OP_CMD_CS_OFFS(0) |
+			   SDRAM_OP_CMD_MASK << SDRAM_OP_CMD_OFFS;
+		ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+				  SDRAM_OP_REG, reg_val, reg_mask);
+		if (ddr3_tip_if_polling(0, ACCESS_TYPE_UNICAST, 0,
+					CMD_NORMAL, SDRAM_OP_CMD_MASK, SDRAM_OP_REG,
+					MAX_POLLING_ITERATIONS)) {
+			printf("error: %s failed\n", __func__);
+			return -1;
+		}
+
+		/* disable preamble training mode for existing chip-selects not under test */
+		ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+				  DDR4_MR4_REG,
+				  DDR4_RPT_DIS << DDR4_RPT_OFFS,
+				  DDR4_RPT_MASK << DDR4_RPT_OFFS);
+		/* prepare bitmask for existing chip-selects not under test in active low format */
+		reg_val = ((~(curr_cs_bitmask_inv & cs_bitmask) & SDRAM_OP_CMD_ALL_CS_MASK) <<
+			   SDRAM_OP_CMD_CS_OFFS(0)) |
+			  CMD_DDR4_MR4 << SDRAM_OP_CMD_OFFS;
+		reg_mask = SDRAM_OP_CMD_ALL_CS_MASK << SDRAM_OP_CMD_CS_OFFS(0) |
+			   SDRAM_OP_CMD_MASK << SDRAM_OP_CMD_OFFS;
+		ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+				  SDRAM_OP_REG, reg_val, reg_mask);
+		if (ddr3_tip_if_polling(0, ACCESS_TYPE_UNICAST, 0,
+					CMD_NORMAL, SDRAM_OP_CMD_MASK, SDRAM_OP_REG,
+					MAX_POLLING_ITERATIONS)) {
+			printf("error: %s failed\n", __func__);
+			return -1;
+		}
+
+#endif /* CONFIG_DDR4 */
 
 		pass_lock_num = init_pass_lock_num;
 		ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ODPG_DATA_CTRL_REG,
@@ -1948,6 +2056,33 @@ int mv_ddr_rl_dqs_burst(u32 dev_num, u32 if_id, u32 freq)
 			CHECK_STATUS(ddr3_tip_write_additional_odt_setting(dev_num, if_id));
 	}
 
+#if defined(CONFIG_DDR4)
+	/* disable read preamble training mode for all existing chip-selects */
+	ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+			  DDR4_MR4_REG,
+			  DDR4_RPT_DIS << DDR4_RPT_OFFS,
+			  DDR4_RPT_MASK << DDR4_RPT_OFFS);
+	reg_val = (~cs_bitmask & SDRAM_OP_CMD_ALL_CS_MASK) << SDRAM_OP_CMD_CS_OFFS(0) |
+		  CMD_DDR4_MR4 << SDRAM_OP_CMD_OFFS;
+	reg_mask = SDRAM_OP_CMD_ALL_CS_MASK << SDRAM_OP_CMD_CS_OFFS(0) |
+		   SDRAM_OP_CMD_MASK << SDRAM_OP_CMD_OFFS;
+	ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+			  SDRAM_OP_REG, reg_val, reg_mask);
+	if (ddr3_tip_if_polling(0, ACCESS_TYPE_UNICAST, 0,
+				CMD_NORMAL, SDRAM_OP_CMD_MASK, SDRAM_OP_REG,
+				MAX_POLLING_ITERATIONS)) {
+		printf("error: %s failed\n", __func__);
+		return -1;
+	}
+
+	/* disable MPR for all existing chip-selects */
+	status = mpr_rd_frmt_config(DDR4_MPR_PAGE0,
+				    DDR4_MPR_OP_DIS,
+				    DDR4_MPR_RF_SERIAL,
+				    cs_bitmask, 0);
+	if (status)
+		return status;
+#endif /* CONFIG_DDR4 */
 
 	/* reset read fifo assertion */
 	ddr3_tip_if_write(dev_num, ACCESS_TYPE_MULTICAST, if_id, SDRAM_CFG_REG,
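
The chip-select fields written to SDRAM_OP_REG in mv_ddr_rl_dqs_burst() are built in active-low format, which is easy to misread. The sketch below reproduces the three masks from the diff as standalone functions. The function names are hypothetical, and ALL_CS_MASK is assumed to be 0xf (four chip-select bits) as a stand-in for SDRAM_OP_CMD_ALL_CS_MASK, whose value is not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: four chip-select bits; stands in for SDRAM_OP_CMD_ALL_CS_MASK */
#define ALL_CS_MASK 0xfu

/* Active-low field selecting only the chip-select under test */
static uint8_t cs_under_test_field(uint8_t cs)
{
	assert(cs < 4);
	return (uint8_t)(~(1u << cs) & ALL_CS_MASK);
}

/* Active-low field selecting the populated chip-selects not under test */
static uint8_t cs_others_field(uint8_t cs, uint8_t populated)
{
	uint8_t curr_inv = cs_under_test_field(cs);

	/* curr_inv & populated is the active-high set of "other" chip-selects */
	return (uint8_t)(~(curr_inv & populated) & ALL_CS_MASK);
}

/* Active-low field selecting every populated chip-select */
static uint8_t cs_all_field(uint8_t populated)
{
	return (uint8_t)(~populated & ALL_CS_MASK);
}
```

With two populated chip-selects (cs_bitmask 0x3) and chip-select 0 under test, the fields come out as 0xe, 0xd and 0xc respectively. A cleared bit marks a chip-select that receives the MR4 command.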
diff --git a/drivers/ddr/marvell/a38x/dram_if.h b/drivers/ddr/marvell/a38x/dram_if.h
deleted file mode 100644
index 4d0846489b..0000000000
--- a/drivers/ddr/marvell/a38x/dram_if.h
+++ /dev/null
@@ -1,13 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Copyright (C) 2016 Marvell International Ltd.
- */
-
-#ifndef _DRAM_IF_H_
-#define _DRAM_IF_H_
-
-/* TODO: update atf to this new prototype */
-int dram_init(void);
-void dram_mmap_config(void);
-unsigned long long dram_iface_mem_sz_get(void);
-#endif /* _DRAM_IF_H_ */
diff --git a/drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c b/drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c
new file mode 100644
index 0000000000..f742b5a9bc
--- /dev/null
+++ b/drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c
@@ -0,0 +1,674 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) Marvell International Ltd. and its affiliates
+ */
+
+#if defined(CONFIG_DDR4)
+
+/* DDR4 MPR/PDA Interface */
+#include "ddr3_init.h"
+#include "mv_ddr4_mpr_pda_if.h"
+#include "mv_ddr4_training.h"
+#include "mv_ddr_training_db.h"
+#include "mv_ddr_common.h"
+#include "mv_ddr_regs.h"
+
+static u8 dram_to_mc_dq_map[MAX_BUS_NUM][BUS_WIDTH_IN_BITS];
+static int dq_map_enable;
+
+static u32 mv_ddr4_tx_odt_get(void)
+{
+	u16 odt = 0xffff, rtt = 0xffff;
+
+	if (g_odt_config & 0xe0000)
+		rtt =  mv_ddr4_rtt_nom_to_odt(g_rtt_nom);
+	else if (g_odt_config & 0x10000)
+		rtt = mv_ddr4_rtt_wr_to_odt(g_rtt_wr);
+	else
+		return odt;
+
+	return (odt * rtt) / (odt + rtt);
+}
+
+/*
+ * mode registers initialization function
+ * replaces all MR writes in DDR3 init function
+ */
+int mv_ddr4_mode_regs_init(u8 dev_num)
+{
+	int status;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	enum hws_access_type access_type = ACCESS_TYPE_UNICAST;
+	u32 if_id;
+	u32 cl, cwl;
+	u32 val, mask;
+	u32 t_wr, t_ckclk;
+	/* design GL params to be set outside */
+	u32 dic = 0;
+	u32 ron = 30; /* znri */
+	u32 rodt = mv_ddr4_tx_odt_get(); /* effective rtt */
+	/* vref percentage presented as 100 x percentage value (e.g., 6000 = 100 x 60%) */
+	u32 vref = ((ron + rodt / 2) * 10000) / (ron + rodt);
+	u32 range = (vref >= 6000) ? 0 : 1; /* if vref is >= 60%, use upper range */
+	u32 tap;
+	u32 refresh_mode;
+
+	if (range == 0)
+		tap = (vref - 6000) / 65;
+	else
+		tap = (vref - 4500) / 65;
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		cl = tm->interface_params[if_id].cas_l;
+		cwl = tm->interface_params[if_id].cas_wl;
+		t_ckclk = MEGA / mv_ddr_freq_get(tm->interface_params[if_id].memory_freq);
+		t_wr = time_to_nclk(mv_ddr_speed_bin_timing_get(tm->interface_params[if_id].speed_bin_index,
+					    SPEED_BIN_TWR), t_ckclk) - 1;
+
+		/* TODO: replace hard-coded values with appropriate defines */
+		/* DDR4 MR0 */
+		/*
+		 * [6:4,2] bits to be taken from S@R frequency and speed bin
+		 * rtt_nom to be taken from the algorithm definition
+		 * dic to be taken from the algorithm definition -
+		 * set to 0x1 (for driver rzq/5 = 48 ohm) or
+		 * set to 0x0 (for driver rzq/7 = 34 ohm)
+		 */
+		/* set dll reset, 0x1900[8] to 0x1 */
+		/* set tm, 0x1900[7] to 0x0 */
+		/* set rbt, 0x1900[3] to 0x0 */
+		/* set bl, 0x1900[1:0] to 0x0 */
+		val = ((cl_mask_table[cl] & 0x1) << 2) |
+		      (((cl_mask_table[cl] & 0xe) >> 1)  <<  4) |
+		      (twr_mask_table[t_wr + 1] << 9) |
+		      (0x1 << 8) | (0x0 << 7) | (0x0 << 3) | 0x0;
+		mask = (0x1 << 2) | (0x7 << 4) | (0x7 << 9) |
+		       (0x1 << 8) | (0x1 << 7) | (0x1 << 3) | 0x3;
+		status = ddr3_tip_if_write(dev_num, access_type, if_id, DDR4_MR0_REG,
+					   val, mask);
+		if (status != MV_OK)
+			return status;
+
+		/* DDR4 MR1 */
+		/* set rtt nom to 0 if rtt park is activated (not zero) */
+		if ((g_rtt_park >> 6) != 0x0)
+			g_rtt_nom = 0;
+		/* set tdqs, 0x1904[11] to 0x0 */
+		/* set al, 0x1904[4:3] to 0x0 */
+		/* dic, 0x1904[2:1] */
+		/* dll enable */
+		val = g_rtt_nom | (0x0 << 11) | (0x0 << 3) | (dic << 1) | 0x1;
+		mask = (0x7 << 8) | (0x1 << 11) | (0x3 << 3) | (0x3 << 1) | 0x1;
+		status = ddr3_tip_if_write(dev_num, access_type, if_id, DDR4_MR1_REG,
+					   val, mask);
+		if (status != MV_OK)
+			return status;
+
+		/* DDR4 MR2 */
+		/* set rtt wr, 0x1908[10,9] to 0x0 */
+		/* set wr crc, 0x1908[12] to 0x0 */
+		/* cwl */
+		val = g_rtt_wr | (0x0 << 12) | (cwl_mask_table[cwl] << 3);
+		mask = (0x3 << 9) | (0x1 << 12) | (0x7 << 3);
+		status = ddr3_tip_if_write(dev_num, access_type, if_id, DDR4_MR2_REG,
+					   val, mask);
+		if (status != MV_OK)
+			return status;
+
+		/* DDR4 MR3 */
+		/* set fgrm, 0x190C[8:6] to 0x0 */
+		/* set gd, 0x190C[3] to 0x0 */
+		refresh_mode = (tm->interface_params[if_id].interface_temp == MV_DDR_TEMP_HIGH) ? 1 : 0;
+
+		val = (refresh_mode << 6) | (0x0 << 3);
+		mask = (0x7 << 6) | (0x1 << 3);
+		status = ddr3_tip_if_write(dev_num, access_type, if_id, DDR4_MR3_REG,
+					   val, mask);
+		if (status != MV_OK)
+			return status;
+
+		/* DDR4 MR4 */
+		/*
+		 * set wp, 0x1910[12] to 0x0
+		 * set rp, 0x1910[11] to 0x0
+		 * set rp training, 0x1910[10] to 0x0
+		 * set sra, 0x1910[9] to 0x0
+		 * set cs2cmd, 0x1910[8:6] to 0x0
+		 * set mpd, 0x1910[1] to 0x0
+		 */
+		mask = (0x1 << 12) | (0x1 << 11) | (0x1 << 10) | (0x1 << 9) | (0x7 << 6) | (0x1 << 1);
+		val =  (0x0 << 12) | (0x1 << 11) | (0x0 << 10) | (0x0 << 9) | (0x0 << 6) | (0x0 << 1);
+
+		status = ddr3_tip_if_write(dev_num, access_type, if_id, DDR4_MR4_REG,
+					   val, mask);
+		if (status != MV_OK)
+			return status;
+
+		/* DDR4 MR5 */
+		/*
+		 * set rdbi, 0x1914[12] to 0x0 during init sequence (may be enabled with
+		 * op cmd mrs - bug in z1, to be fixed in a0)
+		 * set wdbi, 0x1914[11] to 0x0
+		 * set dm, 0x1914[10] to 0x1
+		 * set ca_pl, 0x1914[2:0] to 0x0
+		 * set odt input buffer during power down mode, 0x1914[5] to 0x1
+		 */
+		mask = (0x1 << 12) | (0x1 << 11) | (0x1 << 10) | (0x7 << 6) | (0x1 << 5) | 0x7;
+		val = (0x0 << 12) | (0x0 << 11) | (0x1 << 10) | g_rtt_park | (0x1 << 5) | 0x0;
+		status = ddr3_tip_if_write(dev_num, access_type, if_id, DDR4_MR5_REG,
+					   val, mask);
+		if (status != MV_OK)
+			return status;
+
+		/* DDR4 MR6 */
+		/*
+		 * set t_ccd_l, 0x1918[12:10] to 0x0, 0x2, or 0x4 (z1 supports only even
+		 * values, to be fixed in a0)
+		 * set vdq te, 0x1918[7] to 0x0
+		 * set vdq tv, 0x1918[5:0] to vref training value
+		 */
+		mask = (0x7 << 10) | (0x1 << 7) | (0x1 << 6) | 0x3f;
+		val = (0x2 << 10) | (0x0 << 7) | (range << 6) | tap;
+		status = ddr3_tip_if_write(dev_num, access_type, if_id, DDR4_MR6_REG,
+					   val, mask);
+		if (status != MV_OK)
+			return status;
+	}
+
+	return MV_OK;
+}
+
+/* enter mpr read mode */
+static int mv_ddr4_mpr_read_mode_enable(u8 dev_num, u32 mpr_num, u32 page_num,
+				 enum mv_ddr4_mpr_read_format read_format)
+{
+	/*
+	 * enable MPR page 2 mpr mode in DDR4 MR3
+	 * read_format: 0 for serial, 1 for parallel, and 2 for staggered
+	 * TODO: add support for cs, multicast or unicast, and if id
+	 */
+	int status;
+	u32 val, mask, if_id = 0;
+
+	if (page_num != 0) {
+		/* serial is the only read format if the page is other than 0 */
+		read_format = MV_DDR4_MPR_READ_SERIAL;
+	}
+
+	val = (page_num << 0) | (0x1 << 2) | (read_format << 11);
+	mask = (0x3 << 0) | (0x1 << 2) | (0x3 << 11);
+
+	/* cs0 */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DDR4_MR3_REG, val, mask);
+	if (status != MV_OK)
+		return status;
+
+	/* op cmd: cs0, cs1 are on, cs2, cs3 are off */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, SDRAM_OP_REG,
+				   (0x9 | (0xc << 8)) , (0x1f | (0xf << 8)));
+	if (status != MV_OK)
+		return status;
+
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id, 0, 0x1f, SDRAM_OP_REG,
+				MAX_POLLING_ITERATIONS) != MV_OK) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_mpr_read_mode_enable: DDR3 poll failed(MPR3)\n"));
+	}
+
+	return MV_OK;
+}
+
+/* exit mpr read or write mode */
+static int mv_ddr4_mpr_mode_disable(u8 dev_num)
+{
+	/* TODO: add support for cs, multicast or unicast, and if id */
+	int status;
+	u32 val, mask, if_id = 0;
+
+	/* exit mpr */
+	val =  0x0 << 2;
+	mask =  0x1 << 2;
+	/* cs0 */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DDR4_MR3_REG, val, mask);
+	if (status != MV_OK)
+		return status;
+
+	/* op cmd: cs0, cs1 are on, cs2, cs3 are off */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, SDRAM_OP_REG,
+				   (0x9 | (0xc << 8)) , (0x1f | (0xf << 8)));
+	if (status != MV_OK)
+		return status;
+
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id, 0, 0x1f, SDRAM_OP_REG,
+				MAX_POLLING_ITERATIONS) != MV_OK) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_mpr_mode_disable: DDR3 poll failed(MPR3)\n"));
+	}
+
+	return MV_OK;
+}
+
+/* translate dq read value per dram dq pin */
+static int mv_ddr4_dq_decode(u8 dev_num, u32 *data)
+{
+	u32 subphy_num, dq_num;
+	u32 dq_val = 0, raw_data, idx;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	u32 subphy_max = ddr3_tip_dev_attr_get(dev_num, MV_ATTR_OCTET_PER_INTERFACE);
+
+	/* assume the third word is stable */
+	raw_data = data[2];
+
+	/* skip ecc subphy; TODO: check to add support for ecc */
+	if (subphy_max % 2)
+		subphy_max -= 1;
+
+	for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+		VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+		for (dq_num = 0; dq_num < BUS_WIDTH_IN_BITS; dq_num++) {
+			idx = (dram_to_mc_dq_map[subphy_num][dq_num] + (subphy_num * BUS_WIDTH_IN_BITS));
+			dq_val |= (((raw_data & (1 << idx)) >> idx) << ((subphy_num * BUS_WIDTH_IN_BITS) + dq_num));
+		}
+	}
+
+	/* update burst words[0..7] with correct mapping */
+	for (idx = 0; idx < EXT_ACCESS_BURST_LENGTH; idx++)
+		data[idx] = dq_val;
+
+	return MV_OK;
+}
+
+/*
+ * read mpr value per requested format and type
+ * note: for parallel decoded read, data is presented as stored in mpr on dram side,
+ *	for all others, data is presented "as is" (i.e. per dq order from high to low
+ *	and bus pins connectivity).
+ */
+int mv_ddr4_mpr_read(u8 dev_num, u32 mpr_num, u32 page_num,
+		      enum mv_ddr4_mpr_read_format read_format,
+		      enum mv_ddr4_mpr_read_type read_type,
+		      u32 *data)
+{
+	/* TODO: add support for multiple if_id, dev num, and cs */
+	u32 word_idx, if_id = 0;
+	volatile unsigned long *addr = NULL;
+
+	/* enter mpr read mode */
+	mv_ddr4_mpr_read_mode_enable(dev_num, mpr_num, page_num, read_format);
+
+	/* set pattern type */
+	ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DDR4_MPR_WR_REG,
+			  mpr_num << 8, 0x3 << 8);
+
+	for (word_idx = 0; word_idx < EXT_ACCESS_BURST_LENGTH; word_idx++) {
+		data[word_idx] = *addr;
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("mv_ddr4_mpr_read: addr 0x%08lx, data 0x%08x\n",
+						     (unsigned long)addr, data[word_idx]));
+		addr++;
+	}
+
+	/* exit mpr read mode */
+	mv_ddr4_mpr_mode_disable(dev_num);
+
+	/* decode mpr read value (only parallel mode supported) */
+	if ((read_type == MV_DDR4_MPR_READ_DECODED) && (read_format == MV_DDR4_MPR_READ_PARALLEL)) {
+		if (dq_map_enable == 1) {
+			mv_ddr4_dq_decode(dev_num, data);
+		} else {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_mpr_read: run mv_ddr4_dq_pins_mapping()\n"));
+			return MV_FAIL;
+		}
+	}
+
+	return MV_OK;
+}
+
+/* enter mpr write mode */
+static int mv_ddr4_mpr_write_mode_enable(u8 dev_num, u32 mpr_location, u32 page_num, u32 data)
+{
+	/*
+	 * enable MPR page 2 mpr mode in DDR4 MR3
+	 * TODO: add support for cs, multicast or unicast, and if id
+	 */
+	int status;
+	u32 if_id = 0, val = 0, mask;
+
+	val = (page_num << 0) | (0x1 << 2);
+	mask = (0x3 << 0) | (0x1 << 2);
+	/* cs0 */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DDR4_MR3_REG, val, mask);
+	if (status != MV_OK)
+		return status;
+
+	/* cs0 */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DDR4_MPR_WR_REG,
+				   (mpr_location << 8) | data, 0x3ff);
+	if (status != MV_OK)
+		return status;
+
+	/* op cmd: cs0, cs1 are on, cs2, cs3 are off */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, SDRAM_OP_REG,
+				   (0x13 | 0xc << 8) , (0x1f | (0xf << 8)));
+	if (status != MV_OK)
+		return status;
+
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id,  0, 0x1f, SDRAM_OP_REG,
+				MAX_POLLING_ITERATIONS) != MV_OK) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_mpr_write_mode_enable: DDR3 poll failed(MPR3)\n"));
+	}
+
+	return MV_OK;
+}
+
+/* write mpr value */
+int mv_ddr4_mpr_write(u8 dev_num, u32 mpr_location, u32 mpr_num, u32 page_num, u32 data)
+{
+	/* enter mpr write mode */
+	mv_ddr4_mpr_write_mode_enable(dev_num, mpr_location, page_num, data);
+
+	/* TODO: implement this function */
+
+	/* TODO: exit mpr write mode */
+
+	return MV_OK;
+}
+
+/*
+ * map physical on-board connection of dram dq pins to ddr4 controller pins
+ * note: supports only 32b width
+ * TODO: add support for 64-bit bus width and ecc subphy
+ */
+int mv_ddr4_dq_pins_mapping(u8 dev_num)
+{
+	static int run_once;
+	u8 dq_val[MAX_BUS_NUM][BUS_WIDTH_IN_BITS] = { {0} };
+	u32 mpr_pattern[MV_DDR4_MPR_READ_PATTERN_NUM][EXT_ACCESS_BURST_LENGTH] = { {0} };
+	u32 subphy_num, dq_num, mpr_type;
+	u8 subphy_pattern[3];
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	u32 subphy_max = ddr3_tip_dev_attr_get(dev_num, MV_ATTR_OCTET_PER_INTERFACE);
+
+	if (run_once)
+		return MV_OK;
+	else
+		run_once++;
+
+	/* clear dq mapping */
+	memset(dram_to_mc_dq_map, 0, sizeof(dram_to_mc_dq_map));
+
+	/* stage 1: read page 0 mpr0..2 raw patterns */
+	for (mpr_type = 0; mpr_type < MV_DDR4_MPR_READ_PATTERN_NUM; mpr_type++)
+		mv_ddr4_mpr_read(dev_num, mpr_type, 0, MV_DDR4_MPR_READ_PARALLEL,
+				 MV_DDR4_MPR_READ_RAW, mpr_pattern[mpr_type]);
+
+	/* stage 2: map every dq for each subphy to 3-bit value, create local database */
+	/* skip ecc subphy; TODO: check to add support for ecc */
+	if (subphy_max % 2)
+		subphy_max -= 1;
+
+	for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+		VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+		/* extract pattern for each subphy */
+		for (mpr_type = 0; mpr_type < MV_DDR4_MPR_READ_PATTERN_NUM; mpr_type++)
+			subphy_pattern[mpr_type] = ((mpr_pattern[mpr_type][2] >> (subphy_num * 8)) & 0xff);
+
+		for (dq_num = 0; dq_num < BUS_WIDTH_IN_BITS; dq_num++)
+			for (mpr_type = 0; mpr_type < MV_DDR4_MPR_READ_PATTERN_NUM; mpr_type++)
+				dq_val[subphy_num][dq_num] += (((subphy_pattern[mpr_type] >> dq_num) & 1) *
+							       (1 << mpr_type));
+	}
+
+	/* stage 3: map dram dq to mc dq and update database */
+	for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+		VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+		for (dq_num = 0; dq_num < BUS_WIDTH_IN_BITS; dq_num++)
+			dram_to_mc_dq_map[subphy_num][7 - dq_val[subphy_num][dq_num]] = dq_num;
+	}
+
+	/* set dq_map_enable */
+	dq_map_enable = 1;
+
+	return MV_OK;
+}
+
+/* enter or exit dram vref training mode */
+int mv_ddr4_vref_training_mode_ctrl(u8 dev_num, u8 if_id, enum hws_access_type access_type, int enable)
+{
+	int status;
+	u32 val, mask;
+
+	/* DDR4 MR6 */
+	/*
+	 * set t_ccd_l, 0x1918[12:10] to 0x0, 0x2, or 0x4 (z1 supports only even
+	 * values, to be fixed in a0)
+	 * set vdq te, 0x1918[7] to 0x0
+	 * set vdq tv, 0x1918[5:0] to vref training value
+	 */
+
+	val = (((enable == 1) ? 1 : 0) << 7);
+	mask = (0x1 << 7);
+	status = ddr3_tip_if_write(dev_num, access_type, if_id, DDR4_MR6_REG, val, mask);
+	if (status != MV_OK)
+		return status;
+
+	/* write DDR4 MR6 cs configuration; only cs0, cs1 supported */
+	if (effective_cs == 0)
+		val = 0xe;
+	else
+		val = 0xd;
+	val <<= 8;
+	/* write DDR4 MR6 command */
+	val |= 0x12;
+	mask = (0xf << 8) | 0x1f;
+	status = ddr3_tip_if_write(dev_num, access_type, if_id, SDRAM_OP_REG, val, mask);
+	if (status != MV_OK)
+		return status;
+
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id,  0, 0x1f, SDRAM_OP_REG,
+				MAX_POLLING_ITERATIONS) != MV_OK) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_vref_training_mode_ctrl: Polling command failed\n"));
+	}
+
+	return MV_OK;
+}
+
+/* set dram vref tap value */
+int mv_ddr4_vref_tap_set(u8 dev_num, u8 if_id, enum hws_access_type access_type,
+			 u32 taps_num, enum mv_ddr4_vref_tap_state state)
+{
+	int status;
+	u32 range, vdq_tv;
+
+	/* disable and then enable the training with a new range */
+	if ((state == MV_DDR4_VREF_TAP_BUSY) && ((taps_num + MV_DDR4_VREF_STEP_SIZE) >= 23) &&
+	    (taps_num < 23))
+		state = MV_DDR4_VREF_TAP_FLIP;
+
+	if (taps_num < 23) {
+		range = 1;
+		vdq_tv = taps_num;
+	} else {
+		range = 0;
+		vdq_tv = taps_num - 23;
+	}
+
+	if ((state == MV_DDR4_VREF_TAP_FLIP) || (state == MV_DDR4_VREF_TAP_START)) {
+		/* 0 to disable */
+		status = mv_ddr4_vref_set(dev_num, if_id, access_type, range, vdq_tv, 0);
+		if (status != MV_OK)
+			return status;
+		/* 1 to enable */
+		status = (mv_ddr4_vref_set(dev_num, if_id, access_type, range, vdq_tv, 1));
+		if (status != MV_OK)
+			return status;
+	} else if (state == MV_DDR4_VREF_TAP_END) {
+		/* 1 to enable */
+		status = (mv_ddr4_vref_set(dev_num, if_id, access_type, range, vdq_tv, 1));
+		if (status != MV_OK)
+			return status;
+		/* 0 to disable */
+		status = mv_ddr4_vref_set(dev_num, if_id, access_type, range, vdq_tv, 0);
+		if (status != MV_OK)
+			return status;
+	} else {
+		/* 1 to enable */
+		status = (mv_ddr4_vref_set(dev_num, if_id, access_type, range, vdq_tv, 1));
+		if (status != MV_OK)
+			return status;
+	}
+
+	return MV_OK;
+}
+
+/* set dram vref value */
+int mv_ddr4_vref_set(u8 dev_num, u8 if_id, enum hws_access_type access_type,
+		     u32 range, u32 vdq_tv, u8 vdq_training_ena)
+{
+	int status;
+	u32 read_data;
+	u32 val, mask;
+
+	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("mv_ddr4_vref_set: range %d, vdq_tv %d\n", range, vdq_tv));
+
+	/* DDR4 MR6 */
+	/*
+	 * set t_ccd_l, 0x1918[12:10] to 0x0, 0x2, or 0x4 (z1 supports only even
+	 * values, to be fixed in a0)
+	 * set vdq te, 0x1918[7] to 0x0
+	 * set vdq tr, 0x1918[6] to 0x0 to disable or 0x1 to enable
+	 * set vdq tv, 0x1918[5:0] to vref training value
+	 */
+	val = (vdq_training_ena << 7) | (range << 6) | vdq_tv;
+	mask = (0x0 << 7) | (0x1 << 6) | 0x3f;
+
+	status = ddr3_tip_if_write(dev_num, access_type, if_id, DDR4_MR6_REG, val, mask);
+	if (status != MV_OK)
+		return status;
+
+	ddr3_tip_if_read(dev_num, access_type, if_id, DDR4_MR6_REG, &read_data, 0xffffffff);
+	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("mv_ddr4_vref_set: MR6 = 0x%x\n", read_data));
+
+	/* write DDR4 MR6 cs configuration; only cs0, cs1 supported */
+	if (effective_cs == 0)
+		val = 0xe;
+	else
+		val = 0xd;
+	val <<= 8;
+	/* write DDR4 MR6 command */
+	val |= 0x12;
+	mask = (0xf << 8) | 0x1f;
+	status = ddr3_tip_if_write(dev_num, access_type, if_id, SDRAM_OP_REG, val, mask);
+	if (status != MV_OK)
+		return status;
+
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id,  0, 0x1F, SDRAM_OP_REG,
+				MAX_POLLING_ITERATIONS) != MV_OK) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_vref_set: Polling command failed\n"));
+	}
+
+	return MV_OK;
+}
+
+/* pda - load pattern to odpg */
+int mv_ddr4_pda_pattern_odpg_load(u32 dev_num, enum hws_access_type access_type,
+				  u32 if_id, u32 subphy_mask, u32 cs_num)
+{
+	int status;
+	u32 pattern_len_count = 0;
+	u32 data_low[KILLER_PATTERN_LENGTH] = {0};
+	u32 data_high[KILLER_PATTERN_LENGTH] = {0};
+	u32 val, mask, subphy_num;
+
+	/*
+	 * set 0x1630[10:5] bits to 0x3 (0x1 for 16-bit bus width)
+	 * set 0x1630[14:11] bits to 0x3 (0x1 for 16-bit bus width)
+	 */
+	val = (cs_num << 26) | (0x1 << 25) | (0x3 << 11) | (0x3 << 5) | 0x1;
+	mask = (0x3 << 26) | (0x1 << 25) | (0x3f << 11) | (0x3f << 5) | 0x1;
+	status = ddr3_tip_if_write(dev_num, access_type, if_id, ODPG_DATA_CTRL_REG, val, mask);
+	if (status != MV_OK)
+		return status;
+
+	if (subphy_mask != 0xf) {
+		for (subphy_num = 0; subphy_num < 4; subphy_num++)
+			if (((subphy_mask >> subphy_num) & 0x1) == 0)
+				data_low[0] = (data_low[0] | (0xff << (subphy_num * 8)));
+	} else
+		data_low[0] = 0;
+
+	for (pattern_len_count = 0; pattern_len_count < 4; pattern_len_count++) {
+		data_low[pattern_len_count] = data_low[0];
+		data_high[pattern_len_count] = data_low[0];
+	}
+
+	for (pattern_len_count = 0; pattern_len_count < 4 ; pattern_len_count++) {
+		status = ddr3_tip_if_write(dev_num, access_type, if_id, ODPG_DATA_WR_DATA_LOW_REG,
+					   data_low[pattern_len_count], MASK_ALL_BITS);
+		if (status != MV_OK)
+			return status;
+
+		status = ddr3_tip_if_write(dev_num, access_type, if_id, ODPG_DATA_WR_DATA_HIGH_REG,
+					   data_high[pattern_len_count], MASK_ALL_BITS);
+		if (status != MV_OK)
+			return status;
+
+		status = ddr3_tip_if_write(dev_num, access_type, if_id, ODPG_DATA_WR_ADDR_REG,
+					   pattern_len_count, MASK_ALL_BITS);
+		if (status != MV_OK)
+			return status;
+	}
+
+	status = ddr3_tip_if_write(dev_num, access_type, if_id, ODPG_DATA_BUFFER_OFFS_REG,
+				   0x0, MASK_ALL_BITS);
+	if (status != MV_OK)
+		return status;
+
+	return MV_OK;
+}
+
+/* enable or disable pda */
+int mv_ddr4_pda_ctrl(u8 dev_num, u8 if_id, u8 cs_num, int enable)
+{
+	/*
+	 * if enable is 0, exit
+	 * mrs to be directed to all dram devices
+	 * the calling function is responsible for changing odpg to 0x0
+	 */
+
+	int status;
+	enum hws_access_type access_type = ACCESS_TYPE_UNICAST;
+	u32 val, mask;
+
+	/* per dram addressability enable */
+	val = ((enable == 1) ? 1 : 0);
+	val <<= 4;
+	mask = 0x1 << 4;
+	status = ddr3_tip_if_write(dev_num, access_type, if_id, DDR4_MR3_REG, val, mask);
+	if (status != MV_OK)
+		return status;
+
+	/* write DDR4 MR3 cs configuration; only cs0, cs1 supported */
+	if (cs_num == 0)
+		val = 0xe;
+	else
+		val = 0xd;
+	val <<= 8;
+	/* write DDR4 MR3 command */
+	val |= 0x9;
+	mask = (0xf << 8) | 0x1f;
+	status = ddr3_tip_if_write(dev_num, access_type, if_id, SDRAM_OP_REG, val, mask);
+	if (status != MV_OK)
+		return status;
+
+	if (enable == 0) {
+		/* check odpg access is done */
+		if (mv_ddr_is_odpg_done(MAX_POLLING_ITERATIONS) != MV_OK)
+			return MV_FAIL;
+	}
+
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id, 0, 0x1f, SDRAM_OP_REG,
+				MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_pda_ctrl: Polling command failed\n"));
+
+	return MV_OK;
+}
+#endif /* CONFIG_DDR4 */
diff --git a/drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h b/drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h
new file mode 100644
index 0000000000..347a1b2237
--- /dev/null
+++ b/drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) Marvell International Ltd. and its affiliates
+ */
+
+#ifndef _MV_DDR4_MPR_PDA_IF_H
+#define _MV_DDR4_MPR_PDA_IF_H
+
+#include "ddr3_init.h"
+#include "mv_ddr_common.h"
+
+#define MV_DDR4_VREF_STEP_SIZE	3
+#define MV_DDR4_VREF_MIN_RANGE	1
+#define MV_DDR4_VREF_MAX_RANGE	73
+#define MV_DDR4_VREF_MAX_COUNT	(((MV_DDR4_VREF_MAX_RANGE - MV_DDR4_VREF_MIN_RANGE) / MV_DDR4_VREF_STEP_SIZE) + 2)
+
+#define MV_DDR4_MPR_READ_PATTERN_NUM	3
+
+enum mv_ddr4_mpr_read_format {
+	MV_DDR4_MPR_READ_SERIAL,
+	MV_DDR4_MPR_READ_PARALLEL,
+	MV_DDR4_MPR_READ_STAGGERED,
+	MV_DDR4_MPR_READ_RSVD_TEMP
+};
+
+enum mv_ddr4_mpr_read_type {
+	MV_DDR4_MPR_READ_RAW,
+	MV_DDR4_MPR_READ_DECODED
+};
+
+enum mv_ddr4_vref_tap_state {
+	MV_DDR4_VREF_TAP_START,
+	MV_DDR4_VREF_TAP_BUSY,
+	MV_DDR4_VREF_TAP_FLIP,
+	MV_DDR4_VREF_TAP_END
+};
+
+int mv_ddr4_mode_regs_init(u8 dev_num);
+int mv_ddr4_mpr_read(u8 dev_num, u32 mpr_num, u32 page_num,
+		     enum mv_ddr4_mpr_read_format read_format,
+		     enum mv_ddr4_mpr_read_type read_type,
+		     u32 *data);
+int mv_ddr4_mpr_write(u8 dev_num, u32 mpr_location, u32 mpr_num,
+		      u32 page_num, u32 data);
+int mv_ddr4_dq_pins_mapping(u8 dev_num);
+int mv_ddr4_vref_training_mode_ctrl(u8 dev_num, u8 if_id,
+				 enum hws_access_type access_type,
+				 int enable);
+int mv_ddr4_vref_tap_set(u8 dev_num, u8 if_id,
+			 enum hws_access_type access_type,
+			 u32 taps_num,
+			 enum mv_ddr4_vref_tap_state state);
+int mv_ddr4_vref_set(u8 dev_num, u8 if_id, enum hws_access_type access_type,
+		     u32 range, u32 vdq_tv, u8 vdq_training_ena);
+int mv_ddr4_pda_pattern_odpg_load(u32 dev_num, enum hws_access_type access_type,
+				  u32 if_id, u32 subphy_mask, u32 cs_num);
+int mv_ddr4_pda_ctrl(u8 dev_num, u8 if_id, u8 cs_num, int enable);
+
+#endif /* _MV_DDR4_MPR_PDA_IF_H */
diff --git a/drivers/ddr/marvell/a38x/mv_ddr4_training.c b/drivers/ddr/marvell/a38x/mv_ddr4_training.c
new file mode 100644
index 0000000000..cefc2e8b40
--- /dev/null
+++ b/drivers/ddr/marvell/a38x/mv_ddr4_training.c
@@ -0,0 +1,565 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) Marvell International Ltd. and its affiliates
+ */
+
+#if defined(CONFIG_DDR4)
+
+/* DDR4 training service API and data structures */
+
+#include "ddr3_init.h"
+#include "mv_ddr4_training.h"
+#include "mv_ddr4_mpr_pda_if.h"
+#include "mv_ddr4_training_leveling.h"
+#include "mv_ddr4_training_calibration.h"
+#include "mv_ddr_regs.h"
+
+/* 1 for wa and sstl and pod to get the same vref value */
+u8 vref_calibration_wa = 1;
+
+static int a39x_z1_config(u32 dev_num);
+
+/* vref values for vcommon */
+static u16 vref_val[] = {
+	746,
+	654,
+	671,
+	686,
+	701,
+	713,
+	725,
+	736
+};
+
+static u32 mv_ddr4_config_phy_vref_tap;
+
+/* configure DDR4 SDRAM */
+int mv_ddr4_sdram_config(u32 dev_num)
+{
+	/* TODO: zq params to be frequency dependent */
+	u32 zq_init = 1023;
+	u32 zq_oper = 511;
+	u32 zq_cs = 127;
+	u32 if_id;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	int status;
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+
+		/* dtype: 0x3 for DDR4, 0x1 for DDR3 */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, SDRAM_CFG_REG,
+					   (0x1 << 14) | (0x1 << 20), (0x1 << 14) | (0x1 << 20));
+		if (status != MV_OK)
+			return status;
+
+		/* cpm */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DRAM_PINS_MUX_REG,
+					   0x2, 0x3);
+		if (status != MV_OK)
+			return status;
+
+		/*
+		 * set t_dllk to 1024, the maximum of the minimum values for the high speed bins
+		 * TODO: may change for future speed bins
+		 */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DRAM_DLL_TIMING_REG,
+					   0x400, 0xfff);
+		if (status != MV_OK)
+			return status;
+
+		/* set zq_init */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DRAM_ZQ_INIT_TIMIMG_REG,
+					   zq_init, 0xfff);
+		if (status != MV_OK)
+			return status;
+
+		/* set zq_oper */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DRAM_ZQ_TIMING_REG,
+					   zq_oper, 0x7ff);
+		if (status != MV_OK)
+			return status;
+
+		/* set zq_cs */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DRAM_ZQ_TIMING_REG,
+					   zq_cs << 16, 0x3ff0000);
+		if (status != MV_OK)
+			return status;
+
+		/*
+		 * set registered dimm to unbuffered dimm
+		 * TODO: support registered dimm
+		 */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, SDRAM_CFG_REG,
+					   0x0, 0x1 << 17);
+		if (status != MV_OK)
+			return status;
+	}
+
+	a39x_z1_config(dev_num);
+
+	return MV_OK;
+}
+
+u16 mv_ddr4_rtt_nom_to_odt(u16 rtt_nom)
+{
+	u8 odt;
+
+	if (rtt_nom == 0)
+		odt = 0xff;
+	else if (rtt_nom == (1 << 8))
+		odt = 60; /* 240 / 4 */
+	else if (rtt_nom == (2 << 8))
+		odt = 120; /* 240 / 2 */
+	else if (rtt_nom == (3 << 8))
+		odt = 40; /* 240 / 6 */
+	else if (rtt_nom == (4 << 8))
+		odt = 240; /* 240 / 1 */
+	else if (rtt_nom == (5 << 8))
+		odt = 48; /* 240 / 5 */
+	else if (rtt_nom == (6 << 8))
+		odt = 80; /* 240 / 3 */
+	else if (rtt_nom == (7 << 8))
+		odt = 34; /* 240 / 7 */
+	else
+		odt = 1;
+
+	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("mv_ddr4_rtt_nom_to_odt: rtt_nom = %d, odt = %d\n", rtt_nom, odt));
+
+	return odt;
+}
+
+u16 mv_ddr4_rtt_wr_to_odt(u16 rtt_wr)
+{
+	u8 odt;
+
+	if (rtt_wr == 0)
+		odt = 0xff;
+	else if (rtt_wr == (1 << 9))
+		odt = 120; /* 240 / 2 */
+	else if (rtt_wr == (2 << 9))
+		odt = 240; /* 240 / 1 */
+	else
+		odt = 1;
+
+	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("mv_ddr4_rtt_wr_to_odt rtt_wr = %d, odt = %d\n", rtt_wr, odt));
+
+	return odt;
+}
+
+static u32 mv_ddr4_rx_odt_get(void)
+{
+	u16 odt = odt_intercept[(int)g_zpodt_data / 8] - (g_zpodt_data * odt_slope[(int)g_zpodt_data / 8]) / 100;
+	u16 rtt;
+
+	if (g_odt_config & 0xf) {
+		rtt = mv_ddr4_rtt_nom_to_odt(g_rtt_nom);
+		odt = (odt * rtt) / (odt + rtt);
+	}
+
+	return odt;
+}
+
+static u8 mv_ddr4_vcommon_to_vref(u16 vcommon)
+{
+	u8 vref_tap;
+
+	if ((vcommon > 600) && (vcommon <= 662)) {
+		vref_tap = 1;
+	} else if ((vcommon > 662) && (vcommon <= 679)) {
+		vref_tap = 2;
+	} else if ((vcommon > 679) && (vcommon <= 693)) {
+		vref_tap = 3;
+	} else if ((vcommon > 693) && (vcommon <= 707)) {
+		vref_tap = 4;
+	} else if ((vcommon > 707) && (vcommon <= 719)) {
+		vref_tap = 5;
+	} else if ((vcommon > 719) && (vcommon <= 725)) {
+		vref_tap = 6;
+	} else if ((vcommon > 725) && (vcommon <= 731)) {
+		vref_tap = 7;
+	} else if ((vcommon > 731) && (vcommon <= 800)) {
+		vref_tap = 0;
+	} else if (vcommon > 800) {
+		vref_tap = 0;
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("mv_ddr4_vcommon_to_vref: warning: vcommon value too high: %d\n", vcommon));
+	} else if (vcommon < 600) {
+		vref_tap = 1;
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("mv_ddr4_vcommon_to_vref: warning: vcommon value too low: %d\n", vcommon));
+	} else {
+		vref_tap = 1;
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("mv_ddr4_vcommon_to_vref: warning: vcommon out of range: %d\n", vcommon));
+	}
+
+	return vref_tap;
+}
+
+/* configure phy */
+int mv_ddr4_phy_config(u32 dev_num)
+{
+	u8 cs, i, pod_val;
+	u32 upper_pcal, left_pcal, upper_ncal;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	/* design GL params to be set outside */
+	u32 ron = 34; /* dic - rzq / 6 or rzq / 7 */
+	u32 rodt = mv_ddr4_rx_odt_get(); /* effective odt per DGL */
+	u32 vcommon = (1000 * (ron + rodt / 2)) / (ron + rodt);
+	u32 vref_idx;
+	u8 rc_tap;
+	u8 subphy_max = ddr3_tip_dev_attr_get(dev_num, MV_ATTR_OCTET_PER_INTERFACE);
+	int status;
+
+	mv_ddr4_config_phy_vref_tap = mv_ddr4_vcommon_to_vref(vcommon);
+
+	/* change calculation for 1GHz frequency */
+	if (tm->interface_params[0].memory_freq == MV_DDR_FREQ_1000)
+		mv_ddr4_config_phy_vref_tap += 2;
+
+	vref_idx = (mv_ddr4_config_phy_vref_tap < 8) ? mv_ddr4_config_phy_vref_tap : 0;
+	rc_tap = (430 * (vref_val[vref_idx] - vcommon)) / 1000 + 33;
+	/* 0x1 for pod mode */
+	pod_val = (vref_calibration_wa == 1) ? 0x0 : 0x1;
+	upper_pcal = pod_val;
+	left_pcal = pod_val;
+	upper_ncal = 0;
+
+	status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ACCESS_TYPE_MULTICAST,
+				    PARAM_NOT_CARE, DDR_PHY_DATA, TEST_ADLL_PHY_REG, pod_val);
+	if (status != MV_OK)
+		return status;
+
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, GP_RSVD0_REG,
+				   (upper_pcal << 12) | (left_pcal << 6) | (upper_ncal << 5), 0x1060);
+	if (status != MV_OK)
+		return status;
+
+	/*
+	 * phy register 0xbf, bit 0 - configure to pod mode (0x1)
+	 * phy register 0xa8, bits [6:4] - configure to clamp (0x0)
+	 * subphys (broadcast) register 0xa8, bits [2:0] - configure to int ref m (0x4),
+	 * TODO: need to write it to control subphys too
+	 * vref tap - configure to SSTL calibration only (4)
+	 * enhanced vref value - set to no clamp (0)
+	 */
+	for (i = 0; i < subphy_max; i++) {
+		VALIDATE_BUS_ACTIVE(tm->bus_act_mask, i);
+		ddr3_tip_bus_read_modify_write(dev_num, ACCESS_TYPE_UNICAST, 0, i, DDR_PHY_DATA, PAD_CFG_PHY_REG,
+					       (0 << 4) | 4, ((0x7 << 4) | 0x7));
+	}
+
+	for (i = 0; i < 3; i++)
+		ddr3_tip_bus_read_modify_write(dev_num, ACCESS_TYPE_UNICAST, 0, i, DDR_PHY_CONTROL, PAD_CFG_PHY_REG,
+					       (0 << 4) | 4 , ((0x7 << 4) | 0x7));
+
+	/* phy register 0xa4, bits [13:7] - configure to 0x7c zpri/znri */
+	status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ACCESS_TYPE_MULTICAST,
+				    PARAM_NOT_CARE, DDR_PHY_DATA, PAD_ZRI_CAL_PHY_REG,
+				    ((0x7f & g_zpri_data) << 7) | (0x7f & g_znri_data));
+	if (status != MV_OK)
+		return status;
+
+	/*
+	 * phy register 0xa6, bits [5:0] - configure to znodt (0x0)
+	 * phy register 0xa6 bits [11:6] - configure to zpodt (60Ohm, 0x1d)
+	 */
+	status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ACCESS_TYPE_MULTICAST,
+				    PARAM_NOT_CARE, DDR_PHY_DATA, PAD_ODT_CAL_PHY_REG, g_zpodt_data << 6);
+	if (status != MV_OK)
+		return status;
+
+	/* update for all active cs */
+	for (cs = 0; cs < MAX_CS_NUM; cs++) {
+		/*
+		 * writes to present cs only
+		 * phy register 0xdb, bits [5:0] - configure to rcvr cal for 50% duty cycle,
+		 * broadcast to all bits cs0 (0x26)
+		 */
+		status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ACCESS_TYPE_MULTICAST,
+					    PARAM_NOT_CARE, DDR_PHY_DATA, VREF_BCAST_PHY_REG(cs), rc_tap);
+		if (status != MV_OK)
+			return status;
+	}
+
+	return MV_OK;
+}
+
+/*
+ * configure sstl for manual calibration and pod for automatic one
+ * assumes subphy configured to pod earlier
+ */
+int mv_ddr4_calibration_adjust(u32 dev_num, u8 vref_en, u8 pod_only)
+{
+	u8 i, if_id = 0;
+	u32 read_data[MAX_INTERFACE_NUM];
+	u32 ncal = 0, pcal = 0;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	int status = MV_OK;
+	u8 subphy_max = ddr3_tip_dev_attr_get(dev_num, MV_ATTR_OCTET_PER_INTERFACE);
+	u8  vref_tap = mv_ddr4_config_phy_vref_tap;
+	u32 vref_idx = (vref_tap < 8) ? vref_tap : 0;
+
+	if (vref_calibration_wa == 0)
+		return mv_ddr4_calibration_validate(dev_num);
+
+	if (vref_en == 1) {
+		/* enhanced vref value set to no clamp (0) */
+		for (i = 0; i < subphy_max; i++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, i);
+			ddr3_tip_bus_read_modify_write(dev_num, ACCESS_TYPE_UNICAST, 0, i, DDR_PHY_DATA,
+						       PAD_CFG_PHY_REG, (0 << 4) | vref_idx, ((0x7 << 4) | 0x7));
+		}
+
+		for (i = 0; i < 3; i++)
+			ddr3_tip_bus_read_modify_write(dev_num, ACCESS_TYPE_UNICAST, 0, i, DDR_PHY_CONTROL,
+						       PAD_CFG_PHY_REG, (0 << 4) | vref_idx, ((0x7 << 4) | 0x7));
+	}
+
+	/* pad calibration control - enable */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, MAIN_PADS_CAL_MACH_CTRL_REG,
+				   (calibration_update_control << 3) | 0x1, (0x3 << 3) | 0x1);
+	if (status != MV_OK)
+		return status;
+
+	/* calibration update external */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id,
+				   MAIN_PADS_CAL_MACH_CTRL_REG, 0x2 << 3, 0x3 << 3);
+	if (status != MV_OK)
+		return status;
+
+	/* poll init calibration done */
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x80000000, 0x80000000,
+				MAIN_PADS_CAL_MACH_CTRL_REG, MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("mv_ddr4_calibration_adjust: calibration polling failed (0)\n"));
+
+	/* poll calibration propagated to io */
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x3ffffff, 0x3ffffff, 0x1674,
+				MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("mv_ddr4_calibration_adjust: calibration polling failed (1)\n"));
+
+	mdelay(10); /* TODO: check it */
+
+	/* disable dynamic */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, MAIN_PADS_CAL_MACH_CTRL_REG, 0, 0x1);
+	if (status != MV_OK)
+		return status;
+
+	/* poll initial calibration done */
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x80000000, 0x80000000,
+				MAIN_PADS_CAL_MACH_CTRL_REG, MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("mv_ddr4_calibration_adjust: calibration polling failed (2)\n"));
+
+	/* poll calibration propagated to io */
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x3ffffff, 0x3ffffff, 0x1674,
+				MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("mv_ddr4_calibration_adjust: calibration polling failed (3)\n"));
+
+	mdelay(10); /* TODO: check why polling insufficient */
+
+	/* read calibration value and set it manually */
+	status = ddr3_tip_if_read(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x1dc8, read_data, MASK_ALL_BITS);
+	if (status != MV_OK)
+		return status;
+
+	ncal = (read_data[if_id] & (0x3f << 10)) >> 10;
+	pcal = (read_data[if_id] & (0x3f << 4)) >> 4;
+	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+			  ("mv_ddr4_calibration_adjust: sstl pcal = 0x%x, ncal = 0x%x\n",
+			   pcal, ncal));
+	if ((ncal >= 56) || (ncal <= 6) || (pcal >= 59) || (pcal <= 7)) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("mv_ddr4_calibration_adjust: error: sstl pcal = 0x%x, ncal = 0x%x out of range\n",
+				   pcal, ncal));
+		status = MV_FAIL;
+	}
+
+	if (pod_only == 0) {
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x1dc8, 0x1 << 3, 0x1 << 3);
+		if (status != MV_OK)
+			return status;
+
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x1dc8,
+					   (ncal << 22) | (pcal << 16), (0x3f << 22) | (0x3f << 16));
+		if (status != MV_OK)
+			return status;
+
+		/* configure to pod mode (0x1) */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+					   GP_RSVD0_REG,
+					   (0x1 << 12) | (0x1 << 6) | (0x1 << 5), 0x1060);
+		if (status != MV_OK)
+			return status;
+
+		status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ACCESS_TYPE_MULTICAST,
+					    PARAM_NOT_CARE, DDR_PHY_DATA, TEST_ADLL_PHY_REG, 0x1);
+		if (status != MV_OK)
+			return status;
+
+		status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ACCESS_TYPE_MULTICAST,
+					    PARAM_NOT_CARE, DDR_PHY_CONTROL, TEST_ADLL_PHY_REG, 0x1);
+		if (status != MV_OK)
+			return status;
+
+		/* pad calibration control - enable */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, MAIN_PADS_CAL_MACH_CTRL_REG,
+					   0x1, 0x1);
+		if (status != MV_OK)
+			return status;
+
+		/* poll initial calibration done */
+		if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x80000000, 0x80000000,
+					MAIN_PADS_CAL_MACH_CTRL_REG, MAX_POLLING_ITERATIONS) != MV_OK)
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+					  ("mv_ddr4_calibration_adjust: calibration polling failed (4)\n"));
+	}
+
+	/* calibration update internal */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, MAIN_PADS_CAL_MACH_CTRL_REG,
+				   calibration_update_control << 3, 0x3 << 3);
+	if (status != MV_OK)
+		return status;
+
+	/* vertical */
+	status = ddr3_tip_if_read(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x14c8, read_data, MASK_ALL_BITS);
+	if (status != MV_OK)
+		return status;
+	ncal = (read_data[if_id] & (0x3f << 10)) >> 10;
+	pcal = (read_data[if_id] & (0x3f << 4)) >> 4;
+	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+			  ("mv_ddr4_calibration_adjust: pod-v pcal = 0x%x, ncal = 0x%x\n",
+			   pcal, ncal));
+	if ((ncal >= 56) || (ncal <= 6) || (pcal >= 59) || (pcal <= 7)) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("mv_ddr4_calibration_adjust: error: pod-v pcal = 0x%x, ncal = 0x%x out of range\n",
+				   pcal, ncal));
+		status = MV_FAIL;
+	}
+
+	/* horizontal */
+	status = ddr3_tip_if_read(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x17c8, read_data, MASK_ALL_BITS);
+	if (status != MV_OK)
+		return status;
+	ncal = (read_data[if_id] & (0x3f << 10)) >> 10;
+	pcal = (read_data[if_id] & (0x3f << 4)) >> 4;
+	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+			  ("mv_ddr4_calibration_adjust: pod-h pcal = 0x%x, ncal = 0x%x\n",
+			   pcal, ncal));
+	if ((ncal >= 56) || (ncal <= 6) || (pcal >= 59) || (pcal <= 7)) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("mv_ddr4_calibration_adjust: error: pod-h pcal = 0x%x, ncal = 0x%x out of range\n",
+				   pcal, ncal));
+		status = MV_FAIL;
+	}
+	/* pad calibration control - disable */
+	status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, MAIN_PADS_CAL_MACH_CTRL_REG,
+				   (calibration_update_control << 3) | 0x0, (0x3 << 3) | 0x1);
+	if (status != MV_OK)
+		return status;
+
+	return status;
+}
+
+static int a39x_z1_config(u32 dev_num)
+{
+	u32 if_id;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	int status;
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		/*
+		 * xbar split bypass - dlb is off,
+		 * when enabled, set to 0x1
+		 */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x1424, 0x0 << 3, 0x1 << 3);
+		if (status != MV_OK)
+			return status;
+
+		/* auto power save option */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x1474, 0x0, 0xffffffff);
+		if (status != MV_OK)
+			return status;
+	}
+
+	return MV_OK;
+}
+
+int mv_ddr4_training_main_flow(u32 dev_num)
+{
+	int status = MV_OK;
+	u16 pbs_tap_factor[MAX_INTERFACE_NUM][MAX_BUS_NUM][BUS_WIDTH_IN_BITS] = {0};
+
+	if (mask_tune_func & RECEIVER_CALIBRATION_MASK_BIT) {
+		training_stage = RECEIVER_CALIBRATION;
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("RECEIVER_CALIBRATION_MASK_BIT #%d\n", effective_cs));
+		status = mv_ddr4_receiver_calibration(dev_num);
+		if (is_reg_dump != 0)
+			ddr3_tip_reg_dump(dev_num);
+		if (status != MV_OK) {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_receiver_calibrate failure\n"));
+			if (debug_mode == 0)
+				return status;
+		}
+	}
+
+	if (mask_tune_func & WL_PHASE_CORRECTION_MASK_BIT) {
+		training_stage = WL_PHASE_CORRECTION;
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("WL_PHASE_CORRECTION_MASK_BIT #%d\n", effective_cs));
+		status = mv_ddr4_dynamic_wl_supp(dev_num);
+		if (is_reg_dump != 0)
+			ddr3_tip_reg_dump(dev_num);
+		if (status != MV_OK) {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_dynamic_wl_supp failure\n"));
+			if (debug_mode == 0)
+				return status;
+		}
+	}
+
+	if (mask_tune_func & DQ_VREF_CALIBRATION_MASK_BIT) {
+		training_stage = DQ_VREF_CALIBRATION;
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("DQ_VREF_CALIBRATION_MASK_BIT #%d\n", effective_cs));
+		status = mv_ddr4_dq_vref_calibration(dev_num, pbs_tap_factor);
+		if (is_reg_dump != 0)
+			ddr3_tip_reg_dump(dev_num);
+		if (status != MV_OK) {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_dq_vref_calibrate failure\n"));
+			if (debug_mode == 0)
+				return status;
+		}
+	}
+
+	if (mask_tune_func & DM_TUNING_MASK_BIT) {
+		training_stage = DM_TUNING;
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("DM_TUNING_MASK_BIT #%d\n", effective_cs));
+		status = mv_ddr4_dm_tuning(effective_cs, pbs_tap_factor);
+		if (is_reg_dump != 0)
+			ddr3_tip_reg_dump(dev_num);
+		if (status != MV_OK) {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_dm_tuning failure\n"));
+			if (debug_mode == 0)
+				return status;
+		}
+	}
+
+	if (mask_tune_func & DQ_MAPPING_MASK_BIT) {
+		training_stage = DQ_MAPPING;
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO, ("DQ_MAPPING_MASK_BIT\n"));
+		status = mv_ddr4_dq_pins_mapping(dev_num);
+		if (is_reg_dump != 0)
+			ddr3_tip_reg_dump(dev_num);
+		if (status != MV_OK) {
+			DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("mv_ddr4_dq_pins_mapping failure\n"));
+			if (debug_mode == 0)
+				return status;
+		}
+	}
+
+	return status;
+}
+#endif /* CONFIG_DDR4 */
diff --git a/drivers/ddr/marvell/a38x/mv_ddr4_training.h b/drivers/ddr/marvell/a38x/mv_ddr4_training.h
new file mode 100644
index 0000000000..fa2e9a0877
--- /dev/null
+++ b/drivers/ddr/marvell/a38x/mv_ddr4_training.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) Marvell International Ltd. and its affiliates
+ */
+
+#ifndef _MV_DDR4_TRAINING_H
+#define _MV_DDR4_TRAINING_H
+
+#include "ddr3_training_ip.h"
+
+/* configure DDR4 SDRAM */
+int mv_ddr4_sdram_config(u32 dev_num);
+
+/* configure phy */
+int mv_ddr4_phy_config(u32 dev_num);
+
+/*
+ * configure sstl for manual calibration and pod for automatic one
+ * assumes subphy configured to pod earlier
+ */
+int mv_ddr4_calibration_adjust(u32 dev_num, u8 vref_en, u8 pod_only);
+
+/*
+ * validates calibration values
+ * soc dependent; TODO: check it
+ */
+int mv_ddr4_calibration_validate(u32 dev_num);
+
+u16 mv_ddr4_rtt_nom_to_odt(u16 rtt_nom);
+u16 mv_ddr4_rtt_wr_to_odt(u16 rtt_wr);
+
+#endif /* _MV_DDR4_TRAINING_H */
diff --git a/drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.c b/drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.c
new file mode 100644
index 0000000000..31b6209416
--- /dev/null
+++ b/drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.c
@@ -0,0 +1,2336 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) Marvell International Ltd. and its affiliates
+ */
+
+#if defined(CONFIG_DDR4)
+
+/* DESCRIPTION: DDR4 Receiver and DQVref Calibration */
+
+#include "ddr3_init.h"
+#include "mv_ddr4_training_calibration.h"
+#include "mv_ddr4_training.h"
+#include "mv_ddr4_mpr_pda_if.h"
+#include "mv_ddr_training_db.h"
+#include "mv_ddr_regs.h"
+
+#define RX_DIR			0
+#define TX_DIR			1
+#define MAX_DIR_TYPES		2
+
+#define RECEIVER_DC_STEP_SIZE	3
+#define RECEIVER_DC_MIN_RANGE	0
+#define RECEIVER_DC_MAX_RANGE	63
+#define RECEIVER_DC_MAX_COUNT	(((RECEIVER_DC_MAX_RANGE - RECEIVER_DC_MIN_RANGE) / RECEIVER_DC_STEP_SIZE) + 1)
+
+#define PBS_VAL_FACTOR		1000
+#define MV_DDR_VW_TX_NOISE_FILTER	8	/* adlls */
+
+u8 dq_vref_vec[MAX_BUS_NUM];	/* stability support */
+u8 rx_eye_hi_lvl[MAX_BUS_NUM];	/* rx adjust support */
+u8 rx_eye_lo_lvl[MAX_BUS_NUM];	/* rx adjust support */
+
+static u8 pbs_max = 31;
+static u8 vdq_tv; /* vref value for dq vref calibration */
+static u8 duty_cycle; /* duty cycle value for receiver calibration */
+static u8 rx_vw_pos[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+static u8 patterns_byte_status[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+static const char *str_dir[MAX_DIR_TYPES] = {"read", "write"};
+
+static u8 center_low_element_get(u8 dir, u8 pbs_element, u16 lambda, u8 pbs_max_val)
+{
+	u8 result;
+
+	if (dir == RX_DIR)
+		result = pbs_element * lambda / PBS_VAL_FACTOR;
+	else
+		result = (pbs_max_val - pbs_element) * lambda / PBS_VAL_FACTOR;
+
+	return result;
+}
+
+static u8 center_high_element_get(u8 dir, u8 pbs_element, u16 lambda, u8 pbs_max_val)
+{
+	u8 result;
+
+	if (dir == RX_DIR)
+		result = (pbs_max_val - pbs_element) * lambda / PBS_VAL_FACTOR;
+	else
+		result = pbs_element * lambda / PBS_VAL_FACTOR;
+
+	return result;
+}
+
+static int mv_ddr4_centralization(u8 dev_num, u16 (*lambda)[MAX_BUS_NUM][BUS_WIDTH_IN_BITS], u8 (*copt)[MAX_BUS_NUM],
+				  u8 (*pbs_result)[MAX_BUS_NUM][BUS_WIDTH_IN_BITS], u8 (*vw_size)[MAX_BUS_NUM],
+				  u8 mode, u16 param0, u8 param1);
+static int mv_ddr4_dqs_reposition(u8 dir, u16 *lambda, u8 *pbs_result, char delta, u8 *copt, u8 *dqs_pbs);
+static int mv_ddr4_copt_get(u8 dir, u16 *lambda, u8 *vw_l, u8 *vw_h, u8 *pbs_result, u8 *copt);
+static int mv_ddr4_center_of_mass_calc(u8 dev_num, u8 if_id, u8 subphy_num, u8 mode, u8 *vw_l, u8 *vw_h, u8 *vw_v,
+				       u8 vw_num, u8 *v_opt, u8 *t_opt);
+static int mv_ddr4_tap_tuning(u8 dev_num, u16 (*pbs_tap_factor)[MAX_BUS_NUM][BUS_WIDTH_IN_BITS], u8 mode);
+
+/* dq vref calibration flow */
+int mv_ddr4_dq_vref_calibration(u8 dev_num, u16 (*pbs_tap_factor)[MAX_BUS_NUM][BUS_WIDTH_IN_BITS])
+{
+	u32 if_id, subphy_num;
+	u32 vref_idx, dq_idx, pad_num = 0;
+	u8 dq_vref_start_win[MAX_INTERFACE_NUM][MAX_BUS_NUM][MV_DDR4_VREF_MAX_COUNT];
+	u8 dq_vref_end_win[MAX_INTERFACE_NUM][MAX_BUS_NUM][MV_DDR4_VREF_MAX_COUNT];
+	u8 valid_win_size[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 c_opt_per_bus[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 valid_vref_cnt[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 valid_vref_ptr[MAX_INTERFACE_NUM][MAX_BUS_NUM][MV_DDR4_VREF_MAX_COUNT];
+	u8 center_adll[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 center_vref[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 pbs_res_per_bus[MAX_INTERFACE_NUM][MAX_BUS_NUM][BUS_WIDTH_IN_BITS];
+	u16 vref_avg, vref_subphy_num;
+	int vref_tap_idx;
+	int vref_range_min;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	enum mv_ddr4_vref_subphy_cal_state all_subphys_state = MV_DDR4_VREF_SUBPHY_CAL_ABOVE;
+	int tap_tune_passed = 0;
+	enum mv_ddr4_vref_tap_state vref_tap_set_state = MV_DDR4_VREF_TAP_START;
+	enum hws_result *flow_result = ddr3_tip_get_result_ptr(training_stage);
+	u8 subphy_max = ddr3_tip_dev_attr_get(dev_num, MV_ATTR_OCTET_PER_INTERFACE);
+	enum mv_ddr4_vref_subphy_cal_state vref_state_per_subphy[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	int status;
+	static u8 vref_byte_status[MAX_INTERFACE_NUM][MAX_BUS_NUM][MV_DDR4_VREF_MAX_RANGE];
+
+	DEBUG_CALIBRATION(DEBUG_LEVEL_INFO, ("Starting ddr4 dq vref calibration training stage\n"));
+
+	vdq_tv = 0;
+	duty_cycle = 0;
+
+	/* reset valid vref counter per if and subphy */
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		for (subphy_num = 0; subphy_num < MAX_BUS_NUM; subphy_num++) {
+			valid_vref_cnt[if_id][subphy_num] = 0;
+			vref_state_per_subphy[if_id][subphy_num] = MV_DDR4_VREF_SUBPHY_CAL_ABOVE;
+		}
+	}
+
+	if (mv_ddr4_tap_tuning(dev_num, pbs_tap_factor, TX_DIR) == MV_OK)
+		tap_tune_passed = 1;
+
+	/* place dram to vref training mode */
+	mv_ddr4_vref_training_mode_ctrl(dev_num, 0, ACCESS_TYPE_MULTICAST, 1);
+
+	/* main loop for 2d scan (low_to_high voltage scan) */
+	vref_tap_idx = MV_DDR4_VREF_MAX_RANGE;
+	vref_range_min = MV_DDR4_VREF_MIN_RANGE;
+
+	if (vref_range_min < MV_DDR4_VREF_STEP_SIZE)
+		vref_range_min = MV_DDR4_VREF_STEP_SIZE;
+
+	/* clean vref status array */
+	memset(vref_byte_status, BYTE_NOT_DEFINED, sizeof(vref_byte_status));
+
+	for (vref_tap_idx = MV_DDR4_VREF_MAX_RANGE; (vref_tap_idx >= vref_range_min) &&
+	     (all_subphys_state != MV_DDR4_VREF_SUBPHY_CAL_UNDER);
+	     vref_tap_idx -= MV_DDR4_VREF_STEP_SIZE) {
+		/* set new vref training value in dram */
+		mv_ddr4_vref_tap_set(dev_num, 0, ACCESS_TYPE_MULTICAST, vref_tap_idx, vref_tap_set_state);
+
+		if (tap_tune_passed == 0) {
+			if (mv_ddr4_tap_tuning(dev_num, pbs_tap_factor, TX_DIR) == MV_OK)
+				tap_tune_passed = 1;
+			else
+				continue;
+		}
+
+		if (mv_ddr4_centralization(dev_num, pbs_tap_factor, c_opt_per_bus, pbs_res_per_bus,
+					   valid_win_size, TX_DIR, vref_tap_idx, 0) != MV_OK) {
+			DEBUG_CALIBRATION(DEBUG_LEVEL_ERROR,
+					  ("error: %s: ddr4 centralization failed (dq vref tap index %d)!!!\n",
+					   __func__, vref_tap_idx));
+			continue;
+		}
+
+		/* go over all results and find out the vref start and end window */
+		for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+			for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+				if (valid_win_size[if_id][subphy_num] > MV_DDR_VW_TX_NOISE_FILTER) {
+					if (vref_state_per_subphy[if_id][subphy_num] == MV_DDR4_VREF_SUBPHY_CAL_UNDER)
+						DEBUG_CALIBRATION(DEBUG_LEVEL_ERROR,
+								  ("warning: %s: subphy %d vref tap %d voltage noise\n",
+								   __func__, subphy_num, vref_tap_idx));
+					/* window is valid; keep current vref_tap_idx value and increment counter */
+					vref_idx = valid_vref_cnt[if_id][subphy_num];
+					valid_vref_ptr[if_id][subphy_num][vref_idx] = vref_tap_idx;
+					valid_vref_cnt[if_id][subphy_num]++;
+
+					/* set 0 for possible negative values */
+					vref_byte_status[if_id][subphy_num][vref_idx] |=
+						patterns_byte_status[if_id][subphy_num];
+					dq_vref_start_win[if_id][subphy_num][vref_idx] =
+						c_opt_per_bus[if_id][subphy_num] + 1 -
+						valid_win_size[if_id][subphy_num] / 2;
+					dq_vref_start_win[if_id][subphy_num][vref_idx] =
+						(valid_win_size[if_id][subphy_num] % 2 == 0) ?
+						dq_vref_start_win[if_id][subphy_num][vref_idx] :
+						dq_vref_start_win[if_id][subphy_num][vref_idx] - 1;
+					dq_vref_end_win[if_id][subphy_num][vref_idx] =
+						c_opt_per_bus[if_id][subphy_num] +
+						valid_win_size[if_id][subphy_num] / 2;
+					vref_state_per_subphy[if_id][subphy_num] = MV_DDR4_VREF_SUBPHY_CAL_INSIDE;
+				} else if (vref_state_per_subphy[if_id][subphy_num] == MV_DDR4_VREF_SUBPHY_CAL_INSIDE) {
+					vref_state_per_subphy[if_id][subphy_num] = MV_DDR4_VREF_SUBPHY_CAL_UNDER;
+				}
+			} /* subphy */
+		} /* if */
+
+		/* check all subphys are in under state */
+		all_subphys_state = MV_DDR4_VREF_SUBPHY_CAL_UNDER;
+		for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+			for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+				if (vref_state_per_subphy[if_id][subphy_num] != MV_DDR4_VREF_SUBPHY_CAL_UNDER)
+					all_subphys_state = MV_DDR4_VREF_SUBPHY_CAL_INSIDE;
+			}
+		}
+	}
+
+	if (tap_tune_passed == 0) {
+		DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+				  ("%s: tap tune not passed on any dq_vref value\n", __func__));
+		for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+			/* report fail for all active interfaces; multi-interface support - tbd */
+			flow_result[if_id] = TEST_FAILED;
+		}
+
+		return MV_FAIL;
+	}
+
+	/* close vref range */
+	mv_ddr4_vref_tap_set(dev_num, 0, ACCESS_TYPE_MULTICAST, vref_tap_idx, MV_DDR4_VREF_TAP_END);
+
+	/*
+	 * find out the results with the mixed and low states and move the low state 64 adlls in case
+	 * the center of the ui is smaller than 31
+	 */
+	for (vref_idx = 0; vref_idx < MV_DDR4_VREF_MAX_RANGE; vref_idx++) {
+		for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+			for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+				if (((vref_byte_status[if_id][subphy_num][vref_idx]) &
+				    (BYTE_HOMOGENEOUS_LOW | BYTE_SPLIT_OUT_MIX)) ==
+				    (BYTE_HOMOGENEOUS_LOW | BYTE_SPLIT_OUT_MIX)) {
+					if ((dq_vref_start_win[if_id][subphy_num][vref_idx] +
+					    dq_vref_end_win[if_id][subphy_num][vref_idx]) / 2 <= 31) {
+						dq_vref_start_win[if_id][subphy_num][vref_idx] += 64;
+						dq_vref_end_win[if_id][subphy_num][vref_idx] += 64;
+						DEBUG_CALIBRATION
+							(DEBUG_LEVEL_TRACE,
+							 ("%s vref_idx %d if %d subphy %d added 64 adlls to window\n",
+							  __func__, valid_vref_ptr[if_id][subphy_num][vref_idx],
+							  if_id, subphy_num));
+					}
+				}
+			}
+		}
+	}
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+			DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+					  ("calculating center of mass for subphy %d, valid window size %d\n",
+					   subphy_num, valid_win_size[if_id][subphy_num]));
+			if (valid_vref_cnt[if_id][subphy_num] > 0) {
+				/* calculate center of mass sampling point (t, v) for each subphy */
+				status = mv_ddr4_center_of_mass_calc(dev_num, if_id, subphy_num, TX_DIR,
+								     dq_vref_start_win[if_id][subphy_num],
+								     dq_vref_end_win[if_id][subphy_num],
+								     valid_vref_ptr[if_id][subphy_num],
+								     valid_vref_cnt[if_id][subphy_num],
+								     &center_vref[if_id][subphy_num],
+								     &center_adll[if_id][subphy_num]);
+				if (status != MV_OK)
+					return status;
+
+				DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+						  ("center of mass results: vref %d, adll %d\n",
+						   center_vref[if_id][subphy_num], center_adll[if_id][subphy_num]));
+			} else {
+				DEBUG_CALIBRATION(DEBUG_LEVEL_ERROR,
+						  ("%s subphy %d no vref results to calculate the center of mass\n",
+						  __func__, subphy_num));
+				status = MV_ERROR;
+				return status;
+			}
+		}
+	}
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		vref_avg = 0;
+		vref_subphy_num = 0;
+		for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+			vref_avg += center_vref[if_id][subphy_num];
+			dq_vref_vec[subphy_num] = center_vref[if_id][subphy_num];
+			vref_subphy_num++;
+		}
+
+		mv_ddr4_vref_tap_set(dev_num, if_id, ACCESS_TYPE_UNICAST,
+				     vref_avg / vref_subphy_num, MV_DDR4_VREF_TAP_START);
+		mv_ddr4_vref_tap_set(dev_num, if_id, ACCESS_TYPE_UNICAST,
+				     vref_avg / vref_subphy_num, MV_DDR4_VREF_TAP_END);
+		DEBUG_CALIBRATION(DEBUG_LEVEL_INFO, ("final vref average %d\n", vref_avg / vref_subphy_num));
+		/* run centralization again with optimal vref to update global structures */
+		mv_ddr4_centralization(dev_num, pbs_tap_factor, c_opt_per_bus, pbs_res_per_bus, valid_win_size,
+				       TX_DIR, vref_avg / vref_subphy_num, duty_cycle);
+	}
+
+	/* return dram from vref training mode */
+	mv_ddr4_vref_training_mode_ctrl(dev_num, 0, ACCESS_TYPE_MULTICAST, 0);
+
+	/* dqs tx reposition calculation */
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+			for (dq_idx = 0; dq_idx < 8; dq_idx++) {
+				pad_num = dq_map_table[dq_idx +
+						       subphy_num * BUS_WIDTH_IN_BITS +
+						       if_id * BUS_WIDTH_IN_BITS * subphy_max];
+				status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, if_id, ACCESS_TYPE_UNICAST,
+							    subphy_num, DDR_PHY_DATA,
+							    0x10 + pad_num + effective_cs * 0x10,
+							    pbs_res_per_bus[if_id][subphy_num][dq_idx]);
+				if (status != MV_OK)
+					return status;
+			}
+
+			status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, if_id, ACCESS_TYPE_UNICAST,
+						    subphy_num, DDR_PHY_DATA,
+						    CTX_PHY_REG(effective_cs),
+						    center_adll[if_id][subphy_num] % 64);
+			if (status != MV_OK)
+				return status;
+		}
+	}
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		/* report pass for all active interfaces; multi-interface support - tbd */
+		flow_result[if_id] = TEST_SUCCESS;
+	}
+
+	return MV_OK;
+}
+
+/* centralization flow */
+static int mv_ddr4_centralization(u8 dev_num, u16 (*lambda)[MAX_BUS_NUM][BUS_WIDTH_IN_BITS], u8 (*copt)[MAX_BUS_NUM],
+				  u8 (*pbs_result)[MAX_BUS_NUM][BUS_WIDTH_IN_BITS], u8 (*vw_size)[MAX_BUS_NUM],
+				  u8 mode, u16 param0, u8 param1)
+{
+/* FIXME:  remove the dependency in 64bit */
+#define MV_DDR_NUM_OF_CENTRAL_PATTERNS	(PATTERN_KILLER_DQ7 - PATTERN_KILLER_DQ0 + 1)
+	static u8 subphy_end_win[MAX_DIR_TYPES][MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	static u8 subphy_start_win[MAX_DIR_TYPES][MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	static u8 final_start_win[MAX_INTERFACE_NUM][MAX_BUS_NUM][BUS_WIDTH_IN_BITS];
+	static u8 final_end_win[MAX_INTERFACE_NUM][MAX_BUS_NUM][BUS_WIDTH_IN_BITS];
+	enum hws_training_ip_stat training_result[MAX_INTERFACE_NUM];
+	u32 if_id, subphy_num, pattern_id, pattern_loop_idx, bit_num;
+	u8  curr_start_win[BUS_WIDTH_IN_BITS];
+	u8  curr_end_win[BUS_WIDTH_IN_BITS];
+	static u8 start_win_db[MV_DDR_NUM_OF_CENTRAL_PATTERNS][MAX_INTERFACE_NUM][MAX_BUS_NUM][BUS_WIDTH_IN_BITS];
+	static u8 end_win_db[MV_DDR_NUM_OF_CENTRAL_PATTERNS][MAX_INTERFACE_NUM][MAX_BUS_NUM][BUS_WIDTH_IN_BITS];
+	u8  curr_win[BUS_WIDTH_IN_BITS];
+	u8  opt_win, waste_win, start_win_skew, end_win_skew;
+	u8  final_subphy_win[MAX_INTERFACE_NUM][BUS_WIDTH_IN_BITS];
+	enum hws_training_result result_type = RESULT_PER_BIT;
+	enum hws_dir direction;
+	enum hws_search_dir search_dir;
+	u32 *result[HWS_SEARCH_DIR_LIMIT];
+	u32 max_win_size;
+	u8 curr_end_win_min, curr_start_win_max;
+	u32 cs_ena_reg_val[MAX_INTERFACE_NUM];
+	u8 current_byte_status;
+	int status;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	u8 subphy_max = ddr3_tip_dev_attr_get(dev_num, MV_ATTR_OCTET_PER_INTERFACE);
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		/* save current cs enable reg val */
+		status = ddr3_tip_if_read(dev_num, ACCESS_TYPE_UNICAST, if_id, DUAL_DUNIT_CFG_REG,
+					  cs_ena_reg_val, MASK_ALL_BITS);
+		if (status != MV_OK)
+			return status;
+
+		/* enable single cs */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DUAL_DUNIT_CFG_REG,
+					   (0x1 << 3), (0x1 << 3));
+		if (status != MV_OK)
+			return status;
+	}
+
+	if (mode == TX_DIR) {
+		max_win_size = MAX_WINDOW_SIZE_TX;
+		direction = OPER_WRITE;
+	} else {
+		max_win_size = MAX_WINDOW_SIZE_RX;
+		direction = OPER_READ;
+	}
+
+	/* database initialization */
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+			patterns_byte_status[if_id][subphy_num] = BYTE_NOT_DEFINED;
+			subphy_end_win[mode][if_id][subphy_num] = (max_win_size - 1);
+			subphy_start_win[mode][if_id][subphy_num] = 0;
+			vw_size[if_id][subphy_num] = (max_win_size - 1);
+			for (bit_num = 0; bit_num < BUS_WIDTH_IN_BITS; bit_num++) {
+				final_start_win[if_id][subphy_num][bit_num] = 0;
+				final_end_win[if_id][subphy_num][bit_num] = (max_win_size - 1);
+				if (mode == TX_DIR)
+					final_end_win[if_id][subphy_num][bit_num] = (2 * max_win_size - 1);
+			}
+			if (mode == TX_DIR) {
+				subphy_end_win[mode][if_id][subphy_num] = (2 * max_win_size - 1);
+				vw_size[if_id][subphy_num] = (2 * max_win_size - 1);
+			}
+		}
+	}
+
+	/* main flow */
+	/* FIXME: hard-coded "22" below for PATTERN_KILLER_DQ7_64 enum hws_pattern */
+	for (pattern_id = PATTERN_KILLER_DQ0, pattern_loop_idx = 0;
+	     pattern_id <= (MV_DDR_IS_64BIT_DRAM_MODE(tm->bus_act_mask) ? 22 : PATTERN_KILLER_DQ7);
+	     pattern_id++, pattern_loop_idx++) {
+		ddr3_tip_ip_training_wrapper(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ACCESS_TYPE_MULTICAST,
+					     PARAM_NOT_CARE, result_type, HWS_CONTROL_ELEMENT_ADLL,
+					     PARAM_NOT_CARE, direction, tm->if_act_mask,
+					     0x0, max_win_size - 1, max_win_size - 1, pattern_id,
+					     EDGE_FPF, CS_SINGLE, PARAM_NOT_CARE, training_result);
+
+		for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+			for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+				/*
+				 * in case the previous patterns found the current subphy as BYTE_NOT_DEFINED,
+				 * continue to next subphy
+				 */
+				if ((patterns_byte_status[if_id][subphy_num] == BYTE_NOT_DEFINED) &&
+				    (pattern_id != PATTERN_KILLER_DQ0))
+					continue;
+				/*
+				 * in case the result of the current subphy is BYTE_NOT_DEFINED mark the
+				 * pattern byte status as BYTE_NOT_DEFINED
+				 */
+				current_byte_status = mv_ddr_tip_sub_phy_byte_status_get(if_id, subphy_num);
+				if (current_byte_status == BYTE_NOT_DEFINED) {
+					DEBUG_DDR4_CENTRALIZATION
+						(DEBUG_LEVEL_INFO,
+						 ("%s:%s: failed to lock subphy, pat %d if %d subphy %d\n",
+						 __func__, str_dir[mode], pattern_id, if_id, subphy_num));
+					patterns_byte_status[if_id][subphy_num] = BYTE_NOT_DEFINED;
+					/* update the valid window size, which is a return value of this function */
+					vw_size[if_id][subphy_num] = 0;
+					/* continue to next subphy */
+					continue;
+				}
+
+				/* set the status of this byte */
+				patterns_byte_status[if_id][subphy_num] |= current_byte_status;
+				for (search_dir = HWS_LOW2HIGH; search_dir <= HWS_HIGH2LOW; search_dir++) {
+					status = ddr3_tip_read_training_result(dev_num, if_id, ACCESS_TYPE_UNICAST,
+									       subphy_num, ALL_BITS_PER_PUP,
+									       search_dir, direction, result_type,
+									       TRAINING_LOAD_OPERATION_UNLOAD,
+									       CS_SINGLE, &result[search_dir],
+									       1, 0, 0);
+					if (status != MV_OK)
+						return status;
+
+					DEBUG_DDR4_CENTRALIZATION
+					(DEBUG_LEVEL_INFO,
+					 ("param0 %d param1 %d pat %d if %d subphy %d "
+					 "regs: 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
+					 param0, param1, pattern_id, if_id, subphy_num,
+					 result[search_dir][0], result[search_dir][1],
+					 result[search_dir][2], result[search_dir][3],
+					 result[search_dir][4], result[search_dir][5],
+					 result[search_dir][6], result[search_dir][7]));
+				}
+
+				for (bit_num = 0; bit_num < BUS_WIDTH_IN_BITS; bit_num++) {
+					/* read result success */
+					DEBUG_DDR4_CENTRALIZATION(
+								  DEBUG_LEVEL_INFO,
+								  ("%s %s subphy locked, pat %d if %d subphy %d\n",
+								  __func__, str_dir[mode], pattern_id,
+								  if_id, subphy_num));
+					start_win_db[pattern_loop_idx][if_id][subphy_num][bit_num] =
+						GET_TAP_RESULT(result[HWS_LOW2HIGH][bit_num], EDGE_1);
+					end_win_db[pattern_loop_idx][if_id][subphy_num][bit_num] =
+						GET_TAP_RESULT(result[HWS_HIGH2LOW][bit_num], EDGE_1);
+				}
+			} /* subphy */
+		} /* interface */
+	} /* pattern */
+
+	/*
+	 * check if the current pattern's subphys in all interfaces have mixed and low byte states;
+	 * in that case, add 64 adlls to the low byte
+	 */
+	for (pattern_id = PATTERN_KILLER_DQ0, pattern_loop_idx = 0;
+		pattern_id <= (MV_DDR_IS_64BIT_DRAM_MODE(tm->bus_act_mask) ? 22 : PATTERN_KILLER_DQ7);
+		pattern_id++, pattern_loop_idx++) {
+		for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+			for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+				if (patterns_byte_status[if_id][subphy_num] == BYTE_NOT_DEFINED)
+					continue;
+				opt_win = 2 * max_win_size;	/* initialize opt_win */
+				/* in case this byte in the pattern is homogeneous low add 64 adlls to the byte */
+				if (((patterns_byte_status[if_id][subphy_num]) &
+				    (BYTE_HOMOGENEOUS_LOW | BYTE_SPLIT_OUT_MIX)) ==
+				     (BYTE_HOMOGENEOUS_LOW | BYTE_SPLIT_OUT_MIX)) {
+					for (bit_num = 0; bit_num < BUS_WIDTH_IN_BITS; bit_num++) {
+						if (start_win_db[pattern_loop_idx][if_id][subphy_num][bit_num] <= 31 &&
+						    end_win_db[pattern_loop_idx][if_id][subphy_num][bit_num] <= 31) {
+							start_win_db[pattern_loop_idx][if_id][subphy_num][bit_num] +=
+								64;
+							end_win_db[pattern_loop_idx][if_id][subphy_num][bit_num] += 64;
+							DEBUG_DDR4_CENTRALIZATION
+								(DEBUG_LEVEL_INFO,
+								 ("%s %s pattern %d if %d subphy %d bit %d added 64 "
+								 "adll\n",
+								 __func__, str_dir[mode], pattern_id, if_id,
+								 subphy_num, bit_num));
+						}
+					}
+				}
+
+				/* calculations for the current pattern per subphy */
+				for (bit_num = 0; bit_num < BUS_WIDTH_IN_BITS; bit_num++) {
+					curr_win[bit_num] = end_win_db[pattern_loop_idx][if_id][subphy_num][bit_num] -
+						start_win_db[pattern_loop_idx][if_id][subphy_num][bit_num] + 1;
+					curr_start_win[bit_num] =
+						start_win_db[pattern_loop_idx][if_id][subphy_num][bit_num];
+					curr_end_win[bit_num] =
+						end_win_db[pattern_loop_idx][if_id][subphy_num][bit_num];
+				}
+
+				opt_win = GET_MIN(opt_win, ddr3_tip_get_buf_min(curr_win));
+				vw_size[if_id][subphy_num] =
+					GET_MIN(vw_size[if_id][subphy_num], ddr3_tip_get_buf_min(curr_win));
+
+				/* final subphy window length */
+				final_subphy_win[if_id][subphy_num] = ddr3_tip_get_buf_min(curr_end_win) -
+					ddr3_tip_get_buf_max(curr_start_win) + 1;
+				waste_win = opt_win - final_subphy_win[if_id][subphy_num];
+				start_win_skew = ddr3_tip_get_buf_max(curr_start_win) -
+					ddr3_tip_get_buf_min(curr_start_win);
+				end_win_skew = ddr3_tip_get_buf_max(curr_end_win) -
+					ddr3_tip_get_buf_min(curr_end_win);
+
+				/* min/max updated with pattern change */
+				curr_end_win_min = ddr3_tip_get_buf_min(curr_end_win);
+				curr_start_win_max = ddr3_tip_get_buf_max(curr_start_win);
+				subphy_end_win[mode][if_id][subphy_num] =
+					GET_MIN(subphy_end_win[mode][if_id][subphy_num], curr_end_win_min);
+				subphy_start_win[mode][if_id][subphy_num] =
+					GET_MAX(subphy_start_win[mode][if_id][subphy_num], curr_start_win_max);
+				DEBUG_DDR4_CENTRALIZATION
+					(DEBUG_LEVEL_TRACE,
+					 ("%s, %s pat %d if %d subphy %d opt_win %d ",
+					 __func__, str_dir[mode], pattern_id, if_id, subphy_num, opt_win));
+				DEBUG_DDR4_CENTRALIZATION
+					(DEBUG_LEVEL_TRACE,
+					 ("final_subphy_win %d waste_win %d "
+					 "start_win_skew %d end_win_skew %d ",
+					 final_subphy_win[if_id][subphy_num],
+					 waste_win, start_win_skew, end_win_skew));
+				DEBUG_DDR4_CENTRALIZATION(DEBUG_LEVEL_TRACE,
+					("curr_start_win_max %d curr_end_win_min %d "
+					"subphy_start_win %d subphy_end_win %d\n",
+					curr_start_win_max, curr_end_win_min,
+					subphy_start_win[mode][if_id][subphy_num],
+					subphy_end_win[mode][if_id][subphy_num]));
+
+				/* valid window */
+				DEBUG_DDR4_CENTRALIZATION(DEBUG_LEVEL_TRACE,
+					("valid window, pat %d if %d subphy %d\n",
+					pattern_id, if_id, subphy_num));
+				for (bit_num = 0; bit_num < BUS_WIDTH_IN_BITS; bit_num++) {
+					final_start_win[if_id][subphy_num][bit_num] =
+						GET_MAX(final_start_win[if_id][subphy_num][bit_num],
+							curr_start_win[bit_num]);
+					final_end_win[if_id][subphy_num][bit_num] =
+						GET_MIN(final_end_win[if_id][subphy_num][bit_num],
+							curr_end_win[bit_num]);
+				} /* bit */
+			} /* subphy */
+		} /* if_id */
+	} /* pattern */
+
+	/* calculate valid window for each subphy */
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+			if (patterns_byte_status[if_id][subphy_num] != BYTE_NOT_DEFINED) {
+				/*
+				 * in case of byte statuses which were found as mixed and low,
+				 * change their status to be mixed only, due to the fact
+				 * that we have already dealt with these bytes by adding 64 adlls
+				 * to the low bytes
+				 */
+				if (patterns_byte_status[if_id][subphy_num] &
+				    (BYTE_HOMOGENEOUS_LOW | BYTE_SPLIT_OUT_MIX))
+					patterns_byte_status[if_id][subphy_num] = BYTE_SPLIT_OUT_MIX;
+				if (rx_vw_pos[if_id][subphy_num] == 0)	/* rx_vw_pos is initialized during tap tune */
+					pbs_max = 31 - 0xa;
+				else
+					pbs_max = 31;
+
+				/* continue if locked */
+				/*if (centralization_state[if_id][subphy_num] == 0) {*/
+				status = mv_ddr4_copt_get(mode, lambda[if_id][subphy_num],
+							  final_start_win[if_id][subphy_num],
+							  final_end_win[if_id][subphy_num],
+							  pbs_result[if_id][subphy_num],
+							  &copt[if_id][subphy_num]);
+
+				/*
+				 * after copt the adll is moved to smaller value due to pbs compensation
+				 * so the byte status might change, here we change the byte status to be
+				 * homogeneous low in case the center of the ui after copt is moved below
+				 * 31 adlls
+				 */
+				if (copt[if_id][subphy_num] <= 31)
+					patterns_byte_status[if_id][subphy_num] = BYTE_HOMOGENEOUS_LOW;
+
+				DEBUG_DDR4_CENTRALIZATION
+					(DEBUG_LEVEL_INFO,
+					 ("%s %s if %d subphy %d copt %d\n",
+					 __func__, str_dir[mode], if_id, subphy_num, copt[if_id][subphy_num]));
+
+				if (status != MV_OK) {
+					/*
+					 * TODO: print out error message(s) only when all points fail
+					 * as temporary solution, replaced ERROR to TRACE debug level
+					 */
+					DEBUG_DDR4_CENTRALIZATION
+						(DEBUG_LEVEL_TRACE,
+						 ("%s %s copt calculation failed, "
+						 "no valid window for subphy %d\n",
+						 __func__, str_dir[mode], subphy_num));
+					/* set the byte to 0 (fail) and clean the status (continue with algorithm) */
+					vw_size[if_id][subphy_num] = 0;
+					status = MV_OK;
+
+					if (debug_mode == 0) {
+						/*
+						 * TODO: print out error message(s) only when all points fail
+						 * as temporary solution, commented out debug level set to TRACE
+						 */
+						/*
+						 * ddr3_hws_set_log_level(DEBUG_BLOCK_CALIBRATION, DEBUG_LEVEL_TRACE);
+						 */
+						/* open relevant log and run function again for debug */
+						mv_ddr4_copt_get(mode, lambda[if_id][subphy_num],
+									final_start_win[if_id][subphy_num],
+									final_end_win[if_id][subphy_num],
+									pbs_result[if_id][subphy_num],
+									&copt[if_id][subphy_num]);
+						/*
+						 * ddr3_hws_set_log_level(DEBUG_BLOCK_CALIBRATION, DEBUG_LEVEL_ERROR);
+						 */
+					} /* debug mode */
+				} /* status */
+			} /* byte not defined */
+		} /* subphy */
+	} /* if_id */
+
+	/* restore cs enable value */
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM - 1; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, if_id, DUAL_DUNIT_CFG_REG,
+					   cs_ena_reg_val[if_id], MASK_ALL_BITS);
+		if (status != MV_OK)
+			return status;
+	}
+
+	return status;
+}
+
+/*
+ * mv_ddr4_copt_get function
+ * inputs:
+ *	dir - direction; 0 is for rx, 1 for tx
+ *	lambda - a pointer to adll to pbs ratio multiplied by PBS_VAL_FACTOR
+ *	vw_l - a pointer to valid window low limit in adll taps
+ *	vw_h - a pointer to valid window high limit in adll taps
+ * outputs:
+ *	pbs_result - a pointer to pbs new delay value; the function's output
+ *	copt - optimal center of subphy in adll taps
+ * The function assumes initial pbs tap value is zero. Otherwise, it requires logic
+ * getting pbs value per dq and setting pbs_taps_per_dq array.
+ * It provides a solution for a single subphy (8 bits).
+ * The calling function is responsible for any additional pbs taps for dqs
+ */
+static int mv_ddr4_copt_get(u8 dir, u16 *lambda, u8 *vw_l, u8 *vw_h, u8 *pbs_result, u8 *copt)
+{
+	u8 center_per_dq[8];
+	u8 center_zone_low[8] = {0};
+	u8 center_zone_high[8] = {0};
+	u8 ext_center_zone_low[8] = {0};
+	u8 ext_center_zone_high[8] = {0};
+	u8 pbs_taps_per_dq[8] = {0};
+	u8 vw_per_dq[8];
+	u8 vw_zone_low[8] = {0};
+	u8 vw_zone_high[8] = {0};
+	u8 margin_vw[8] = {0};
+	u8 copt_val;
+	u8 dq_idx;
+	u8 center_zone_max_low = 0;
+	u8 center_zone_min_high = 128;
+	u8 vw_zone_max_low = 0;
+	u8 vw_zone_min_high = 128;
+	u8 min_vw = 63; /* minimum valid window between all bits */
+	u8 center_low_el;
+	u8 center_high_el;
+
+	/* lambda calculated as D * PBS_VAL_FACTOR / d */
+	//printf("Copt::Debug::\t");
+	for (dq_idx = 0; dq_idx < 8; dq_idx++) {
+		center_per_dq[dq_idx] = (vw_h[dq_idx] + vw_l[dq_idx]) / 2;
+		vw_per_dq[dq_idx] = 1 + (vw_h[dq_idx] - vw_l[dq_idx]);
+		if (min_vw > vw_per_dq[dq_idx])
+			min_vw = vw_per_dq[dq_idx];
+	}
+
+	/* calculate center zone */
+	for (dq_idx = 0; dq_idx < 8; dq_idx++) {
+		center_low_el = center_low_element_get(dir, pbs_taps_per_dq[dq_idx], lambda[dq_idx], pbs_max);
+		if (center_per_dq[dq_idx] > center_low_el)
+			center_zone_low[dq_idx] = center_per_dq[dq_idx] - center_low_el;
+		center_high_el = center_high_element_get(dir, pbs_taps_per_dq[dq_idx], lambda[dq_idx], pbs_max);
+		center_zone_high[dq_idx] = center_per_dq[dq_idx] + center_high_el;
+		if (center_zone_max_low < center_zone_low[dq_idx])
+			center_zone_max_low = center_zone_low[dq_idx];
+		if (center_zone_min_high > center_zone_high[dq_idx])
+			center_zone_min_high = center_zone_high[dq_idx];
+		DEBUG_CALIBRATION(DEBUG_LEVEL_TRACE,
+				  ("center: low %d, high %d, max_low %d, min_high %d\n",
+				   center_zone_low[dq_idx], center_zone_high[dq_idx],
+				   center_zone_max_low, center_zone_min_high));
+	}
+
+	if (center_zone_min_high >= center_zone_max_low) { /* center zone visib */
+		/* set copt_val to high zone for rx */
+		copt_val = (dir == RX_DIR) ? center_zone_max_low : center_zone_min_high;
+		*copt = copt_val;
+
+		/* calculate additional pbs taps */
+		for (dq_idx = 0; dq_idx < 8; dq_idx++) {
+			if (dir == RX_DIR)
+				pbs_result[dq_idx] = (copt_val - center_per_dq[dq_idx]) *
+						     PBS_VAL_FACTOR / lambda[dq_idx];
+			else
+				pbs_result[dq_idx] = (center_per_dq[dq_idx] - copt_val) *
+						     PBS_VAL_FACTOR / lambda[dq_idx];
+		}
+		return MV_OK;
+	} else { /* not center zone visib */
+		for (dq_idx = 0; dq_idx < 8; dq_idx++) {
+			if ((center_zone_low[dq_idx] + 1) > (vw_per_dq[dq_idx] / 2 + vw_per_dq[dq_idx] % 2)) {
+				vw_zone_low[dq_idx] = (center_zone_low[dq_idx] + 1) -
+						      (vw_per_dq[dq_idx] / 2 + vw_per_dq[dq_idx] % 2);
+			} else {
+				vw_zone_low[dq_idx] = 0;
+				DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+						  ("dq_idx %d, center zone low %d, vw_l %d, vw_h %d\n",
+						   dq_idx, center_zone_low[dq_idx], vw_l[dq_idx], vw_h[dq_idx]));
+				DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+						  ("vw_l[%d], vw_h[%d], lambda[%d]\n",
+						   vw_l[dq_idx], vw_h[dq_idx], lambda[dq_idx]));
+			}
+
+			vw_zone_high[dq_idx] = center_zone_high[dq_idx] + vw_per_dq[dq_idx] / 2;
+
+			if (vw_zone_max_low < vw_zone_low[dq_idx])
+				vw_zone_max_low = vw_zone_low[dq_idx];
+
+			if (vw_zone_min_high > vw_zone_high[dq_idx])
+				vw_zone_min_high = vw_zone_high[dq_idx];
+
+			DEBUG_CALIBRATION(DEBUG_LEVEL_TRACE,
+					  ("valid_window: low %d, high %d, max_low %d, min_high %d\n",
+					   vw_zone_low[dq_idx], vw_zone_high[dq_idx],
+					   vw_zone_max_low, vw_zone_min_high));
+		}
+
+		/* try to extend center zone */
+		if (vw_zone_min_high >= vw_zone_max_low) { /* vw zone visib */
+			center_zone_max_low = 0;
+			center_zone_min_high = 128;
+
+			for (dq_idx = 0; dq_idx < 8; dq_idx++) {
+				margin_vw[dq_idx] = vw_per_dq[dq_idx] - min_vw;
+
+				if (center_zone_low[dq_idx] > margin_vw[dq_idx])
+					ext_center_zone_low[dq_idx] = center_zone_low[dq_idx] - margin_vw[dq_idx];
+				else
+					ext_center_zone_low[dq_idx] = 0;
+
+				ext_center_zone_high[dq_idx] = center_zone_high[dq_idx] + margin_vw[dq_idx];
+
+				if (center_zone_max_low < ext_center_zone_low[dq_idx])
+					center_zone_max_low = ext_center_zone_low[dq_idx];
+
+				if (center_zone_min_high > ext_center_zone_high[dq_idx])
+					center_zone_min_high = ext_center_zone_high[dq_idx];
+
+				DEBUG_CALIBRATION(DEBUG_LEVEL_TRACE,
+						  ("ext_center: low %d, high %d, max_low %d, min_high %d\n",
+						   ext_center_zone_low[dq_idx], ext_center_zone_high[dq_idx],
+						   center_zone_max_low, center_zone_min_high));
+			}
+
+			if (center_zone_min_high >= center_zone_max_low) { /* center zone visib */
+				/* get optimal center position */
+				copt_val = (dir == RX_DIR) ? center_zone_max_low : center_zone_min_high;
+				*copt = copt_val;
+
+				/* calculate additional pbs taps */
+				for (dq_idx = 0; dq_idx < 8; dq_idx++) {
+					if (dir == 0) {
+						if (copt_val > center_per_dq[dq_idx])
+							pbs_result[dq_idx] = (copt_val - center_per_dq[dq_idx]) *
+									     PBS_VAL_FACTOR / lambda[dq_idx];
+						else
+							pbs_result[dq_idx] = 0;
+					} else {
+						if (center_per_dq[dq_idx] > copt_val)
+							pbs_result[dq_idx] = (center_per_dq[dq_idx] - copt_val) *
+									     PBS_VAL_FACTOR / lambda[dq_idx];
+						else
+							pbs_result[dq_idx] = 0;
+					}
+
+					if (pbs_result[dq_idx] > pbs_max)
+						pbs_result[dq_idx] = pbs_max;
+				}
+
+				return MV_OK;
+			} else { /* not center zone visib */
+				/*
+				 * TODO: print out error message(s) only when all points fail
+				 * as temporary solution, replaced ERROR to TRACE debug level
+				 */
+				DEBUG_DDR4_CENTRALIZATION(DEBUG_LEVEL_TRACE,
+							  ("lambda: %d, %d, %d, %d, %d, %d, %d, %d\n",
+							   lambda[0], lambda[1], lambda[2], lambda[3],
+							   lambda[4], lambda[5], lambda[6], lambda[7]));
+
+				DEBUG_DDR4_CENTRALIZATION(DEBUG_LEVEL_TRACE,
+							  ("vw_h: %d, %d, %d, %d, %d, %d, %d, %d\n",
+							   vw_h[0], vw_h[1], vw_h[2], vw_h[3],
+							   vw_h[4], vw_h[5], vw_h[6], vw_h[7]));
+
+				DEBUG_DDR4_CENTRALIZATION(DEBUG_LEVEL_TRACE,
+							  ("vw_l: %d, %d, %d, %d, %d, %d, %d, %d\n",
+							   vw_l[0], vw_l[1], vw_l[2], vw_l[3],
+							   vw_l[4], vw_l[5], vw_l[6], vw_l[7]));
+
+				for (dq_idx = 0; dq_idx < 8; dq_idx++) {
+					DEBUG_DDR4_CENTRALIZATION(DEBUG_LEVEL_TRACE,
+								  ("center: low %d, high %d, "
+								   "max_low %d, min_high %d\n",
+								   center_zone_low[dq_idx], center_zone_high[dq_idx],
+								   center_zone_max_low, center_zone_min_high));
+
+					DEBUG_DDR4_CENTRALIZATION(DEBUG_LEVEL_TRACE,
+								  ("valid_window: low %d, high %d, "
+								   "max_low %d, min_high %d\n",
+								   vw_zone_low[dq_idx], vw_zone_high[dq_idx],
+								   vw_zone_max_low, vw_zone_min_high));
+
+					DEBUG_DDR4_CENTRALIZATION(DEBUG_LEVEL_TRACE,
+								  ("ext_center: low %d, high %d, "
+								   "max_low %d, min_high %d\n",
+								   ext_center_zone_low[dq_idx],
+								   ext_center_zone_high[dq_idx],
+								   center_zone_max_low, center_zone_min_high));
+				}
+
+				return MV_FAIL;
+			}
+		} else { /* not vw zone visib; failed to find a single sample point */
+			return MV_FAIL;
+		}
+	}
+
+	return MV_OK;
+}
+
+/*
+ * mv_ddr4_dqs_reposition function gets copt to align to and returns pbs value per bit
+ * parameters:
+ *	dir - direction; 0 is for rx, 1 for tx
+ *	lambda - a pointer to adll to pbs ratio multiplied by PBS_VAL_FACTOR
+ *	pbs_result - a pointer to pbs new delay value; the function's output
+ *	delta - signed; possible values: +0xa, 0x0, -0xa; for rx can be only negative
+ *	copt - optimal center of subphy in adll taps
+ *	dqs_pbs - optimal pbs
+ * The function assumes initial pbs tap value is zero. Otherwise, it requires logic
+ * getting pbs value per dq and setting pbs_taps_per_dq array.
+ * It provides a solution for a single subphy (8 bits).
+ * The calling function is responsible for any additional pbs taps for dqs
+ */
+static int mv_ddr4_dqs_reposition(u8 dir, u16 *lambda, u8 *pbs_result, char delta, u8 *copt, u8 *dqs_pbs)
+{
+	u8 dq_idx;
+	u32 pbs_max_val = 0;
+	u32 lambda_avg = 0;
+
+	/* lambda calculated as D * X / d */
+	for (dq_idx = 0; dq_idx < 8; dq_idx++) {
+		if (pbs_max_val < pbs_result[dq_idx])
+			pbs_max_val = pbs_result[dq_idx];
+		lambda_avg += lambda[dq_idx];
+	}
+
+	if (delta >= 0)
+		*dqs_pbs = (pbs_max_val + delta) / 2;
+	else /* dqs already 0xa */
+		*dqs_pbs = pbs_max_val / 2;
+
+	lambda_avg /= 8;
+
+	/* change in dqs pbs value requires change in final copt position from mass center solution */
+	if (dir == TX_DIR) {
+		/* for tx, additional pbs on dqs in opposite direction of adll */
+		*copt = *copt + ((*dqs_pbs) * lambda_avg) / PBS_VAL_FACTOR;
+	} else {
+		/* for rx, additional pbs on dqs in same direction of adll */
+		if (delta < 0)
+			*copt = *copt - ((*dqs_pbs + delta) * lambda_avg) / PBS_VAL_FACTOR;
+		else
+			*copt = *copt - (*dqs_pbs * lambda_avg) / PBS_VAL_FACTOR;
+	}
+
+	return MV_OK;
+}
+
+/*
+ * mv_ddr4_center_of_mass_calc function
+ * parameters:
+ *	vw_l - a pointer to valid window low limit in adll taps
+ *	vw_h - a pointer to valid window high limit in adll taps
+ *	vw_v - a pointer to vref value matching vw_l/h arrays
+ *	vw_num - number of valid windows (length of vw_v vector)
+ *	v_opt - optimal voltage value in vref taps
+ *	t_opt - optimal adll value in adll taps
+ * This function solves 2D centroid equation (e.g., adll and vref axes)
+ * The function doesn't differentiate between byte and bit eyes
+ */
+static int mv_ddr4_center_of_mass_calc(u8 dev_num, u8 if_id, u8 subphy_num, u8 mode, u8 *vw_l,
+				       u8 *vw_h, u8 *vw_v, u8 vw_num, u8 *v_opt, u8 *t_opt)
+{
+	u8 idx;
+	u8 edge_t[128], edge_v[128];
+	u8 min_edge_t = 127, min_edge_v = 127;
+	int polygon_area = 0;
+	int t_opt_temp = 0, v_opt_temp = 0;
+	int vw_avg = 0, v_avg = 0;
+	int s0 = 0, s1 = 0, s2 = 0, slope = 1, r_sq = 0;
+	u32 d_min = 10000, reg_val = 0;
+	int status;
+
+	/*
+	 * reorder all polygon points counterclockwise
+	 * get min value of each axis to shift to smaller calc value
+	 */
+	for (idx = 0; idx < vw_num; idx++) {
+		edge_t[idx] = vw_l[idx];
+		edge_v[idx] = vw_v[idx];
+		if (min_edge_v > vw_v[idx])
+			min_edge_v = vw_v[idx];
+		if (min_edge_t > vw_l[idx])
+			min_edge_t = vw_l[idx];
+		edge_t[vw_num * 2 - 1 - idx] = vw_h[idx];
+		edge_v[vw_num * 2 - 1 - idx] = vw_v[idx];
+		vw_avg += vw_h[idx] - vw_l[idx];
+		v_avg += vw_v[idx];
+		DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+				  ("%s: if %d, byte %d, direction %d, vw_v %d, vw_l %d, vw_h %d\n",
+				   __func__, if_id, subphy_num, mode, vw_v[idx], vw_l[idx], vw_h[idx]));
+	}
+
+	vw_avg *= 1000 / vw_num;
+	v_avg /= vw_num;
+	for (idx = 0; idx < vw_num; idx++) {
+		s0 += (1000 * (vw_h[idx] - vw_l[idx]) - vw_avg) * (vw_v[idx] - v_avg);
+		s1 += (vw_v[idx] - v_avg) * (vw_v[idx] - v_avg);
+		s2 += (1000 * (vw_h[idx] - vw_l[idx]) - vw_avg) * (1000 * (vw_h[idx] - vw_l[idx]) - vw_avg);
+	}
+	r_sq = s0 * (s0 / s1);
+	r_sq /= (s2 / 1000);
+	slope = s0 / s1;
+
+	/* idx n is equal to idx 0 */
+	edge_t[vw_num * 2] = vw_l[0];
+	edge_v[vw_num * 2] = vw_v[0];
+
+	/* calculate polygon area, a (may be negative) */
+	for (idx = 0; idx < vw_num * 2; idx++)
+		polygon_area = polygon_area +
+			       ((edge_t[idx] - min_edge_t)*(edge_v[idx + 1] - min_edge_v) -
+			       (edge_t[idx + 1] - min_edge_t)*(edge_v[idx] - min_edge_v));
+
+	/* calculate optimal point */
+	for (idx = 0; idx < vw_num * 2; idx++) {
+		t_opt_temp = t_opt_temp +
+			     (edge_t[idx] + edge_t[idx + 1] - 2 * min_edge_t) *
+			     ((edge_t[idx] - min_edge_t)*(edge_v[idx + 1] - min_edge_v) -
+			      (edge_t[idx + 1] - min_edge_t)*(edge_v[idx] - min_edge_v));
+		v_opt_temp = v_opt_temp +
+			     (edge_v[idx] + edge_v[idx + 1] - 2 * min_edge_v) *
+			     ((edge_t[idx] - min_edge_t)*(edge_v[idx + 1] - min_edge_v) -
+			      (edge_t[idx + 1] - min_edge_t)*(edge_v[idx] - min_edge_v));
+	}
+
+	*t_opt = t_opt_temp / (3 * polygon_area);
+	*v_opt = v_opt_temp / (3 * polygon_area);
+
+	/* re-shift */
+	*t_opt += min_edge_t;
+	*v_opt += min_edge_v;
+
+	/* calculate d_min */
+	for (idx = 0; idx < 2 * vw_num; idx++) {
+		s0 = (*t_opt - edge_t[idx]) * (*t_opt - edge_t[idx]) +
+		     (*v_opt - edge_v[idx]) * (*v_opt - edge_v[idx]);
+		d_min = (d_min > s0) ? s0 : d_min;
+	}
+	DEBUG_CALIBRATION(DEBUG_LEVEL_TRACE,
+			  ("%s: r_sq %d, slope %d, area = %d, d_min = %d\n",
+			   __func__, r_sq, slope, polygon_area, d_min));
+
+	/* insert vw eye to register database for validation */
+	if (d_min < 0)
+		d_min = -d_min;
+	if (polygon_area < 0)
+		polygon_area = -polygon_area;
+
+	status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, if_id, ACCESS_TYPE_UNICAST, subphy_num,
+				    DDR_PHY_DATA, RESULT_PHY_REG + effective_cs + 4 * (1 - mode),
+				    polygon_area);
+	if (status != MV_OK)
+		return status;
+
+	status = ddr3_tip_bus_read(dev_num, if_id, ACCESS_TYPE_UNICAST,
+				   dmin_phy_reg_table[effective_cs * 5 + subphy_num][0], DDR_PHY_CONTROL,
+				   dmin_phy_reg_table[effective_cs * 5 + subphy_num][1], &reg_val);
+	if (status != MV_OK)
+		return status;
+
+	reg_val &= 0xff << (8 * mode); /* keep bits 0..7 for mode 0, bits 8..15 for mode 1 */
+	reg_val |= d_min / 2 << (8 * (1 - mode)); /* write bits 8..15 for mode 0, bits 0..7 for mode 1 */
+
+	status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, if_id, ACCESS_TYPE_UNICAST,
+				    dmin_phy_reg_table[effective_cs * 5 + subphy_num][0], DDR_PHY_CONTROL,
+				    dmin_phy_reg_table[effective_cs * 5 + subphy_num][1], reg_val);
+	if (status != MV_OK)
+		return status;
+
+	if (polygon_area < 400) {
+		DEBUG_CALIBRATION(DEBUG_LEVEL_ERROR,
+				  ("%s: if %d, subphy %d: polygon area too small %d (dmin %d)\n",
+				   __func__, if_id, subphy_num, polygon_area, d_min));
+		if (debug_mode == 0)
+			return MV_FAIL;
+	}
+
+	return MV_OK;
+}
+
+/* tap tuning flow */
+enum {
+	DQS_TO_DQ_LONG,
+	DQS_TO_DQ_SHORT
+};
+enum {
+	ALIGN_LEFT,
+	ALIGN_CENTER,
+	ALIGN_RIGHT
+};
+#define ONE_MHZ			1000000
+#define MAX_SKEW_DLY		200 /* in ps */
+#define NOMINAL_PBS_DLY		9 /* in ps */
+#define MIN_WL_TO_CTX_ADLL_DIFF	2 /* in taps */
+#define DQS_SHIFT_INIT_VAL	30
+#define MAX_PBS_NUM		31
+#define ADLL_TAPS_PER_PHASE	32
+#define ADLL_TAPS_PER_PERIOD	(ADLL_TAPS_PER_PHASE * 2)
+#define ADLL_TX_RES_REG_MASK	0xff
+#define VW_DESKEW_BIAS		0xa
+static int mv_ddr4_tap_tuning(u8 dev, u16 (*pbs_tap_factor)[MAX_BUS_NUM][BUS_WIDTH_IN_BITS], u8 mode)
+{
+	enum hws_training_ip_stat training_result[MAX_INTERFACE_NUM];
+	u32 iface, subphy, bit, pattern;
+	u32 limit_div;
+	u8 curr_start_win, curr_end_win;
+	u8 upd_curr_start_win, upd_curr_end_win;
+	u8 start_win_diff, end_win_diff;
+	u32 max_win_size, a, b;
+	u32 cs_ena_reg_val[MAX_INTERFACE_NUM];
+	u32 reg_addr;
+	enum hws_search_dir search_dir;
+	enum hws_dir dir;
+	u32 *result[MAX_BUS_NUM][HWS_SEARCH_DIR_LIMIT];
+	u32 result1[MAX_BUS_NUM][HWS_SEARCH_DIR_LIMIT][BUS_WIDTH_IN_BITS];
+	u8 subphy_max = ddr3_tip_dev_attr_get(dev, MV_ATTR_OCTET_PER_INTERFACE);
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	enum hws_training_result result_type = RESULT_PER_BIT;
+	int status = MV_OK;
+	int i;
+	u32 reg_val;
+	u32 freq = mv_ddr_freq_get(tm->interface_params->memory_freq);
+	/* calc adll tap in ps based on frequency */
+	int adll_tap = (ONE_MHZ / freq) / ADLL_TAPS_PER_PERIOD;
+	int dq_to_dqs_delta[MAX_BUS_NUM][BUS_WIDTH_IN_BITS]; /* skew b/w dq and dqs */
+	u32 wl_adll[MAX_BUS_NUM]; /* wl solution adll value */
+	int is_dq_dqs_short[MAX_BUS_NUM] = {0}; /* tx byte's state */
+	u32 new_pbs_per_byte[MAX_BUS_NUM]; /* dq pads' pbs value correction */
+	/* threshold to decide subphy needs dqs pbs delay */
+	int dq_to_dqs_min_delta_threshold = MIN_WL_TO_CTX_ADLL_DIFF + MAX_SKEW_DLY / adll_tap;
+	/* search init condition */
+	int dq_to_dqs_min_delta = dq_to_dqs_min_delta_threshold * 2;
+	u32 pbs_tap_factor0 = PBS_VAL_FACTOR * NOMINAL_PBS_DLY / adll_tap; /* init lambda */
+	/* adapt pbs to frequency */
+	u32 new_pbs = (1810000 - (345 * freq)) / 100000;
+	int stage_num, loop;
+	int wl_tap, new_wl_tap;
+	int pbs_tap_factor_avg;
+	int dqs_shift[MAX_BUS_NUM]; /* dqs' pbs delay */
+	static u16 tmp_pbs_tap_factor[MAX_INTERFACE_NUM][MAX_BUS_NUM][BUS_WIDTH_IN_BITS];
+	DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO, ("Starting ddr4 tap tuning training stage\n"));
+
+	for (i = 0; i < MAX_BUS_NUM; i++)
+		dqs_shift[i] = DQS_SHIFT_INIT_VAL;
+
+	if (mode == TX_DIR) {
+		max_win_size = MAX_WINDOW_SIZE_TX;
+		dir = OPER_WRITE;
+	} else {
+		max_win_size = MAX_WINDOW_SIZE_RX;
+		dir = OPER_READ;
+	}
+
+	/* init all pbs registers */
+	for (iface = 0; iface < MAX_INTERFACE_NUM; iface++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, iface);
+		if (mode == RX_DIR)
+			reg_addr = PBS_RX_BCAST_PHY_REG(effective_cs);
+		else
+			reg_addr = PBS_TX_BCAST_PHY_REG(effective_cs);
+		ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface, ACCESS_TYPE_MULTICAST,
+				   PARAM_NOT_CARE, DDR_PHY_DATA, reg_addr, 0);
+
+		if (mode == RX_DIR)
+			reg_addr = PBS_RX_PHY_REG(effective_cs, DQSP_PAD);
+		else
+			reg_addr = PBS_TX_PHY_REG(effective_cs, DQSP_PAD);
+		ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface, ACCESS_TYPE_MULTICAST,
+				   PARAM_NOT_CARE, DDR_PHY_DATA, reg_addr, 0);
+		if (mode == RX_DIR)
+			reg_addr = PBS_RX_PHY_REG(effective_cs, DQSN_PAD);
+		else
+			reg_addr = PBS_TX_PHY_REG(effective_cs, DQSN_PAD);
+		ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface, ACCESS_TYPE_MULTICAST,
+				   PARAM_NOT_CARE, DDR_PHY_DATA, reg_addr, 0);
+	}
+
+	for (iface = 0; iface < MAX_INTERFACE_NUM; iface++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, iface);
+		/* save current cs enable reg val */
+		ddr3_tip_if_read(dev, ACCESS_TYPE_UNICAST, iface, DUAL_DUNIT_CFG_REG,
+				 cs_ena_reg_val, MASK_ALL_BITS);
+
+		/* enable single cs */
+		ddr3_tip_if_write(dev, ACCESS_TYPE_UNICAST, iface, DUAL_DUNIT_CFG_REG,
+				  (SINGLE_CS_ENA << SINGLE_CS_PIN_OFFS),
+				  (SINGLE_CS_PIN_MASK << SINGLE_CS_PIN_OFFS));
+	}
+
+	/* FIXME: fix these hard-coded parameters due to compilation issue with patterns definitions */
+	pattern = MV_DDR_IS_64BIT_DRAM_MODE(tm->bus_act_mask) ? 73 : 23;
+	stage_num = (mode == RX_DIR) ? 1 : 2;
+	/* find window; run training */
+	for (loop = 0; loop < stage_num; loop++) {
+		ddr3_tip_ip_training_wrapper(dev, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ACCESS_TYPE_MULTICAST,
+					     PARAM_NOT_CARE, result_type, HWS_CONTROL_ELEMENT_ADLL, PARAM_NOT_CARE,
+					     dir, tm->if_act_mask, 0x0, max_win_size - 1, max_win_size - 1,
+					     pattern, EDGE_FPF, CS_SINGLE, PARAM_NOT_CARE, training_result);
+
+		for (iface = 0; iface < MAX_INTERFACE_NUM; iface++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, iface);
+			for (subphy = 0; subphy < subphy_max; subphy++) {
+				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy);
+				rx_vw_pos[iface][subphy] = ALIGN_CENTER;
+				new_pbs_per_byte[subphy] = new_pbs; /* rx init */
+				if ((mode == TX_DIR) && (loop == 0)) {
+					/* read nominal wl */
+					ddr3_tip_bus_read(dev, iface, ACCESS_TYPE_UNICAST, subphy,
+							  DDR_PHY_DATA, WL_PHY_REG(effective_cs),
+							  &reg_val);
+					wl_adll[subphy] = reg_val;
+				}
+
+				for (search_dir = HWS_LOW2HIGH; search_dir <= HWS_HIGH2LOW; search_dir++) {
+					ddr3_tip_read_training_result(dev, iface, ACCESS_TYPE_UNICAST, subphy,
+								      ALL_BITS_PER_PUP, search_dir, dir,
+								      result_type, TRAINING_LOAD_OPERATION_UNLOAD,
+								      CS_SINGLE, &(result[subphy][search_dir]),
+								      1, 0, 0);
+
+					DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+								("cs %d if %d subphy %d mode %d result: "
+								 "0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
+									 effective_cs, iface, subphy, mode,
+								 result[subphy][search_dir][0],
+								 result[subphy][search_dir][1],
+								 result[subphy][search_dir][2],
+								 result[subphy][search_dir][3],
+								 result[subphy][search_dir][4],
+								 result[subphy][search_dir][5],
+								 result[subphy][search_dir][6],
+								 result[subphy][search_dir][7]));
+				}
+
+				for (bit = 0; bit < BUS_WIDTH_IN_BITS; bit++) {
+					a = result[subphy][HWS_LOW2HIGH][bit];
+					b = result[subphy][HWS_HIGH2LOW][bit];
+					result1[subphy][HWS_LOW2HIGH][bit] = a;
+					result1[subphy][HWS_HIGH2LOW][bit] = b;
+					/* measure distance between ctx and wl adlls */
+					if (mode == TX_DIR) {
+						a &= ADLL_TX_RES_REG_MASK;
+						if (a >= ADLL_TAPS_PER_PERIOD)
+							a -= ADLL_TAPS_PER_PERIOD;
+						dq_to_dqs_delta[subphy][bit] =
+							a - (wl_adll[subphy] & WR_LVL_REF_DLY_MASK);
+						if (dq_to_dqs_delta[subphy][bit] < dq_to_dqs_min_delta)
+							dq_to_dqs_min_delta = dq_to_dqs_delta[subphy][bit];
+						DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+									("%s: dq_to_dqs_delta[%d][%d] %d\n",
+									 __func__, subphy, bit,
+									 dq_to_dqs_delta[subphy][bit]));
+					}
+				}
+
+				/* adjust wl on the first pass only */
+				if ((mode == TX_DIR) && (loop == 0)) {
+					/* dqs pbs shift if distance b/w adll is too large */
+					if (dq_to_dqs_min_delta < dq_to_dqs_min_delta_threshold) {
+						/* first calculate the WL in taps */
+						wl_tap = ((wl_adll[subphy] >> WR_LVL_REF_DLY_OFFS) &
+							  WR_LVL_REF_DLY_MASK) +
+							  ((wl_adll[subphy] >> WR_LVL_PH_SEL_OFFS) &
+							  WR_LVL_PH_SEL_MASK) * ADLL_TAPS_PER_PHASE;
+
+						/* calc dqs pbs shift */
+						dqs_shift[subphy] =
+							dq_to_dqs_min_delta_threshold - dq_to_dqs_min_delta;
+						/* check that the WL result has enough taps to reduce */
+						if (wl_tap > 0) {
+							if (wl_tap < dqs_shift[subphy])
+								dqs_shift[subphy] = wl_tap-1;
+							else
+								dqs_shift[subphy] = dqs_shift[subphy];
+						} else {
+							dqs_shift[subphy] = 0;
+						}
+						DEBUG_TAP_TUNING_ENGINE
+							(DEBUG_LEVEL_INFO,
+							 ("%s: tap tune tx: subphy %d, dqs shifted by %d adll taps, ",
+									 __func__, subphy, dqs_shift[subphy]));
+						dqs_shift[subphy] =
+							(dqs_shift[subphy] * PBS_VAL_FACTOR) / pbs_tap_factor0;
+						DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+									("%d pbs taps\n", dqs_shift[subphy]));
+						/* check high limit */
+						if (dqs_shift[subphy] > MAX_PBS_NUM)
+							dqs_shift[subphy] = MAX_PBS_NUM;
+						reg_addr = PBS_TX_PHY_REG(effective_cs, DQSP_PAD);
+						ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface,
+								   ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+								   reg_addr, dqs_shift[subphy]);
+						reg_addr = PBS_TX_PHY_REG(effective_cs, DQSN_PAD);
+						ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface,
+								   ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+								   reg_addr, dqs_shift[subphy]);
+
+						is_dq_dqs_short[subphy] = DQS_TO_DQ_SHORT;
+
+						new_wl_tap = wl_tap -
+							     (dqs_shift[subphy] * pbs_tap_factor0) / PBS_VAL_FACTOR;
+						reg_val = (new_wl_tap & WR_LVL_REF_DLY_MASK) |
+							  ((new_wl_tap &
+							    ((WR_LVL_PH_SEL_MASK << WR_LVL_PH_SEL_OFFS) >> 1))
+							   << 1) |
+							  (wl_adll[subphy] &
+							   ((CTRL_CENTER_DLY_MASK << CTRL_CENTER_DLY_OFFS) |
+							    (CTRL_CENTER_DLY_INV_MASK << CTRL_CENTER_DLY_INV_OFFS)));
+						ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface,
+								   ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+								   WL_PHY_REG(effective_cs), reg_val);
+						DEBUG_TAP_TUNING_ENGINE
+							(DEBUG_LEVEL_INFO,
+							 ("%s: subphy %d, dq_to_dqs_min_delta %d, dqs_shift %d, old wl %d, temp wl %d 0x%08x\n",
+									 __func__, subphy, dq_to_dqs_min_delta,
+									 dqs_shift[subphy], wl_tap, new_wl_tap,
+									 reg_val));
+					}
+				}
+				dq_to_dqs_min_delta = dq_to_dqs_min_delta_threshold * 2;
+			}
+		}
+	}
+
+	/* deskew dq */
+	for (iface = 0; iface < MAX_INTERFACE_NUM; iface++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, iface);
+		if (mode == RX_DIR)
+			reg_addr = PBS_RX_BCAST_PHY_REG(effective_cs);
+		else
+			reg_addr = PBS_TX_BCAST_PHY_REG(effective_cs);
+		ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+				   DDR_PHY_DATA, reg_addr, new_pbs_per_byte[0]);
+	}
+
+	/* run training search and get results */
+	ddr3_tip_ip_training_wrapper(dev, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ACCESS_TYPE_MULTICAST,
+				     PARAM_NOT_CARE, result_type, HWS_CONTROL_ELEMENT_ADLL, PARAM_NOT_CARE,
+				     dir, tm->if_act_mask, 0x0, max_win_size - 1, max_win_size - 1,
+				     pattern, EDGE_FPF, CS_SINGLE, PARAM_NOT_CARE, training_result);
+
+	for (iface = 0; iface < MAX_INTERFACE_NUM; iface++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, iface);
+		for (subphy = 0; subphy < subphy_max; subphy++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy);
+			/* read training ip results from db */
+			for (search_dir = HWS_LOW2HIGH; search_dir <= HWS_HIGH2LOW; search_dir++) {
+				ddr3_tip_read_training_result(dev, iface, ACCESS_TYPE_UNICAST,
+							      subphy, ALL_BITS_PER_PUP, search_dir,
+							      dir, result_type,
+							      TRAINING_LOAD_OPERATION_UNLOAD, CS_SINGLE,
+							      &(result[subphy][search_dir]),
+							      1, 0, 0);
+
+				DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+							("cs %d if %d subphy %d mode %d result: "
+							 "0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
+							 effective_cs, iface, subphy, mode,
+							 result[subphy][search_dir][0],
+							 result[subphy][search_dir][1],
+							 result[subphy][search_dir][2],
+							 result[subphy][search_dir][3],
+							 result[subphy][search_dir][4],
+							 result[subphy][search_dir][5],
+							 result[subphy][search_dir][6],
+							 result[subphy][search_dir][7]));
+			}
+
+			/* calc dq skew impact on vw position */
+			for (bit = 0; bit < BUS_WIDTH_IN_BITS; bit++) {
+				start_win_diff = 0;
+				end_win_diff = 0;
+				limit_div = 0;
+				if ((GET_LOCK_RESULT(result1[subphy][HWS_LOW2HIGH][bit]) == 1) &&
+				    (GET_LOCK_RESULT(result1[subphy][HWS_HIGH2LOW][bit]) == 1) &&
+				    (GET_LOCK_RESULT(result[subphy][HWS_LOW2HIGH][bit]) == 1) &&
+				    (GET_LOCK_RESULT(result[subphy][HWS_HIGH2LOW][bit]) == 1)) {
+					curr_start_win = GET_TAP_RESULT(result1[subphy][HWS_LOW2HIGH][bit],
+									EDGE_1);
+					curr_end_win = GET_TAP_RESULT(result1[subphy][HWS_HIGH2LOW][bit],
+								      EDGE_1);
+					upd_curr_start_win = GET_TAP_RESULT(result[subphy][HWS_LOW2HIGH][bit],
+									    EDGE_1);
+					upd_curr_end_win = GET_TAP_RESULT(result[subphy][HWS_HIGH2LOW][bit],
+									  EDGE_1);
+
+					/* update tx start skew; set rx vw position */
+					if ((upd_curr_start_win != 0) && (curr_start_win != 0)) {
+						if (upd_curr_start_win > curr_start_win) {
+							start_win_diff = upd_curr_start_win - curr_start_win;
+							if (mode == TX_DIR)
+								start_win_diff =
+									curr_start_win + 64 - upd_curr_start_win;
+						} else {
+							start_win_diff = curr_start_win - upd_curr_start_win;
+						}
+						limit_div++;
+					} else {
+						rx_vw_pos[iface][subphy] = ALIGN_LEFT;
+					}
+
+					/* update tx end skew; set rx vw position */
+					if (((upd_curr_end_win != max_win_size) && (curr_end_win != max_win_size)) ||
+					    (mode == TX_DIR)) {
+						if (upd_curr_end_win > curr_end_win) {
+							end_win_diff = upd_curr_end_win - curr_end_win;
+							if (mode == TX_DIR)
+								end_win_diff =
+									curr_end_win + 64 - upd_curr_end_win;
+						} else {
+							end_win_diff = curr_end_win - upd_curr_end_win;
+						}
+						limit_div++;
+					} else {
+						rx_vw_pos[iface][subphy] = ALIGN_RIGHT;
+					}
+
+					/*
+					 * don't care about start in tx mode
+					 * TODO: temporary solution for instability in the start adll search
+					 */
+					if (mode == TX_DIR) {
+						start_win_diff = end_win_diff;
+						limit_div = 2;
+					}
+
+					/*
+					 * workaround for false tx measurements in tap tune stage
+					 * tx pbs factor will use rx pbs factor results instead
+					 */
+					if ((limit_div != 0) && (mode == RX_DIR)) {
+						pbs_tap_factor[iface][subphy][bit] =
+							PBS_VAL_FACTOR * (start_win_diff + end_win_diff) /
+							(new_pbs_per_byte[subphy] * limit_div);
+						tmp_pbs_tap_factor[iface][subphy][bit] =
+							pbs_tap_factor[iface][subphy][bit];
+					} else {
+						pbs_tap_factor[iface][subphy][bit] =
+							tmp_pbs_tap_factor[iface][subphy][bit];
+					}
+
+					DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+								("cs %d if %d subphy %d bit %d sw1 %d sw2 %d "
+								 "ew1 %d ew2 %d sum delta %d, align %d\n",
+								 effective_cs, iface, subphy, bit,
+								 curr_start_win, upd_curr_start_win,
+								 curr_end_win, upd_curr_end_win,
+								 pbs_tap_factor[iface][subphy][bit],
+								 rx_vw_pos[iface][subphy]));
+				} else {
+					status = MV_FAIL;
+					DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+								("tap tuning fail %s cs %d if %d subphy %d bit %d\n",
+								 (mode == RX_DIR) ? "RX" : "TX", effective_cs, iface,
+								 subphy, bit));
+					DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+								("cs %d if %d subphy %d mode %d result: "
+								 "0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
+								 effective_cs, iface, subphy, mode,
+								 result[subphy][HWS_LOW2HIGH][0],
+								 result[subphy][HWS_LOW2HIGH][1],
+								 result[subphy][HWS_LOW2HIGH][2],
+								 result[subphy][HWS_LOW2HIGH][3],
+								 result[subphy][HWS_LOW2HIGH][4],
+								 result[subphy][HWS_LOW2HIGH][5],
+								 result[subphy][HWS_LOW2HIGH][6],
+								 result[subphy][HWS_LOW2HIGH][7]));
+					DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+								("cs %d if %d subphy %d mode %d result: "
+								 "0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
+								 effective_cs, iface, subphy, mode,
+								 result[subphy][HWS_HIGH2LOW][0],
+								 result[subphy][HWS_HIGH2LOW][1],
+								 result[subphy][HWS_HIGH2LOW][2],
+								 result[subphy][HWS_HIGH2LOW][3],
+								 result[subphy][HWS_HIGH2LOW][4],
+								 result[subphy][HWS_HIGH2LOW][5],
+								 result[subphy][HWS_HIGH2LOW][6],
+								 result[subphy][HWS_HIGH2LOW][7]));
+					DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+								("cs %d if %d subphy %d mode %d result: "
+								 "0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
+								 effective_cs, iface, subphy, mode,
+								 result1[subphy][HWS_LOW2HIGH][0],
+								 result1[subphy][HWS_LOW2HIGH][1],
+								 result1[subphy][HWS_LOW2HIGH][2],
+								 result1[subphy][HWS_LOW2HIGH][3],
+								 result1[subphy][HWS_LOW2HIGH][4],
+								 result1[subphy][HWS_LOW2HIGH][5],
+								 result1[subphy][HWS_LOW2HIGH][6],
+								 result1[subphy][HWS_LOW2HIGH][7]));
+					DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+								("cs %d if %d subphy %d mode %d result: "
+								 "0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
+								 effective_cs, iface, subphy, mode,
+								 result1[subphy][HWS_HIGH2LOW][0],
+								 result1[subphy][HWS_HIGH2LOW][1],
+								 result1[subphy][HWS_HIGH2LOW][2],
+								 result1[subphy][HWS_HIGH2LOW][3],
+								 result1[subphy][HWS_HIGH2LOW][4],
+								 result1[subphy][HWS_HIGH2LOW][5],
+								 result1[subphy][HWS_HIGH2LOW][6],
+								 result1[subphy][HWS_HIGH2LOW][7]));
+				}
+			}
+		}
+	}
+
+	/* restore cs enable value */
+	for (iface = 0; iface < MAX_INTERFACE_NUM; iface++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, iface);
+		ddr3_tip_if_write(dev, ACCESS_TYPE_UNICAST, iface, DUAL_DUNIT_CFG_REG,
+				  cs_ena_reg_val[iface], MASK_ALL_BITS);
+	}
+
+	/* restore pbs (set to 0) */
+	for (iface = 0; iface < MAX_INTERFACE_NUM; iface++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, iface);
+		for (subphy = 0; subphy < subphy_max; subphy++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy);
+			if (mode == RX_DIR)
+				reg_addr = PBS_RX_BCAST_PHY_REG(effective_cs);
+			else
+				reg_addr = PBS_TX_BCAST_PHY_REG(effective_cs);
+			ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface, ACCESS_TYPE_UNICAST,
+					   subphy, DDR_PHY_DATA, reg_addr, 0);
+		}
+	}
+
+	/* set deskew bias for rx valid window */
+	if (mode == RX_DIR) {
+		/*
+		 * pattern special for rx
+		 * check for rx_vw_pos stat
+		 * - add n pbs taps to every dq to align to left (pbs_max set to (31 - n))
+		 * - add pbs taps to dqs to align to right
+		 */
+		for (iface = 0; iface < MAX_INTERFACE_NUM; iface++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, iface);
+			for (subphy = 0; subphy < subphy_max; subphy++) {
+				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy);
+				if (rx_vw_pos[iface][subphy] == ALIGN_LEFT) {
+					ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, 0,
+							   ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+							   PBS_RX_BCAST_PHY_REG(effective_cs),
+							   VW_DESKEW_BIAS);
+					DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+							  ("%s: if %d, subphy %d aligned to left\n",
+							   __func__, iface, subphy));
+				} else if (rx_vw_pos[iface][subphy] == ALIGN_RIGHT) {
+					reg_addr = PBS_RX_PHY_REG(effective_cs, DQSP_PAD);
+					ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, 0,
+							   ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+							   reg_addr, VW_DESKEW_BIAS);
+					reg_addr = PBS_RX_PHY_REG(effective_cs, DQSN_PAD);
+					ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, 0,
+							   ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+							   reg_addr, VW_DESKEW_BIAS);
+					DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+							  ("%s: if %d, subphy %d aligned to right\n",
+							   __func__, iface, subphy));
+				}
+			} /* subphy */
+		} /* if */
+	} else { /* tx mode */
+		/* update wl solution */
+		if (status == MV_OK) {
+			for (iface = 0; iface < MAX_INTERFACE_NUM; iface++) {
+				VALIDATE_IF_ACTIVE(tm->if_act_mask, iface);
+				for (subphy = 0; subphy < subphy_max; subphy++) {
+					VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy);
+					if (is_dq_dqs_short[subphy]) {
+						wl_tap = ((wl_adll[subphy] >> WR_LVL_REF_DLY_OFFS) &
+							  WR_LVL_REF_DLY_MASK) +
+							 ((wl_adll[subphy] >> WR_LVL_PH_SEL_OFFS) &
+							  WR_LVL_PH_SEL_MASK) * ADLL_TAPS_PER_PHASE;
+						pbs_tap_factor_avg = (pbs_tap_factor[iface][subphy][0] +
+								      pbs_tap_factor[iface][subphy][1] +
+								      pbs_tap_factor[iface][subphy][2] +
+								      pbs_tap_factor[iface][subphy][3] +
+								      pbs_tap_factor[iface][subphy][4] +
+								      pbs_tap_factor[iface][subphy][5] +
+								      pbs_tap_factor[iface][subphy][6] +
+								      pbs_tap_factor[iface][subphy][7]) / 8;
+						DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+									("%s: pbs_tap_factor_avg %d\n",
+									 __func__, pbs_tap_factor_avg));
+						new_wl_tap = wl_tap -
+							     (dqs_shift[subphy] * pbs_tap_factor_avg) /
+							     PBS_VAL_FACTOR;
+						/*
+						 * check wraparound due to change in the pbs_tap_factor_avg
+						 * vs the first guess
+						 */
+						if (new_wl_tap <= 0)
+							new_wl_tap = 0;
+
+						reg_val = (new_wl_tap & WR_LVL_REF_DLY_MASK) |
+							  ((new_wl_tap &
+							    ((WR_LVL_PH_SEL_MASK << WR_LVL_PH_SEL_OFFS) >> 1))
+							   << 1) |
+							  (wl_adll[subphy] &
+							   ((CTRL_CENTER_DLY_MASK << CTRL_CENTER_DLY_OFFS) |
+							    (CTRL_CENTER_DLY_INV_MASK << CTRL_CENTER_DLY_INV_OFFS)));
+						ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface,
+								   ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+								   WL_PHY_REG(effective_cs), reg_val);
+						DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+									("%s: tap tune tx algorithm final wl:\n",
+									 __func__));
+						DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+									("%s: subphy %d, dqs pbs %d, old wl %d, final wl %d 0x%08x -> 0x%08x\n",
+									 __func__, subphy, pbs_tap_factor_avg,
+									 wl_tap, new_wl_tap, wl_adll[subphy],
+									 reg_val));
+					}
+				}
+			}
+		} else {
+			/* return to nominal wl */
+			for (subphy = 0; subphy < subphy_max; subphy++) {
+				ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface, ACCESS_TYPE_UNICAST,
+						   subphy, DDR_PHY_DATA, WL_PHY_REG(effective_cs),
+						   wl_adll[subphy]);
+				DEBUG_TAP_TUNING_ENGINE(DEBUG_LEVEL_INFO,
+							("%s: tap tune failed; return to nominal wl\n",
+							 __func__));
+				reg_addr = PBS_TX_PHY_REG(effective_cs, DQSP_PAD);
+				ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface, ACCESS_TYPE_UNICAST,
+						   subphy, DDR_PHY_DATA, reg_addr, 0);
+				reg_addr = PBS_TX_PHY_REG(effective_cs, DQSN_PAD);
+				ddr3_tip_bus_write(dev, ACCESS_TYPE_UNICAST, iface, ACCESS_TYPE_UNICAST,
+						   subphy, DDR_PHY_DATA, reg_addr, 0);
+			}
+		}
+	}
+
+	return status;
+}
+
+/* receiver duty cycle flow */
+#define DDR_PHY_JIRA_ENABLE
+int mv_ddr4_receiver_calibration(u8 dev_num)
+{
+	u32 if_id, subphy_num;
+	u32 vref_idx, dq_idx, pad_num = 0;
+	u8 dq_vref_start_win[MAX_INTERFACE_NUM][MAX_BUS_NUM][RECEIVER_DC_MAX_COUNT];
+	u8 dq_vref_end_win[MAX_INTERFACE_NUM][MAX_BUS_NUM][RECEIVER_DC_MAX_COUNT];
+	u8 c_vref[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 valid_win_size[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 c_opt_per_bus[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 valid_vref_cnt[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 valid_vref_ptr[MAX_INTERFACE_NUM][MAX_BUS_NUM][RECEIVER_DC_MAX_COUNT];
+	u8 center_adll[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 center_vref[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 pbs_res_per_bus[MAX_INTERFACE_NUM][MAX_BUS_NUM][BUS_WIDTH_IN_BITS];
+	u16 lambda_per_dq[MAX_INTERFACE_NUM][MAX_BUS_NUM][BUS_WIDTH_IN_BITS];
+	u8 dqs_pbs = 0, const_pbs;
+	int tap_tune_passed = 0;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	enum hws_result *flow_result = ddr3_tip_get_result_ptr(training_stage);
+	u8 subphy_max = ddr3_tip_dev_attr_get(dev_num, MV_ATTR_OCTET_PER_INTERFACE);
+#ifdef DDR_PHY_JIRA_ENABLE
+	u32 dqs_pbs_jira56[MAX_INTERFACE_NUM][MAX_BUS_NUM];
+	u8 delta = 0;
+#endif
+	unsigned int max_cs = mv_ddr_cs_num_get();
+	u32 ctr_x[4], pbs_temp[4];
+	u16 cs_index = 0, pbs_rx_avg, lambda_avg;
+	int status;
+
+	DEBUG_CALIBRATION(DEBUG_LEVEL_INFO, ("Starting ddr4 dc calibration training stage\n"));
+
+	vdq_tv = 0;
+	duty_cycle = 0;
+
+	/* reset valid vref counter per if and subphy */
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++)
+		for (subphy_num = 0; subphy_num < MAX_BUS_NUM; subphy_num++)
+			valid_vref_cnt[if_id][subphy_num] = 0;
+
+	/* calculate pbs-adll tap tuning */
+	/* reset special pattern configuration to re-run this stage */
+	status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+			   DDR_PHY_DATA, 0x5f + effective_cs * 0x10, 0x0);
+	if (status != MV_OK)
+		return status;
+
+	status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+			   DDR_PHY_DATA, 0x54 + effective_cs * 0x10, 0x0);
+	if (status != MV_OK)
+		return status;
+
+	status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+			   DDR_PHY_DATA, 0x55 + effective_cs * 0x10, 0x0);
+	if (status != MV_OK)
+		return status;
+
+#ifdef DDR_PHY_JIRA_ENABLE
+	if (effective_cs != 0) {
+		for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+			for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+				status = ddr3_tip_bus_read(dev_num, if_id, ACCESS_TYPE_UNICAST, subphy_num,
+							   DDR_PHY_DATA, 0x54 + 0 * 0x10,
+							   &dqs_pbs_jira56[if_id][subphy_num]);
+				if (status != MV_OK)
+					return status;
+
+				status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_UNICAST,
+							    subphy_num, DDR_PHY_DATA, 0x54 + 0 * 0x10, 0x0);
+				if (status != MV_OK)
+					return status;
+
+				status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_UNICAST,
+							    subphy_num, DDR_PHY_DATA, 0x55 + 0 * 0x10, 0x0);
+				if (status != MV_OK)
+					return status;
+			}
+		}
+	}
+#endif
+
+	if (mv_ddr4_tap_tuning(dev_num, lambda_per_dq, RX_DIR) == MV_OK)
+		tap_tune_passed = 1;
+
+	/* main loop for 2d scan (low_to_high voltage scan) */
+	for (duty_cycle = RECEIVER_DC_MIN_RANGE;
+	     duty_cycle <= RECEIVER_DC_MAX_RANGE;
+	     duty_cycle += RECEIVER_DC_STEP_SIZE) {
+		/* set new receiver dc training value in dram */
+		status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+					    ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, DDR_PHY_DATA,
+					    VREF_BCAST_PHY_REG(effective_cs), duty_cycle);
+		if (status != MV_OK)
+			return status;
+
+		status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+					    ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, DDR_PHY_DATA,
+					    VREF_PHY_REG(effective_cs, DQSP_PAD), duty_cycle);
+		if (status != MV_OK)
+			return status;
+
+		status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+					    ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, DDR_PHY_DATA,
+					    VREF_PHY_REG(effective_cs, DQSN_PAD), duty_cycle);
+		if (status != MV_OK)
+			return status;
+
+		if (tap_tune_passed == 0) {
+			if (mv_ddr4_tap_tuning(dev_num, lambda_per_dq, RX_DIR) == MV_OK) {
+				tap_tune_passed = 1;
+			} else {
+				DEBUG_CALIBRATION(DEBUG_LEVEL_ERROR,
+						  ("rc, tap tune failed inside calibration\n"));
+				continue;
+			}
+		}
+
+		if (mv_ddr4_centralization(dev_num, lambda_per_dq, c_opt_per_bus, pbs_res_per_bus,
+					   valid_win_size, RX_DIR, vdq_tv, duty_cycle) != MV_OK) {
+			DEBUG_CALIBRATION(DEBUG_LEVEL_ERROR,
+					  ("error: ddr4 centralization failed (duty_cycle %d)!!!\n", duty_cycle));
+			if (debug_mode == 0)
+				break;
+		}
+
+		for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+			for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+				if (valid_win_size[if_id][subphy_num] > 8) {
+					/* window is valid; keep current duty_cycle value and increment counter */
+					vref_idx = valid_vref_cnt[if_id][subphy_num];
+					valid_vref_ptr[if_id][subphy_num][vref_idx] = duty_cycle;
+					valid_vref_cnt[if_id][subphy_num]++;
+					c_vref[if_id][subphy_num] = c_opt_per_bus[if_id][subphy_num];
+					/* set 0 for possible negative values */
+					dq_vref_start_win[if_id][subphy_num][vref_idx] =
+						c_vref[if_id][subphy_num] + 1 - valid_win_size[if_id][subphy_num] / 2;
+					dq_vref_start_win[if_id][subphy_num][vref_idx] =
+						(valid_win_size[if_id][subphy_num] % 2 == 0) ?
+						dq_vref_start_win[if_id][subphy_num][vref_idx] :
+						dq_vref_start_win[if_id][subphy_num][vref_idx] - 1;
+					dq_vref_end_win[if_id][subphy_num][vref_idx] =
+						c_vref[if_id][subphy_num] + valid_win_size[if_id][subphy_num] / 2;
+				}
+			} /* subphy */
+		} /* if */
+	} /* duty_cycle */
+
+	if (tap_tune_passed == 0) {
+		DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+				  ("%s: tap tune not passed on any duty_cycle value\n", __func__));
+		for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+			/* report fail for all active interfaces; multi-interface support - tbd */
+			flow_result[if_id] = TEST_FAILED;
+		}
+
+		return MV_FAIL;
+	}
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+			DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+					  ("calculating center of mass for subphy %d, valid window size %d\n",
+					   subphy_num, valid_win_size[if_id][subphy_num]));
+			if (valid_vref_cnt[if_id][subphy_num] > 0) {
+				rx_eye_hi_lvl[subphy_num] =
+					valid_vref_ptr[if_id][subphy_num][valid_vref_cnt[if_id][subphy_num] - 1];
+				rx_eye_lo_lvl[subphy_num] = valid_vref_ptr[if_id][subphy_num][0];
+				/* calculate center of mass sampling point (t, v) for each subphy */
+				status = mv_ddr4_center_of_mass_calc(dev_num, if_id, subphy_num, RX_DIR,
+								     dq_vref_start_win[if_id][subphy_num],
+								     dq_vref_end_win[if_id][subphy_num],
+								     valid_vref_ptr[if_id][subphy_num],
+								     valid_vref_cnt[if_id][subphy_num],
+								     &center_vref[if_id][subphy_num],
+								     &center_adll[if_id][subphy_num]);
+				if (status != MV_OK)
+					return status;
+
+				DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+						  ("center of mass results: vref %d, adll %d\n",
+						   center_vref[if_id][subphy_num], center_adll[if_id][subphy_num]));
+			} else {
+				DEBUG_CALIBRATION(DEBUG_LEVEL_ERROR,
+						  ("%s: no valid window found for cs %d, subphy %d\n",
+						   __func__, effective_cs, subphy_num));
+				return MV_FAIL;
+			}
+		}
+	}
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+			status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+						    ACCESS_TYPE_UNICAST, subphy_num, DDR_PHY_DATA,
+						    VREF_BCAST_PHY_REG(effective_cs),
+						    center_vref[if_id][subphy_num]);
+			if (status != MV_OK)
+				return status;
+
+			status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+						    ACCESS_TYPE_UNICAST, subphy_num, DDR_PHY_DATA,
+						    VREF_PHY_REG(effective_cs, DQSP_PAD),
+						    center_vref[if_id][subphy_num]);
+			if (status != MV_OK)
+				return status;
+
+			status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
+						    ACCESS_TYPE_UNICAST, subphy_num, DDR_PHY_DATA,
+						    VREF_PHY_REG(effective_cs, DQSN_PAD),
+						    center_vref[if_id][subphy_num]);
+			if (status != MV_OK)
+				return status;
+
+			DEBUG_CALIBRATION(DEBUG_LEVEL_INFO, ("final dc %d\n", center_vref[if_id][subphy_num]));
+		}
+
+		/* run centralization again with optimal vref to update global structures */
+		mv_ddr4_centralization(dev_num, lambda_per_dq, c_opt_per_bus, pbs_res_per_bus, valid_win_size,
+				       RX_DIR, 0, center_vref[if_id][0]);
+
+		for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+
+			const_pbs = 0xa;
+			mv_ddr4_dqs_reposition(RX_DIR, lambda_per_dq[if_id][subphy_num],
+					       pbs_res_per_bus[if_id][subphy_num], 0x0,
+					       &center_adll[if_id][subphy_num], &dqs_pbs);
+
+			/* dq pbs update */
+			for (dq_idx = 0; dq_idx < 8; dq_idx++) {
+				pad_num = dq_map_table[dq_idx +
+						       subphy_num * BUS_WIDTH_IN_BITS +
+						       if_id * BUS_WIDTH_IN_BITS * subphy_max];
+				status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, if_id, ACCESS_TYPE_UNICAST,
+							    subphy_num, DDR_PHY_DATA,
+							    0x50 + pad_num + effective_cs * 0x10,
+							    const_pbs + pbs_res_per_bus[if_id][subphy_num][dq_idx]);
+				if (status != MV_OK)
+					return status;
+			}
+
+			/* dqs pbs update */
+			status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_UNICAST, subphy_num,
+						    DDR_PHY_DATA, 0x54 + effective_cs * 0x10, dqs_pbs);
+			if (status != MV_OK)
+				return status;
+
+			status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_UNICAST, subphy_num,
+						    DDR_PHY_DATA, 0x55 + effective_cs * 0x10, dqs_pbs);
+			if (status != MV_OK)
+				return status;
+
+			status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, if_id, ACCESS_TYPE_UNICAST,
+						    subphy_num, DDR_PHY_DATA,
+						    CRX_PHY_REG(effective_cs),
+						    center_adll[if_id][subphy_num]);
+			if (status != MV_OK)
+				return status;
+
+#ifdef DDR_PHY_JIRA_ENABLE
+			if (effective_cs != 0) {
+				status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_UNICAST,
+							    subphy_num, DDR_PHY_DATA, 0x54 + 0 * 0x10,
+							    dqs_pbs_jira56[if_id][subphy_num]);
+				if (status != MV_OK)
+					return status;
+
+				status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_UNICAST,
+							    subphy_num, DDR_PHY_DATA, 0x55 + 0 * 0x10,
+							    dqs_pbs_jira56[if_id][subphy_num]);
+				if (status != MV_OK)
+					return status;
+			}
+#endif
+		}
+	}
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		/* report pass for all active interfaces; multi-interface support - tbd */
+		flow_result[if_id] = TEST_SUCCESS;
+	}
+
+#ifdef DDR_PHY_JIRA_ENABLE
+	if (effective_cs == (max_cs - 1)) {
+		/* adjust dqs to be as cs0 */
+		for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+			VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+			for (subphy_num = 0; subphy_num < subphy_max; subphy_num++) {
+				VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+				pbs_rx_avg = 0;
+				/* find average of all pbs of dqs and read ctr_x */
+				for (cs_index = 0; cs_index < max_cs; cs_index++) {
+					status = ddr3_tip_bus_read(dev_num, if_id, ACCESS_TYPE_UNICAST,
+								   subphy_num, DDR_PHY_DATA,
+								   0x54 + cs_index * 0x10,
+								   &pbs_temp[cs_index]);
+					if (status != MV_OK)
+						return status;
+
+					status = ddr3_tip_bus_read(dev_num, if_id, ACCESS_TYPE_UNICAST,
+								   subphy_num, DDR_PHY_DATA,
+								   0x3 + cs_index * 0x4,
+								   &ctr_x[cs_index]);
+					if (status != MV_OK)
+						return status;
+
+					pbs_rx_avg = pbs_rx_avg + pbs_temp[cs_index];
+				}
+
+				pbs_rx_avg = pbs_rx_avg / max_cs;
+
+				/* update pbs and ctr_x */
+				lambda_avg = (lambda_per_dq[if_id][subphy_num][0] +
+					      lambda_per_dq[if_id][subphy_num][1] +
+					      lambda_per_dq[if_id][subphy_num][2] +
+					      lambda_per_dq[if_id][subphy_num][3] +
+					      lambda_per_dq[if_id][subphy_num][4] +
+					      lambda_per_dq[if_id][subphy_num][5] +
+					      lambda_per_dq[if_id][subphy_num][6] +
+					      lambda_per_dq[if_id][subphy_num][7]) / 8;
+
+				for (cs_index = 0; cs_index < max_cs; cs_index++) {
+					status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST,
+								    0, ACCESS_TYPE_UNICAST,
+								    subphy_num, DDR_PHY_DATA,
+								    0x54 + cs_index * 0x10, pbs_rx_avg);
+					if (status != MV_OK)
+						return status;
+
+					status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST,
+								    0, ACCESS_TYPE_UNICAST,
+								    subphy_num, DDR_PHY_DATA,
+								    0x55 + cs_index * 0x10, pbs_rx_avg);
+					if (status != MV_OK)
+						return status;
+
+					/* update */
+					if (pbs_rx_avg >= pbs_temp[cs_index]) {
+						delta = ((pbs_rx_avg - pbs_temp[cs_index]) * lambda_avg) /
+							PBS_VAL_FACTOR;
+						if (ctr_x[cs_index] >= delta) {
+							ctr_x[cs_index] = ctr_x[cs_index] - delta;
+						} else {
+							ctr_x[cs_index] = 0;
+							DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+									  ("jira ddrphy56 extend fix(-) required %d\n",
+									   delta));
+						}
+					} else {
+						delta = ((pbs_temp[cs_index] - pbs_rx_avg) * lambda_avg) /
+							PBS_VAL_FACTOR;
+						if ((ctr_x[cs_index] + delta) > 32) {
+							ctr_x[cs_index] = 32;
+							DEBUG_CALIBRATION(DEBUG_LEVEL_INFO,
+									  ("jira ddrphy56 extend fix(+) required %d\n",
+									   delta));
+						} else {
+							ctr_x[cs_index] = (ctr_x[cs_index] + delta);
+						}
+					}
+					status = ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, if_id,
+								    ACCESS_TYPE_UNICAST, subphy_num, DDR_PHY_DATA,
+								    CRX_PHY_REG(effective_cs),
+								    ctr_x[cs_index]);
+					if (status != MV_OK)
+						return status;
+				}
+			}
+		}
+	}
+#endif
+
+	return MV_OK;
+}
+
+#define MAX_LOOPS			2 /* maximum number of loops to get to solution */
+#define LEAST_SIGNIFICANT_BYTE_MASK	0xff
+#define VW_SUBPHY_LIMIT_MIN		0
+#define VW_SUBPHY_LIMIT_MAX		127
+#define MAX_PBS_NUM			31 /* TODO: added by another patch */
+enum {
+	LOCKED,
+	UNLOCKED
+};
+enum {
+	PASS,
+	FAIL
+};
+
+int mv_ddr4_dm_tuning(u32 cs, u16 (*pbs_tap_factor)[MAX_BUS_NUM][BUS_WIDTH_IN_BITS])
+{
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	enum hws_training_ip_stat training_result;
+	enum hws_training_result result_type = RESULT_PER_BIT;
+	enum hws_search_dir search_dir;
+	enum hws_dir dir = OPER_WRITE;
+	int vw_sphy_hi_diff = 0;
+	int vw_sphy_lo_diff = 0;
+	int x, y;
+	int status;
+	unsigned int a, b, c;
+	u32 ctx_vector[MAX_BUS_NUM];
+	u32 subphy, bit, pattern;
+	u32 *result[MAX_BUS_NUM][HWS_SEARCH_DIR_LIMIT];
+	u32 max_win_size = MAX_WINDOW_SIZE_TX;
+	u32 dm_lambda[MAX_BUS_NUM] = {0};
+	u32 loop;
+	u32 adll_tap;
+	u32 dm_pbs, max_pbs;
+	u32 dq_pbs[BUS_WIDTH_IN_BITS];
+	u32 new_dq_pbs[BUS_WIDTH_IN_BITS];
+	u32 dq, pad;
+	u32 dq_pbs_diff;
+	u32 byte_center, dm_center;
+	u32 idx, reg_val;
+	u32 dm_pad = mv_ddr_dm_pad_get();
+	u8 subphy_max = ddr3_tip_dev_attr_get(0, MV_ATTR_OCTET_PER_INTERFACE);
+	u8 dm_vw_vector[MAX_BUS_NUM * ADLL_TAPS_PER_PERIOD];
+	u8 vw_sphy_lo_lmt[MAX_BUS_NUM];
+	u8 vw_sphy_hi_lmt[MAX_BUS_NUM];
+	u8 dm_status[MAX_BUS_NUM];
+
+	/* init */
+	for (subphy = 0; subphy < subphy_max; subphy++) {
+		VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy);
+		dm_status[subphy] = UNLOCKED;
+		for (bit = 0; bit < BUS_WIDTH_IN_BITS; bit++)
+			dm_lambda[subphy] += pbs_tap_factor[0][subphy][bit];
+		dm_lambda[subphy] /= BUS_WIDTH_IN_BITS;
+	}
+
+	/* get algorithm's adll result */
+	for (subphy = 0; subphy < subphy_max; subphy++) {
+		VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy);
+		ddr3_tip_bus_read(0, 0, ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+				  CTX_PHY_REG(cs), &reg_val);
+		ctx_vector[subphy] = reg_val;
+	}
+
+	for (loop = 0; loop < MAX_LOOPS; loop++) {
+		for (subphy = 0; subphy < subphy_max; subphy++) {
+			vw_sphy_lo_lmt[subphy] = VW_SUBPHY_LIMIT_MIN;
+			vw_sphy_hi_lmt[subphy] = VW_SUBPHY_LIMIT_MAX;
+			for (adll_tap = 0; adll_tap < ADLL_TAPS_PER_PERIOD; adll_tap++) {
+				idx = subphy * ADLL_TAPS_PER_PERIOD + adll_tap;
+				dm_vw_vector[idx] = PASS;
+			}
+		}
+
+		/* get valid window of dm signal */
+		mv_ddr_dm_vw_get(PATTERN_ZERO, cs, dm_vw_vector);
+		mv_ddr_dm_vw_get(PATTERN_ONE, cs, dm_vw_vector);
+
+		/* get vw for dm disable */
+		pattern = MV_DDR_IS_64BIT_DRAM_MODE(tm->bus_act_mask) ? 73 : 23;
+		ddr3_tip_ip_training_wrapper(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ACCESS_TYPE_MULTICAST,
+					     PARAM_NOT_CARE, result_type, HWS_CONTROL_ELEMENT_ADLL, PARAM_NOT_CARE,
+					     dir, tm->if_act_mask, 0x0, max_win_size - 1, max_win_size - 1, pattern,
+					     EDGE_FPF, CS_SINGLE, PARAM_NOT_CARE, &training_result);
+
+		/* find skew of dm signal vs. dq data bits using its valid window */
+		for (subphy = 0; subphy < subphy_max; subphy++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy);
+			ddr3_tip_bus_write(0, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+					   CTX_PHY_REG(cs), ctx_vector[subphy]);
+
+			for (search_dir = HWS_LOW2HIGH; search_dir <= HWS_HIGH2LOW; search_dir++) {
+				ddr3_tip_read_training_result(0, 0, ACCESS_TYPE_UNICAST, subphy,
+							      ALL_BITS_PER_PUP, search_dir, dir, result_type,
+							      TRAINING_LOAD_OPERATION_UNLOAD, CS_SINGLE,
+							      &(result[subphy][search_dir]),
+							      1, 0, 0);
+				DEBUG_DM_TUNING(DEBUG_LEVEL_INFO,
+						("dm cs %d if %d subphy %d result: 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
+						 cs, 0, subphy,
+						 result[subphy][search_dir][0],
+						 result[subphy][search_dir][1],
+						 result[subphy][search_dir][2],
+						 result[subphy][search_dir][3],
+						 result[subphy][search_dir][4],
+						 result[subphy][search_dir][5],
+						 result[subphy][search_dir][6],
+						 result[subphy][search_dir][7]));
+			}
+
+			if (dm_status[subphy] == LOCKED)
+				continue;
+
+			for (bit = 0; bit < BUS_WIDTH_IN_BITS; bit++) {
+				result[subphy][HWS_LOW2HIGH][bit] &= LEAST_SIGNIFICANT_BYTE_MASK;
+				result[subphy][HWS_HIGH2LOW][bit] &= LEAST_SIGNIFICANT_BYTE_MASK;
+
+				if (result[subphy][HWS_LOW2HIGH][bit] > vw_sphy_lo_lmt[subphy])
+					vw_sphy_lo_lmt[subphy] = result[subphy][HWS_LOW2HIGH][bit];
+
+				if (result[subphy][HWS_HIGH2LOW][bit] < vw_sphy_hi_lmt[subphy])
+					vw_sphy_hi_lmt[subphy] = result[subphy][HWS_HIGH2LOW][bit];
+			}
+
+			DEBUG_DM_TUNING(DEBUG_LEVEL_INFO,
+					("loop %d, dm subphy %d, vw %d, %d\n", loop, subphy,
+					 vw_sphy_lo_lmt[subphy], vw_sphy_hi_lmt[subphy]));
+
+			idx = subphy * ADLL_TAPS_PER_PERIOD;
+			status = mv_ddr_dm_to_dq_diff_get(vw_sphy_hi_lmt[subphy], vw_sphy_lo_lmt[subphy],
+							  &dm_vw_vector[idx], &vw_sphy_hi_diff, &vw_sphy_lo_diff);
+			if (status != MV_OK)
+				return MV_FAIL;
+			DEBUG_DM_TUNING(DEBUG_LEVEL_INFO,
+					("vw_sphy_lo_diff %d, vw_sphy_hi_diff %d\n",
+					 vw_sphy_lo_diff, vw_sphy_hi_diff));
+
+			/* dm is the strongest signal */
+			if ((vw_sphy_hi_diff >= 0) &&
+			    (vw_sphy_lo_diff >= 0)) {
+				dm_status[subphy] = LOCKED;
+			} else if ((vw_sphy_hi_diff >= 0) &&
+				   (vw_sphy_lo_diff < 0) &&
+				   (loop == 0)) { /* update dm only */
+				ddr3_tip_bus_read(0, 0, ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+						  PBS_TX_PHY_REG(cs, dm_pad), &dm_pbs);
+				x = -vw_sphy_lo_diff; /* get positive x */
+				a = (unsigned int)x * PBS_VAL_FACTOR;
+				b = dm_lambda[subphy];
+				if (round_div(a, b, &c) != MV_OK)
+					return MV_FAIL;
+				dm_pbs += (u32)c;
+				dm_pbs = (dm_pbs > MAX_PBS_NUM) ? MAX_PBS_NUM : dm_pbs;
+				ddr3_tip_bus_write(0, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_UNICAST,
+						   subphy, DDR_PHY_DATA,
+						   PBS_TX_PHY_REG(cs, dm_pad), dm_pbs);
+			} else if ((vw_sphy_hi_diff < 0) &&
+				   (vw_sphy_lo_diff >= 0) &&
+				   (loop == 0)) { /* update dq and c_opt */
+				max_pbs = 0;
+				for (dq = 0; dq < BUS_WIDTH_IN_BITS; dq++) {
+					idx = dq + subphy * BUS_WIDTH_IN_BITS;
+					pad = dq_map_table[idx];
+					ddr3_tip_bus_read(0, 0, ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+							  PBS_TX_PHY_REG(cs, pad), &reg_val);
+					dq_pbs[dq] = reg_val;
+					x = -vw_sphy_hi_diff; /* get positive x */
+					a = (unsigned int)x * PBS_VAL_FACTOR;
+					b = pbs_tap_factor[0][subphy][dq];
+					if (round_div(a, b, &c) != MV_OK)
+						return MV_FAIL;
+					new_dq_pbs[dq] = dq_pbs[dq] + (u32)c;
+					if (max_pbs < new_dq_pbs[dq])
+						max_pbs = new_dq_pbs[dq];
+				}
+
+				dq_pbs_diff = (max_pbs > MAX_PBS_NUM) ? (max_pbs - MAX_PBS_NUM) : 0;
+				for (dq = 0; dq < BUS_WIDTH_IN_BITS; dq++) {
+					idx = dq + subphy * BUS_WIDTH_IN_BITS;
+					/* reg_val is unsigned; check for underflow before subtracting */
+					if (new_dq_pbs[dq] < dq_pbs_diff) {
+						DEBUG_DM_TUNING(DEBUG_LEVEL_ERROR,
+								("unexpected negative value found\n"));
+						return MV_FAIL;
+					}
+					reg_val = new_dq_pbs[dq] - dq_pbs_diff;
+					pad = dq_map_table[idx];
+					ddr3_tip_bus_write(0, ACCESS_TYPE_UNICAST, 0,
+							   ACCESS_TYPE_UNICAST, subphy,
+							   DDR_PHY_DATA,
+							   PBS_TX_PHY_REG(cs, pad),
+							   reg_val);
+				}
+
+				a = dm_lambda[subphy];
+				b = dq_pbs_diff * PBS_VAL_FACTOR;
+				if (b > 0) {
+					if (round_div(a, b, &c) != MV_OK)
+						return MV_FAIL;
+					dq_pbs_diff = (u32)c;
+				}
+
+				x = (int)ctx_vector[subphy];
+				if (x < 0) {
+					DEBUG_DM_TUNING(DEBUG_LEVEL_ERROR,
+							("unexpected negative value found\n"));
+					return MV_FAIL;
+				}
+				y = (int)dq_pbs_diff;
+				if (y < 0) {
+					DEBUG_DM_TUNING(DEBUG_LEVEL_ERROR,
+							("unexpected negative value found\n"));
+					return MV_FAIL;
+				}
+				x += (y + vw_sphy_hi_diff) / 2;
+				x %= ADLL_TAPS_PER_PERIOD;
+				ctx_vector[subphy] = (u32)x;
+			} else if (((vw_sphy_hi_diff < 0) && (vw_sphy_lo_diff < 0)) ||
+				   (loop == 1)) { /* dm is the weakest signal */
+				/* update dq and c_opt */
+				dm_status[subphy] = LOCKED;
+				byte_center = (vw_sphy_lo_lmt[subphy] + vw_sphy_hi_lmt[subphy]) / 2;
+				x = (int)byte_center;
+				if (x < 0) {
+					DEBUG_DM_TUNING(DEBUG_LEVEL_ERROR,
+							("unexpected negative value found\n"));
+					return MV_FAIL;
+				}
+				x += (vw_sphy_hi_diff - vw_sphy_lo_diff) / 2;
+				if (x < 0) {
+					DEBUG_DM_TUNING(DEBUG_LEVEL_ERROR,
+							("unexpected negative value found\n"));
+					return MV_FAIL;
+				}
+				dm_center = (u32)x;
+
+				if (byte_center > dm_center) {
+					max_pbs = 0;
+					for (dq = 0; dq < BUS_WIDTH_IN_BITS; dq++) {
+						pad = dq_map_table[dq + subphy * BUS_WIDTH_IN_BITS];
+						ddr3_tip_bus_read(0, 0, ACCESS_TYPE_UNICAST,
+								  subphy, DDR_PHY_DATA,
+								  PBS_TX_PHY_REG(cs, pad),
+								  &reg_val);
+						dq_pbs[dq] = reg_val;
+						a = (byte_center - dm_center) * PBS_VAL_FACTOR;
+						b = pbs_tap_factor[0][subphy][dq];
+						if (round_div(a, b, &c) != MV_OK)
+							return MV_FAIL;
+						new_dq_pbs[dq] = dq_pbs[dq] + (u32)c;
+						if (max_pbs < new_dq_pbs[dq])
+							max_pbs = new_dq_pbs[dq];
+					}
+
+					dq_pbs_diff = (max_pbs > MAX_PBS_NUM) ? (max_pbs - MAX_PBS_NUM) : 0;
+					for (dq = 0; dq < BUS_WIDTH_IN_BITS; dq++) {
+						idx = dq + subphy * BUS_WIDTH_IN_BITS;
+						pad = dq_map_table[idx];
+						reg_val = new_dq_pbs[dq] - dq_pbs_diff;
+						if (reg_val < 0) {
+							DEBUG_DM_TUNING(DEBUG_LEVEL_ERROR,
+									("unexpected negative value found\n"));
+							return MV_FAIL;
+						}
+						ddr3_tip_bus_write(0, ACCESS_TYPE_UNICAST, 0,
+								   ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+								   PBS_TX_PHY_REG(cs, pad),
+								   reg_val);
+					}
+					ctx_vector[subphy] = dm_center % ADLL_TAPS_PER_PERIOD;
+				} else {
+					ddr3_tip_bus_read(0, 0, ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+							  PBS_TX_PHY_REG(cs, dm_pad), &dm_pbs);
+					a = (dm_center - byte_center) * PBS_VAL_FACTOR;
+					b = dm_lambda[subphy];
+					if (round_div(a, b, &c) != MV_OK)
+						return MV_FAIL;
+					dm_pbs += (u32)c;
+					ddr3_tip_bus_write(0, ACCESS_TYPE_UNICAST, 0,
+							   ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+							   PBS_TX_PHY_REG(cs, dm_pad), dm_pbs);
+				}
+			} else {
+				/* below is the check whether dm signal per subphy converged or not */
+			}
+		}
+	}
+
+	for (subphy = 0; subphy < subphy_max; subphy++) {
+		VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy);
+		ddr3_tip_bus_write(0, ACCESS_TYPE_UNICAST, 0, ACCESS_TYPE_UNICAST, subphy, DDR_PHY_DATA,
+				   CTX_PHY_REG(cs), ctx_vector[subphy]);
+	}
+
+	for (subphy = 0; subphy < subphy_max; subphy++) {
+		VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy);
+		if (dm_status[subphy] != LOCKED) {
+			DEBUG_DM_TUNING(DEBUG_LEVEL_ERROR,
+					("no convergence for dm signal[%u] found\n", subphy));
+			return MV_FAIL;
+		}
+	}
+
+	return MV_OK;
+}
+void refresh(void)
+{
+	u32 data_read[MAX_INTERFACE_NUM];
+	ddr3_tip_if_read(0, ACCESS_TYPE_UNICAST, 0, ODPG_DATA_CTRL_REG, data_read, MASK_ALL_BITS);
+
+	/* Refresh Command for CS0 */
+	ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, ODPG_DATA_CTRL_REG, (0 << 26), (3 << 26));
+	ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, SDRAM_OP_REG, 0xe02, 0xf1f);
+	if (ddr3_tip_if_polling(0, ACCESS_TYPE_UNICAST, 0, 0, 0x1f, SDRAM_OP_REG, MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("DDR3 poll failed"));
+
+	/* Refresh Command for CS1 */
+	ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, ODPG_DATA_CTRL_REG, (1 << 26), (3 << 26));
+	ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, SDRAM_OP_REG, 0xd02, 0xf1f);
+	if (ddr3_tip_if_polling(0, ACCESS_TYPE_UNICAST, 0, 0, 0x1f, SDRAM_OP_REG, MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("DDR3 poll failed"));
+
+	/* Restore Register */
+	ddr3_tip_if_write(0, ACCESS_TYPE_UNICAST, 0, ODPG_DATA_CTRL_REG, data_read[0], MASK_ALL_BITS);
+}
+#endif /* CONFIG_DDR4 */
diff --git a/drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.h b/drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.h
new file mode 100644
index 0000000000..da4a866fe9
--- /dev/null
+++ b/drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) Marvell International Ltd. and its affiliates
+ */
+
+#ifndef _MV_DDR4_TRAINING_CALIBRATION_H
+#define _MV_DDR4_TRAINING_CALIBRATION_H
+
+/* vref subphy calibration state */
+enum mv_ddr4_vref_subphy_cal_state {
+	MV_DDR4_VREF_SUBPHY_CAL_ABOVE,
+	MV_DDR4_VREF_SUBPHY_CAL_UNDER,
+	MV_DDR4_VREF_SUBPHY_CAL_INSIDE,
+	MV_DDR4_VREF_SUBPHY_CAL_END
+};
+
+/* calibrate DDR4 dq vref (tx) */
+int mv_ddr4_dq_vref_calibration(u8 dev_num, u16 (*pbs_tap_factor)[MAX_BUS_NUM][BUS_WIDTH_IN_BITS]);
+
+/* calibrate receiver (receiver duty cycle) */
+int mv_ddr4_receiver_calibration(u8 dev_num);
+
+/* tune dm signal */
+int mv_ddr4_dm_tuning(u32 cs, u16 (*pbs_tap_factor)[MAX_BUS_NUM][BUS_WIDTH_IN_BITS]);
+
+#endif /* _MV_DDR4_TRAINING_CALIBRATION_H */
diff --git a/drivers/ddr/marvell/a38x/mv_ddr4_training_db.c b/drivers/ddr/marvell/a38x/mv_ddr4_training_db.c
new file mode 100644
index 0000000000..1baa63a2d8
--- /dev/null
+++ b/drivers/ddr/marvell/a38x/mv_ddr4_training_db.c
@@ -0,0 +1,545 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) Marvell International Ltd. and its affiliates
+ */
+
+#if defined(CONFIG_DDR4)
+
+/* DDR4 Training Database */
+
+#include "ddr_ml_wrapper.h"
+
+#include "mv_ddr_topology.h"
+#include "mv_ddr_training_db.h"
+#include "ddr_topology_def.h"
+
+/* list of allowed frequencies listed in order of enum mv_ddr_freq */
+static unsigned int freq_val[MV_DDR_FREQ_LAST] = {
+	130,	/* MV_DDR_FREQ_LOW_FREQ */
+	650,	/* MV_DDR_FREQ_650 */
+	666,	/* MV_DDR_FREQ_667 */
+	800,	/* MV_DDR_FREQ_800 */
+	933,	/* MV_DDR_FREQ_933 */
+	1066,	/* MV_DDR_FREQ_1066 */
+	900,	/* MV_DDR_FREQ_900 */
+	1000,	/* MV_DDR_FREQ_1000 */
+	1050,	/* MV_DDR_FREQ_1050 */
+	1200,	/* MV_DDR_FREQ_1200 */
+	1333,	/* MV_DDR_FREQ_1333 */
+	1466,	/* MV_DDR_FREQ_1466 */
+	1600	/* MV_DDR_FREQ_1600 */
+};
+
+unsigned int *mv_ddr_freq_tbl_get(void)
+{
+	return &freq_val[0];
+}
+
+u32 mv_ddr_freq_get(enum mv_ddr_freq freq)
+{
+	return freq_val[freq];
+}
+
+/* non-dbi mode - table for cl values per frequency for each speed bin index */
+static struct mv_ddr_cl_val_per_freq cl_table[] = {
+/*   130   650   667   800   933   1067   900   1000   1050   1200   1333   1466   1600 FREQ(MHz)*/
+/*   7.69  1.53  1.5   1.25  1.07  0.937  1.11	1	   0.95   0.83	 0.75	0.68   0.625 TCK(ns)*/
+	{{10,  10,	 10,   0,	 0,    0,	  0,	0,	   0,	  0,	 0,	    0,	   0} },/* SPEED_BIN_DDR_1600J */
+	{{10,  11,	 11,   0,	 0,    0,	  0,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1600K */
+	{{10,  12,	 12,   0,	 0,    0,	  0,	0,	   0,	  0,	 0,	    0,	   0} },/* SPEED_BIN_DDR_1600L */
+	{{10,  12,	 12,   12,	 0,    0,	  0,	0,	   0,	  0,	 0,	    0,	   0} },/* SPEED_BIN_DDR_1866L */
+	{{10,  12,	 12,   13,	 0,    0,	  0,	0,	   0,	  0,	 0,	    0,	   0} },/* SPEED_BIN_DDR_1866M */
+	{{10,  12,	 12,   14,	 0,    0,	  0,	0,	   0,	  0,	 0,	    0,	   0} },/* SPEED_BIN_DDR_1866N */
+	{{10,  10,	 10,   12,	 14,   14,	  14,	14,	   14,	  0,	 0,	    0,	   0} },/* SPEED_BIN_DDR_2133N */
+	{{10,  9,	 9,    12,	 14,   15,	  14,	15,	   15,	  0,	 0,	    0,	   0} },/* SPEED_BIN_DDR_2133P */
+	{{10,  10,	 10,   12,	 14,   16,	  14,	16,	   16,	  0,	 0,	    0,	   0} },/* SPEED_BIN_DDR_2133R */
+	{{10,  10,	 10,   12,	 14,   16,	  14,	16,	   16,	  18,	 0,	    0,	   0} },/* SPEED_BIN_DDR_2400P */
+	{{10,  9,	 9,    11,	 13,   15,	  13,	15,	   15,	  18,	 0,	    0,	   0} },/* SPEED_BIN_DDR_2400R */
+	{{10,  9,	 9,    11,	 13,   15,	  13,	15,	   15,	  17,	 0,	    0,	   0} },/* SPEED_BIN_DDR_2400T */
+	{{10,  10,	 10,   12,	 14,   16,	  14,	16,	   16,	  18,	 0,	    0,	   0} },/* SPEED_BIN_DDR_2400U */
+	{{10,  10,   10,   11,   13,   15,    13,   15,    15,    16,    17,    0,     0} },/* SPEED_BIN_DDR_2666T */
+	{{10,  9,    10,   11,   13,   15,    13,   15,    15,    17,    18,    0,     0} },/* SPEED_BIN_DDR_2666U */
+	{{10,  9,    10,   12,   14,   16,    14,   16,    16,    18,    19,    0,     0} },/* SPEED_BIN_DDR_2666V */
+	{{10,  10,   10,   12,   14,   16,    14,   16,    16,    18,    20,    0,     0} },/* SPEED_BIN_DDR_2666W */
+	{{10,  10,   9,    11,   13,   15,    13,   15,    15,    16,    18,    19,    0} },/* SPEED_BIN_DDR_2933V */
+	{{10,  9,    10,   11,   13,   15,    13,   15,    15,    17,    19,    20,    0} },/* SPEED_BIN_DDR_2933W */
+	{{10,  9,    10,   12,   14,   16,    14,   16,    16,    18,    20,    21,    0} },/* SPEED_BIN_DDR_2933Y */
+	{{10,  10,   10,   12,   14,   16,    14,   16,    16,    18,    20,    22,    0} },/* SPEED_BIN_DDR_2933AA*/
+	{{10,  10,   9,    11,   13,   15,    13,   15,    15,    16,    18,    20,    20} },/* SPEED_BIN_DDR_3200W */
+	{{10,  9,    0,    11,   13,   15,    13,   15,    15,    17,    19,    22,    22} },/* SPEED_BIN_DDR_3200AA*/
+	{{10,  9,    10,   12,   14,   16,    14,   16,    16,    18,    20,    24,    24} } /* SPEED_BIN_DDR_3200AC*/
+
+};
+
+u32 mv_ddr_cl_val_get(u32 index, u32 freq)
+{
+	return cl_table[index].cl_val[freq];
+}
+
+/* dbi mode - table for cl values per frequency for each speed bin index */
+struct mv_ddr_cl_val_per_freq cas_latency_table_dbi[] = {
+/*	 130   650   667   800   933   1067   900   1000   1050   1200   1333   1466   1600 FREQ(MHz)*/
+/*	 7.69  1.53  1.5   1.25  1.07  0.937  1.11  1      0.95   0.83   0.75   0.68   0.625 TCK(ns)*/
+	{{0,   12,	 12,   0,	 0,	   0,	  0,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1600J */
+	{{0,   13,	 13,   0,	 0,	   0,	  0,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1600K */
+	{{0,   14,	 14,   0,	 0,	   0,	  0,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1600L */
+	{{0,   14,	 14,   14,	 0,	   0,	  14,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1866L */
+	{{0,   14,	 14,   15,	 0,	   0,	  15,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1866M */
+	{{0,   14,	 14,   16,	 0,	   0,	  16,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1866N */
+	{{0,   12,	 12,   14,	 16,	  17,	  14,	17,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_2133N */
+	{{0,   11,	 11,   14,	 16,	  18,	  14,	18,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_2133P */
+	{{0,   12,	 12,   14,	 16,	  19,	  14,	19,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_2133R */
+	{{0,   12,	 12,   14,	 16,	  19,	  14,	19,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_2400P */
+	{{0,   11,	 11,   13,	 15,	  18,	  13,	18,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_2400R */
+	{{0,   11,	 11,   13,	 15,	  18,	  13,	18,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_2400T */
+	{{0,   12,	 12,   14,	 16,	  19,	  14,	19,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_2400U */
+	{{10,  10,   11,   13,   15,   18,    13,   18,    18,    19,    20,    0,     0} },/* SPEED_BIN_DDR_2666T */
+	{{10,  9,    11,   13,   15,   18,    13,   18,    18,    20,    21,    0,     0} },/* SPEED_BIN_DDR_2666U */
+	{{10,  9,    12,   14,   16,   19,    14,   19,    19,    21,    22,    0,     0} },/* SPEED_BIN_DDR_2666V */
+	{{10,  10,   12,   14,   16,   19,    14,   19,    19,    21,    23,    0,     0} },/* SPEED_BIN_DDR_2666W */
+	{{10,  10,   11,   13,   15,   18,    15,   18,    18,    19,    21,    23,    0} },/* SPEED_BIN_DDR_2933V */
+	{{10,  9,    12,   13,   15,   18,    15,   18,    18,    20,    22,    24,    0} },/* SPEED_BIN_DDR_2933W */
+	{{10,  9,    12,   14,   16,   19,    16,   19,    19,    21,    23,    26,    0} },/* SPEED_BIN_DDR_2933Y */
+	{{10,  10,   12,   14,   16,   19,    16,   19,    19,    21,    23,    26,    0} },/* SPEED_BIN_DDR_2933AA*/
+	{{10,  10,   11,   13,   15,   18,    15,   18,    18,    19,    21,    24,    24} },/* SPEED_BIN_DDR_3200W */
+	{{10,  9,    0,    13,   15,   18,    15,   18,    18,    20,    22,    26,    26} },/* SPEED_BIN_DDR_3200AA*/
+	{{10,  9,    12,   14,   16,   19,    16,   19,    19,    21,    23,    28,    28} } /* SPEED_BIN_DDR_3200AC*/
+};
+
+/* table for cwl values per speed bin index */
+static struct mv_ddr_cl_val_per_freq cwl_table[] = {
+/*	 130   650   667   800   933   1067   900   1000   1050   1200   1333  1466   1600 FREQ(MHz)*/
+/*	7.69   1.53  1.5   1.25  1.07  0.937  1.11  1      0.95   0.83   0.75  0.68   0.625 TCK(ns)*/
+	{{9,   9,	 9,	   0,	 0,    0,	  0,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1600J */
+	{{9,   9,	 9,	   0,	 0,    0,	  0,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1600K */
+	{{9,   9,	 9,	   0,	 0,    0,	  0,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1600L */
+	{{9,   9,	 9,	   10,	 0,    0,	  10,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1866L */
+	{{9,   9,	 9,	   10,	 0,    0,	  10,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1866M */
+	{{9,   9,	 9,	   10,	 0,    0,	  10,	0,	   0,	  0,	 0,	    0,     0} },/* SPEED_BIN_DDR_1866N */
+	{{9,   9,	 9,	   9,	 10,   11,	  10,	11,	   10,	  11,	 0,	    0,     0} },/* SPEED_BIN_DDR_2133N */
+	{{9,   9,	 9,	   9,	 10,   11,	  10,	11,	   10,	  11,	 0,	    0,     0} },/* SPEED_BIN_DDR_2133P */
+	{{9,   9,	 9,	   10,	 10,   11,	  10,	11,	   10,	  11,	 0,	    0,     0} },/* SPEED_BIN_DDR_2133R */
+	{{9,   9,	 9,	   9,	 10,   11,	  10,	11,	   10,	  12,	 0,	    0,     0} },/* SPEED_BIN_DDR_2400P */
+	{{9,   9,	 9,	   9,	 10,   11,	  10,	11,	   10,	  12,	 0,	    0,     0} },/* SPEED_BIN_DDR_2400R */
+	{{9,   9,	 9,	   9,	 10,   11,	  10,	11,	   10,	  12,	 0,	    0,     0} },/* SPEED_BIN_DDR_2400T */
+	{{9,   9,	 9,	   9,	 10,   11,	  10,	11,	   10,	  12,	 0,	    0,     0} },/* SPEED_BIN_DDR_2400U */
+	{{10,  10,   9,    9,    10,   11,    10,   11,    11,    12,    14,    0,     0} },/* SPEED_BIN_DDR_2666T */
+	{{10,  9,    9,    9,    10,   11,    10,   11,    11,    12,    14,    0,     0} },/* SPEED_BIN_DDR_2666U */
+	{{10,  9,    9,    9,    10,   11,    10,   11,    11,    12,    14,    0,     0} },/* SPEED_BIN_DDR_2666V */
+	{{10,  10,   9,    9,    10,   11,    10,   11,    11,    12,    14,    0,     0} },/* SPEED_BIN_DDR_2666W */
+	{{10,  10,   9,    9,    10,   11,    10,   11,    11,    12,    14,    16,    0} },/* SPEED_BIN_DDR_2933V */
+	{{10,  9,    9,    9,    10,   11,    10,   11,    11,    12,    14,    16,    0} },/* SPEED_BIN_DDR_2933W */
+	{{10,  9,    9,    9,    10,   11,    10,   11,    11,    12,    14,    16,    0} },/* SPEED_BIN_DDR_2933Y */
+	{{10,  10,   9,    9,    10,   11,    10,   11,    11,    12,    14,    16,    0} },/* SPEED_BIN_DDR_2933AA*/
+	{{10,  10,   9,    9,    10,   11,    10,   11,    11,    12,    14,    16,    16} },/* SPEED_BIN_DDR_3200W */
+	{{10,  9,    9,    9,    10,   11,    10,   11,    11,    12,    14,    16,    16} },/* SPEED_BIN_DDR_3200AA*/
+	{{10,  9,    9,    9,    10,   11,    10,   11,    11,    12,    14,    16,    16} } /* SPEED_BIN_DDR_3200AC*/
+};
+
+u32 mv_ddr_cwl_val_get(u32 index, u32 freq)
+{
+	return cwl_table[index].cl_val[freq];
+}
+
+/*
+ * rfc values, ns
+ * note: values per JEDEC speed bin 1866; TODO: check it
+ */
+static unsigned int rfc_table[] = {
+	0,	/* placeholder */
+	0,	/* placeholder */
+	160,	/* 2G */
+	260,	/* 4G */
+	350,	/* 8G */
+	0,	/* TODO: placeholder for 16-Gbit die capacity */
+	0,	/* TODO: placeholder for 32-Gbit die capacity */
+	0,	/* TODO: placeholder for 12-Gbit die capacity */
+	0	/* TODO: placeholder for 24-Gbit die capacity */
+};
+
+u32 mv_ddr_rfc_get(u32 mem)
+{
+	return rfc_table[mem];
+}
+
+u16 rtt_table[] = {
+	0xffff,
+	60,
+	120,
+	40,
+	240,
+	48,
+	80,
+	34
+};
+
+u8 twr_mask_table[] = {
+	0xa,
+	0xa,
+	0xa,
+	0xa,
+	0xa,
+	0xa,
+	0xa,
+	0xa,
+	0xa,
+	0xa,
+	0x0,	/* 10 */
+	0xa,
+	0x1,	/* 12 */
+	0xa,
+	0x2,	/* 14 */
+	0xa,
+	0x3,	/* 16 */
+	0xa,
+	0x4,	/* 18 */
+	0xa,
+	0x5,	/* 20 */
+	0xa,
+	0xa,	/* 22 */
+	0xa,
+	0x6	/* 24 */
+};
+
+u8 cl_mask_table[] = {
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x1,	/* 10 */
+	0x2,
+	0x3,	/* 12 */
+	0x4,
+	0x5,	/* 14 */
+	0x6,
+	0x7,	/* 16 */
+	0xd,
+	0x8,	/* 18 */
+	0x0,
+	0x9,	/* 20 */
+	0x0,
+	0xa,	/* 22 */
+	0x0,
+	0xb	/* 24 */
+};
+
+u8 cwl_mask_table[] = {
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x0,
+	0x1,	/* 10 */
+	0x2,
+	0x3,	/* 12 */
+	0x0,
+	0x4,	/* 14 */
+	0x0,
+	0x5,	/* 16 */
+	0x0,
+	0x6	/* 18 */
+};
+
+u32 speed_bin_table_t_rcd_t_rp[] = {
+	12500,
+	13750,
+	15000,
+	12850,
+	13920,
+	15000,
+	13130,
+	14060,
+	15000,
+	12500,
+	13320,
+	14160,
+	15000,
+	12750,
+	13500,
+	14250,
+	15000,
+	12960,
+	13640,
+	14320,
+	15000,
+	12500,
+	13750,
+	15000
+};
+
+u32 speed_bin_table_t_rc[] = {
+	47500,
+	48750,
+	50000,
+	46850,
+	47920,
+	49000,
+	46130,
+	47060,
+	48000,
+	44500,
+	45320,
+	46160,
+	47000,
+	44750,
+	45500,
+	46250,
+	47000,
+	44960,
+	45640,
+	46320,
+	47000,
+	44500,
+	45750,
+	47000
+};
+
+static struct mv_ddr_page_element page_tbl[] = {
+	/* 8-bit, 16-bit page size */
+	{MV_DDR_PAGE_SIZE_1K, MV_DDR_PAGE_SIZE_2K}, /* 512M */
+	{MV_DDR_PAGE_SIZE_1K, MV_DDR_PAGE_SIZE_2K}, /* 1G */
+	{MV_DDR_PAGE_SIZE_1K, MV_DDR_PAGE_SIZE_2K}, /* 2G */
+	{MV_DDR_PAGE_SIZE_1K, MV_DDR_PAGE_SIZE_2K}, /* 4G */
+	{MV_DDR_PAGE_SIZE_1K, MV_DDR_PAGE_SIZE_2K}, /* 8G */
+	{0, 0}, /* TODO: placeholder for 16-Gbit die capacity */
+	{0, 0}, /* TODO: placeholder for 32-Gbit die capacity */
+	{0, 0}, /* TODO: placeholder for 12-Gbit die capacity */
+	{0, 0}  /* TODO: placeholder for 24-Gbit die capacity */
+};
+
+u32 mv_ddr_page_size_get(enum mv_ddr_dev_width bus_width, enum mv_ddr_die_capacity mem_size)
+{
+	if (bus_width == MV_DDR_DEV_WIDTH_8BIT)
+		return page_tbl[mem_size].page_size_8bit;
+	else
+		return page_tbl[mem_size].page_size_16bit;
+}
+
+/* DLL locking time, tDLLK */
+#define MV_DDR_TDLLK_DDR4_1600	597
+#define MV_DDR_TDLLK_DDR4_1866	597
+#define MV_DDR_TDLLK_DDR4_2133	768
+#define MV_DDR_TDLLK_DDR4_2400	768
+#define MV_DDR_TDLLK_DDR4_2666	854
+#define MV_DDR_TDLLK_DDR4_2933	940
+#define MV_DDR_TDLLK_DDR4_3200	1024
+static int mv_ddr_tdllk_get(unsigned int freq, unsigned int *tdllk)
+{
+	if (freq >= 1600)
+		*tdllk = MV_DDR_TDLLK_DDR4_3200;
+	else if (freq >= 1466)
+		*tdllk = MV_DDR_TDLLK_DDR4_2933;
+	else if (freq >= 1333)
+		*tdllk = MV_DDR_TDLLK_DDR4_2666;
+	else if (freq >= 1200)
+		*tdllk = MV_DDR_TDLLK_DDR4_2400;
+	else if (freq >= 1066)
+		*tdllk = MV_DDR_TDLLK_DDR4_2133;
+	else if (freq >= 933)
+		*tdllk = MV_DDR_TDLLK_DDR4_1866;
+	else if (freq >= 800)
+		*tdllk = MV_DDR_TDLLK_DDR4_1600;
+	else {
+		printf("error: %s: unsupported data rate found\n", __func__);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* return speed bin value for selected index and element */
+unsigned int mv_ddr_speed_bin_timing_get(enum mv_ddr_speed_bin index, enum mv_ddr_speed_bin_timing element)
+{
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	unsigned int freq;
+	u32 result = 0;
+
+	/* get frequency in MHz */
+	freq = mv_ddr_freq_get(tm->interface_params[0].memory_freq);
+
+	switch (element) {
+	case SPEED_BIN_TRCD:
+	case SPEED_BIN_TRP:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TRCD_MIN];
+		else
+			result = speed_bin_table_t_rcd_t_rp[index];
+		break;
+	case SPEED_BIN_TRAS:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TRAS_MIN];
+		else {
+			if (index <= SPEED_BIN_DDR_1600L)
+				result = 35000;
+			else if (index <= SPEED_BIN_DDR_1866N)
+				result = 34000;
+			else if (index <= SPEED_BIN_DDR_2133R)
+				result = 33000;
+			else
+				result = 32000;
+		}
+		break;
+	case SPEED_BIN_TRC:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TRC_MIN];
+		else
+			result = speed_bin_table_t_rc[index];
+		break;
+	case SPEED_BIN_TRRD0_5K:
+	case SPEED_BIN_TRRD1K:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TRRD_S_MIN];
+		else {
+			if (index <= SPEED_BIN_DDR_1600L)
+				result = 5000;
+			else if (index <= SPEED_BIN_DDR_1866N)
+				result = 4200;
+			else if (index <= SPEED_BIN_DDR_2133R)
+				result = 3700;
+			else if (index <= SPEED_BIN_DDR_2400U)
+				result = 3500;
+			else if (index <= SPEED_BIN_DDR_2666W)
+				result = 3000;
+			else if (index <= SPEED_BIN_DDR_2933AA)
+				result = 2700;
+			else
+				result = 2500;
+		}
+		break;
+	case SPEED_BIN_TRRD2K:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TRRD_S_MIN];
+		else {
+			if (index <= SPEED_BIN_DDR_1600L)
+				result = 6000;
+			else
+				result = 5300;
+		}
+
+		break;
+	case SPEED_BIN_TRRDL0_5K:
+	case SPEED_BIN_TRRDL1K:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TRRD_L_MIN];
+		else {
+			if (index <= SPEED_BIN_DDR_1600L)
+				result = 6000;
+			else if (index <= SPEED_BIN_DDR_2133R)
+				result = 5300;
+			else
+				result = 4900;
+		}
+		break;
+	case SPEED_BIN_TRRDL2K:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TRRD_L_MIN];
+		else {
+			if (index <= SPEED_BIN_DDR_1600L)
+				result = 7500;
+			else
+				result = 6400;
+		}
+		break;
+	case SPEED_BIN_TPD:
+		result = 5000;
+		break;
+	case SPEED_BIN_TFAW0_5K:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TFAW_MIN];
+		else {
+			if (index <= SPEED_BIN_DDR_1600L)
+				result = 20000;
+			else if (index <= SPEED_BIN_DDR_1866N)
+				result = 17000;
+			else if (index <= SPEED_BIN_DDR_2133R)
+				result = 15000;
+			else if (index <= SPEED_BIN_DDR_2400U)
+				result = 13000;
+			else if (index <= SPEED_BIN_DDR_2666W)
+				result = 12000;
+			else if (index <= SPEED_BIN_DDR_2933AA)
+				result = 10875;
+			else
+				result = 10000;
+		}
+		break;
+	case SPEED_BIN_TFAW1K:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TFAW_MIN];
+		else {
+			if (index <= SPEED_BIN_DDR_1600L)
+				result = 25000;
+			else if (index <= SPEED_BIN_DDR_1866N)
+				result = 23000;
+			else
+				result = 21000;
+		}
+		break;
+	case SPEED_BIN_TFAW2K:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TFAW_MIN];
+		else {
+			if (index <= SPEED_BIN_DDR_1600L)
+				result = 35000;
+			else
+				result = 30000;
+		}
+		break;
+	case SPEED_BIN_TWTR:
+		result = 2500;
+		/* FIXME: wa: set twtr_s to a default value, if it's unset on spd */
+		if (tm->cfg_src == MV_DDR_CFG_SPD && tm->timing_data[MV_DDR_TWTR_S_MIN])
+			result = tm->timing_data[MV_DDR_TWTR_S_MIN];
+		break;
+	case SPEED_BIN_TWTRL:
+	case SPEED_BIN_TRTP:
+		result = 7500;
+		/* FIXME: wa: set twtr_l to a default value, if it's unset on spd */
+		if (tm->cfg_src == MV_DDR_CFG_SPD && tm->timing_data[MV_DDR_TWTR_L_MIN])
+			result = tm->timing_data[MV_DDR_TWTR_L_MIN];
+		break;
+	case SPEED_BIN_TWR:
+	case SPEED_BIN_TMOD:
+		result = 15000;
+		/* FIXME: wa: set twr to a default value, if it's unset on spd */
+		if (tm->cfg_src == MV_DDR_CFG_SPD && tm->timing_data[MV_DDR_TWR_MIN])
+			result = tm->timing_data[MV_DDR_TWR_MIN];
+		break;
+	case SPEED_BIN_TXPDLL:
+		result = 24000;
+		break;
+	case SPEED_BIN_TXSDLL:
+		if (mv_ddr_tdllk_get(freq, &result))
+			result = 0;
+		break;
+	case SPEED_BIN_TCCDL:
+		if (tm->cfg_src == MV_DDR_CFG_SPD)
+			result = tm->timing_data[MV_DDR_TCCD_L_MIN];
+		else {
+			if (index <= SPEED_BIN_DDR_1600L)
+				result = 6250;
+			else if (index <= SPEED_BIN_DDR_2133R)
+				result = 5355;
+			else
+				result = 5000;
+		}
+		break;
+	default:
+		printf("error: %s: invalid element [%d] found\n", __func__, (int)element);
+		break;
+	}
+
+	return result;
+}
+#endif /* CONFIG_DDR4 */
diff --git a/drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.c b/drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.c
new file mode 100644
index 0000000000..6d5b942fa1
--- /dev/null
+++ b/drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.c
@@ -0,0 +1,441 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) Marvell International Ltd. and its affiliates
+ */
+
+#if defined(CONFIG_DDR4)
+
+#include "ddr3_init.h"
+#include "mv_ddr_regs.h"
+
+static int mv_ddr4_dynamic_pb_wl_supp(u32 dev_num, enum mv_wl_supp_mode ecc_mode);
+
+/* compare test for ddr4 write leveling supplementary */
+#define MV_DDR4_COMP_TEST_NO_RESULT	0
+#define MV_DDR4_COMP_TEST_RESULT_0	1
+#define MV_DDR4_XSB_COMP_PATTERNS_NUM	8
+
+static u8 mv_ddr4_xsb_comp_test(u32 dev_num, u32 subphy_num, u32 if_id,
+				enum mv_wl_supp_mode ecc_mode)
+{
+	u32 wl_invert;
+	u8 pb_key, bit, bit_max, word;
+	struct pattern_info *pattern_table = ddr3_tip_get_pattern_table();
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	u32 subphy_max = ddr3_tip_dev_attr_get(0, MV_ATTR_OCTET_PER_INTERFACE);
+	uint64_t read_pattern_64[MV_DDR4_XSB_COMP_PATTERNS_NUM] = {0};
+	/*
+	 * FIXME: the pattern below was used for writing to the memory
+	 * by the cpu; as a workaround, it is now written through
+	 * the odpg:
+	 * uint64_t pattern_test_table_64[MV_DDR4_XSB_COMP_PATTERNS_NUM] = {
+	 *	0xffffffffffffffff,
+	 *	0xffffffffffffffff,
+	 *	0x0000000000000000,
+	 *	0x0000000000000000,
+	 *	0x0000000000000000,
+	 *	0x0000000000000000,
+	 *	0xffffffffffffffff,
+	 *	0xffffffffffffffff};
+	 */
+	u32 read_pattern[MV_DDR4_XSB_COMP_PATTERNS_NUM];
+	/*u32 pattern_test_table[MV_DDR4_XSB_COMP_PATTERNS_NUM] = {
+		0xffffffff,
+		0xffffffff,
+		0x00000000,
+		0x00000000,
+		0x00000000,
+		0x00000000,
+		0xffffffff,
+		0xffffffff};	TODO: use pattern_table_get_word */
+	int i, status;
+	uint64_t data64;
+	uintptr_t addr64;
+	int ecc_running = 0;
+	u32 ecc_read_subphy_num = 0; /* FIXME: change ecc read subphy num to be configurable */
+	u8 bit_counter = 0;
+	int edge = 0;
+	/* write and read data */
+	if (MV_DDR_IS_64BIT_DRAM_MODE(tm->bus_act_mask)) {
+		status = ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ODPG_DATA_CTRL_REG,
+					   effective_cs << ODPG_DATA_CS_OFFS,
+					   ODPG_DATA_CS_MASK << ODPG_DATA_CS_OFFS);
+		if (status != MV_OK)
+			return status;
+
+		addr64 = (uintptr_t)pattern_table[PATTERN_TEST].start_addr;
+		/*
+		 * FIXME: the pattern is now loaded to memory through the odpg;
+		 * this change still needs to be validated.
+		 * it was made because dm is not yet calibrated at this stage.
+		 * the code below is the original code that loaded the pattern
+		 * directly to memory:
+		 *
+		 * for (i = 0; i < MV_DDR4_XSB_COMP_PATTERNS_NUM; i++) {
+		 *	data64 = pattern_test_table_64[i];
+		 *	writeq(addr64, data64);
+		 *	addr64 +=  sizeof(uint64_t);
+		 *}
+		 * FIXME: the code below loads the pattern to memory through the odpg;
+		 * it loads it twice due to a supplementary failure; need to check it
+		 */
+		int j;
+		for (j = 0; j < 2; j++)
+			ddr3_tip_load_pattern_to_mem(dev_num, PATTERN_TEST);
+
+	} else if (MV_DDR_IS_32BIT_IN_64BIT_DRAM_MODE(tm->bus_act_mask, subphy_max)) {
+		status = ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ODPG_DATA_CTRL_REG,
+					       effective_cs << ODPG_DATA_CS_OFFS,
+					       ODPG_DATA_CS_MASK << ODPG_DATA_CS_OFFS);
+		if (status != MV_OK)
+			return status;
+
+		/*
+		 * FIXME: the pattern is now loaded to memory through the odpg;
+		 * this change still needs to be validated.
+		 * it was made because dm is not yet calibrated at this stage;
+		 * previously, the pattern was loaded directly to memory by
+		 * the cpu
+		 */
+		int j;
+		for (j = 0; j < 2; j++)
+			ddr3_tip_load_pattern_to_mem(dev_num, PATTERN_TEST);
+	} else {
+		/*
+		 * FIXME: the pattern is now loaded to memory through the odpg;
+		 * this change still needs to be validated.
+		 * it was made because dm is not yet calibrated at this stage;
+		 * previously, the pattern was loaded directly to memory by
+		 * the cpu
+		 */
+		int j;
+		for (j = 0; j < 2; j++)
+			ddr3_tip_load_pattern_to_mem(dev_num, PATTERN_TEST);
+	}
+
+	if ((ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP4) ||
+	    (ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP3 ||
+	     ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP8)) {
+		/* disable ecc write mux */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+					   TRAINING_SW_2_REG, 0x0, 0x100);
+		if (status != MV_OK)
+			return status;
+
+		/* enable read data ecc mux */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+					   TRAINING_SW_2_REG, 0x3, 0x3);
+		if (status != MV_OK)
+			return status;
+
+		/* unset training start bit */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+					   TRAINING_REG, 0x80000000, 0x80000000);
+		if (status != MV_OK)
+			return status;
+
+		ecc_running = 1;
+		ecc_read_subphy_num = ECC_READ_BUS_0;
+	}
+
+	if (MV_DDR_IS_64BIT_DRAM_MODE(tm->bus_act_mask)) {
+		status = ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ODPG_DATA_CTRL_REG,
+					       effective_cs << ODPG_DATA_CS_OFFS,
+					       ODPG_DATA_CS_MASK << ODPG_DATA_CS_OFFS);
+		if (status != MV_OK)
+			return status;
+		/*
+		 * when reading the pattern, read it from the address x 8:
+		 * the odpg multiplies the address to read from by 8
+		 */
+		addr64 = ((uintptr_t)pattern_table[PATTERN_TEST].start_addr) << 3;
+		for (i = 0; i < MV_DDR4_XSB_COMP_PATTERNS_NUM; i++) {
+			data64 = readq(addr64);
+			addr64 +=  sizeof(uint64_t);
+			read_pattern_64[i] = data64;
+		}
+
+		DEBUG_LEVELING(DEBUG_LEVEL_INFO, ("xsb comp: if %d bus id %d\n", 0, subphy_num));
+		for (edge = 0; edge < 8; edge++)
+			DEBUG_LEVELING(DEBUG_LEVEL_INFO, ("0x%16llx\n", (unsigned long long)read_pattern_64[edge]));
+		DEBUG_LEVELING(DEBUG_LEVEL_INFO, ("\n"));
+	} else if (MV_DDR_IS_32BIT_IN_64BIT_DRAM_MODE(tm->bus_act_mask, subphy_max)) {
+		status = ddr3_tip_if_write(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, ODPG_DATA_CTRL_REG,
+					       effective_cs << ODPG_DATA_CS_OFFS,
+					       ODPG_DATA_CS_MASK << ODPG_DATA_CS_OFFS);
+		if (status != MV_OK)
+			return status;
+
+		status = ddr3_tip_ext_read(dev_num, if_id, pattern_table[PATTERN_TEST].start_addr << 3,
+					   1, read_pattern);
+		if (status != MV_OK)
+			return status;
+
+		DEBUG_LEVELING(DEBUG_LEVEL_INFO, ("xsb comp: if %d bus id %d\n", 0, subphy_num));
+		for (edge = 0; edge < 8; edge++)
+			DEBUG_LEVELING(DEBUG_LEVEL_INFO, ("0x%16x\n", read_pattern[edge]));
+		DEBUG_LEVELING(DEBUG_LEVEL_INFO, ("\n"));
+	} else {
+		status = ddr3_tip_ext_read(dev_num, if_id, ((pattern_table[PATTERN_TEST].start_addr << 3) +
+					    ((SDRAM_CS_SIZE + 1) * effective_cs)), 1, read_pattern);
+		if (status != MV_OK)
+			return status;
+
+		DEBUG_LEVELING(DEBUG_LEVEL_INFO, ("xsb comp: if %d bus id %d\n", 0, subphy_num));
+		for (edge = 0; edge < 8; edge++)
+			DEBUG_LEVELING(DEBUG_LEVEL_INFO, ("0x%16x\n", read_pattern[edge]));
+		DEBUG_LEVELING(DEBUG_LEVEL_INFO, ("\n"));
+	}
+
+	/* read centralization result to decide on half phase by inverse bit */
+	status = ddr3_tip_bus_read(dev_num, if_id, ACCESS_TYPE_UNICAST, subphy_num, DDR_PHY_DATA,
+				   CTX_PHY_REG(0), &wl_invert);
+	if (status != MV_OK)
+		return status;
+
+	if ((wl_invert & 0x20) != 0)
+		wl_invert = 1;
+	else
+		wl_invert = 0;
+
+	/* for ecc, read from the "read" subphy (usually subphy 0) */
+	if (ecc_running)
+		subphy_num = ecc_read_subphy_num;
+
+	/* per bit loop*/
+	bit_max = subphy_num * BUS_WIDTH_IN_BITS + BUS_WIDTH_IN_BITS;
+	for (bit = subphy_num * BUS_WIDTH_IN_BITS; bit < bit_max; bit++) {
+		/* get per bit pattern key (value of the same bit in the pattern) */
+		pb_key = 0;
+		for (word = 0; word < MV_DDR4_XSB_COMP_PATTERNS_NUM; word++) {
+			if (MV_DDR_IS_64BIT_DRAM_MODE(tm->bus_act_mask)) {
+				if ((read_pattern_64[word] & ((uint64_t)1 << bit)) != 0)
+					pb_key |= (1 << word);
+			} else {
+				if ((read_pattern[word] & (1 << bit)) != 0)
+					pb_key |= (1 << word);
+			}
+		}
+
+		/* find the key value and make decision */
+		switch (pb_key) {
+		/* case(s) for 0 */
+		case 0b11000011:	/* nominal */
+		case 0b10000011:	/* sample at start of UI sample at the dqvref TH */
+		case 0b10000111:	/* sample at start of UI sample at the dqvref TH */
+		case 0b11000001:	/* sample at start of UI sample at the dqvref TH */
+		case 0b11100001:	/* sample at start of UI sample at the dqvref TH */
+		case 0b11100011:	/* sample at start of UI sample at the dqvref TH */
+		case 0b11000111:	/* sample at start of UI sample at the dqvref TH */
+			bit_counter++;
+			break;
+		} /* end of switch */
+	} /* end of per bit loop */
+
+	/* check all bits in the current subphy has met the switch condition above */
+	if (bit_counter == BUS_WIDTH_IN_BITS)
+		return MV_DDR4_COMP_TEST_RESULT_0;
+	else {
+		DEBUG_LEVELING(
+			       DEBUG_LEVEL_INFO,
+			       ("different supplementary results (%d -> %d)\n",
+			       MV_DDR4_COMP_TEST_NO_RESULT, MV_DDR4_COMP_TEST_RESULT_0));
+		return MV_DDR4_COMP_TEST_NO_RESULT;
+	}
+}
+
+int mv_ddr4_dynamic_wl_supp(u32 dev_num)
+{
+	int status = MV_OK;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+
+	if (DDR3_IS_ECC_PUP4_MODE(tm->bus_act_mask) ||
+	    DDR3_IS_ECC_PUP3_MODE(tm->bus_act_mask) ||
+	    DDR3_IS_ECC_PUP8_MODE(tm->bus_act_mask)) {
+		if (DDR3_IS_ECC_PUP4_MODE(tm->bus_act_mask))
+			status = mv_ddr4_dynamic_pb_wl_supp(dev_num, WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP4);
+		else if (DDR3_IS_ECC_PUP3_MODE(tm->bus_act_mask))
+			status = mv_ddr4_dynamic_pb_wl_supp(dev_num, WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP3);
+		else /* WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP8 */
+			status = mv_ddr4_dynamic_pb_wl_supp(dev_num, WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP8);
+		if (status != MV_OK)
+			return status;
+		status = mv_ddr4_dynamic_pb_wl_supp(dev_num, WRITE_LEVELING_SUPP_ECC_MODE_DATA_PUPS);
+	} else { /* regular supplementary for data subphys in non-ecc mode */
+		status = mv_ddr4_dynamic_pb_wl_supp(dev_num, WRITE_LEVELING_SUPP_REG_MODE);
+	}
+
+	return status;
+}
+
+/* dynamic per bit write leveling supplementary */
+static int mv_ddr4_dynamic_pb_wl_supp(u32 dev_num, enum mv_wl_supp_mode ecc_mode)
+{
+	u32 if_id;
+	u32 subphy_start, subphy_end;
+	u32 subphy_num = ddr3_tip_dev_attr_get(dev_num, MV_ATTR_OCTET_PER_INTERFACE);
+	u8 compare_result = 0;
+	u32 orig_phase;
+	u32 rd_data, wr_data = 0;
+	u32 flag, step;
+	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+	u32 ecc_phy_access_id;
+	int status;
+
+	if (ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP4 ||
+	    ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP3 ||
+	    ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP8) {
+		/* enable ecc write mux */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+					   TRAINING_SW_2_REG, 0x100, 0x100);
+		if (status != MV_OK)
+			return status;
+
+		/* disable read data ecc mux */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+					   TRAINING_SW_2_REG, 0x0, 0x3);
+		if (status != MV_OK)
+			return status;
+
+		/* unset training start bit */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+					   TRAINING_REG, 0x0, 0x80000000);
+		if (status != MV_OK)
+			return status;
+
+		if (ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP3)
+			ecc_phy_access_id = ECC_PHY_ACCESS_3;
+		else if (ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP4)
+			ecc_phy_access_id = ECC_PHY_ACCESS_4;
+		else /* ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP8 */
+			ecc_phy_access_id = ECC_PHY_ACCESS_8;
+
+		subphy_start = ecc_phy_access_id;
+		subphy_end = subphy_start + 1;
+	} else if (ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_DATA_PUPS) {
+		/* disable ecc write mux */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+					   TRAINING_SW_2_REG, 0x0, 0x100);
+		if (status != MV_OK)
+			return status;
+
+		/* disable ecc mode*/
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+					   SDRAM_CFG_REG, 0, 0x40000);
+		if (status != MV_OK)
+			return status;
+
+		subphy_start = 0;
+		if (MV_DDR_IS_HALF_BUS_DRAM_MODE(tm->bus_act_mask, subphy_num))
+			subphy_end = (subphy_num - 1) / 2;
+		else
+			subphy_end = subphy_num - 1;
+	} else { /* ecc_mode == WRITE_LEVELING_SUPP_REG_MODE */
+		subphy_start = 0;
+		/* remove ecc subphy prior to algorithm's start */
+		subphy_end = subphy_num - 1; /* TODO: check it */
+	}
+
+	for (if_id = 0; if_id < MAX_INTERFACE_NUM; if_id++) {
+		VALIDATE_IF_ACTIVE(tm->if_act_mask, if_id);
+		for (subphy_num = subphy_start; subphy_num < subphy_end; subphy_num++) {
+			VALIDATE_BUS_ACTIVE(tm->bus_act_mask, subphy_num);
+			flag = 1;
+			step = 0;
+			status = ddr3_tip_bus_read(dev_num, if_id, ACCESS_TYPE_UNICAST, subphy_num, DDR_PHY_DATA,
+						   WL_PHY_REG(effective_cs), &rd_data);
+			if (status != MV_OK)
+				return status;
+			orig_phase = (rd_data >> 6) & 0x7;
+			while (flag != 0) {
+				/* get decision for subphy */
+				compare_result = mv_ddr4_xsb_comp_test(dev_num, subphy_num, if_id, ecc_mode);
+				if (compare_result == MV_DDR4_COMP_TEST_RESULT_0) {
+					flag = 0;
+				} else { /* shift phase to -1 */
+					step++;
+					if (step == 1) { /* set phase (0x0[6-8]) to -2 */
+						if (orig_phase > 1)
+							wr_data = (rd_data & ~0x1c0) | ((orig_phase - 2) << 6);
+						else if (orig_phase == 1)
+							wr_data = (rd_data & ~0x1df);
+						if (orig_phase >= 1)
+							ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, if_id,
+									   ACCESS_TYPE_UNICAST, subphy_num,
+									   DDR_PHY_DATA,
+									   WL_PHY_REG(effective_cs), wr_data);
+					} else if (step == 2) { /* shift phase to +1 */
+						if (orig_phase <= 5) {
+							wr_data = (rd_data & ~0x1c0) | ((orig_phase + 2) << 6);
+							ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, if_id,
+									   ACCESS_TYPE_UNICAST, subphy_num,
+									   DDR_PHY_DATA,
+									   WL_PHY_REG(effective_cs), wr_data);
+						}
+					} else if (step == 3) {
+						if (orig_phase <= 3) {
+							wr_data = (rd_data & ~0x1c0) | ((orig_phase + 4) << 6);
+							ddr3_tip_bus_write(dev_num, ACCESS_TYPE_UNICAST, if_id,
+									   ACCESS_TYPE_UNICAST, subphy_num,
+									   DDR_PHY_DATA,
+									   WL_PHY_REG(effective_cs), wr_data);
+						}
+					} else { /* error */
+						flag = 0;
+						compare_result = MV_DDR4_COMP_TEST_NO_RESULT;
+						training_result[training_stage][if_id] = TEST_FAILED;
+					}
+				}
+			}
+		}
+		if ((training_result[training_stage][if_id] == NO_TEST_DONE) ||
+		    (training_result[training_stage][if_id] == TEST_SUCCESS))
+			training_result[training_stage][if_id] = TEST_SUCCESS;
+	}
+
+	if (ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_DATA_PUPS) {
+		/* enable ecc write mux */
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+					   TRAINING_SW_2_REG, 0x100, 0x100);
+		if (status != MV_OK)
+			return status;
+
+		/* enable ecc mode*/
+		status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+					   SDRAM_CFG_REG, 0x40000, 0x40000);
+		if (status != MV_OK)
+			return status;
+	} else if (ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP4 ||
+		   ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP3 ||
+		   ecc_mode == WRITE_LEVELING_SUPP_ECC_MODE_ECC_PUP8) {
+			/* enable ecc write mux */
+			status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+						   TRAINING_SW_2_REG, 0x100, 0x100);
+			if (status != MV_OK)
+				return status;
+
+			/* disable read data ecc mux */
+			status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+						   TRAINING_SW_2_REG, 0x0, 0x3);
+			if (status != MV_OK)
+				return status;
+
+			/* unset training start bit */
+			status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+						   TRAINING_REG, 0x0, 0x80000000);
+			if (status != MV_OK)
+				return status;
+
+			status = ddr3_tip_if_write(dev_num, ACCESS_TYPE_UNICAST, PARAM_NOT_CARE,
+						   TRAINING_SW_1_REG, 0x1 << 16, 0x1 << 16);
+			if (status != MV_OK)
+				return status;
+	} else {
+		/* do nothing for WRITE_LEVELING_SUPP_REG_MODE */;
+	}
+	if (training_result[training_stage][0] == TEST_SUCCESS)
+		return MV_OK;
+	else
+		return MV_FAIL;
+}
+#endif /* CONFIG_DDR4 */
diff --git a/drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.h b/drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.h
new file mode 100644
index 0000000000..4067cac968
--- /dev/null
+++ b/drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) Marvell International Ltd. and its affiliates
+ */
+
+#ifndef _MV_DDR4_TRAINING_LEVELING_H
+#define _MV_DDR4_TRAINING_LEVELING_H
+
+int mv_ddr4_dynamic_wl_supp(u32 dev_num);
+
+#endif /* _MV_DDR4_TRAINING_LEVELING_H */
diff --git a/drivers/ddr/marvell/a38x/mv_ddr_plat.c b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
index 7c7bce73a3..16d177b42f 100644
--- a/drivers/ddr/marvell/a38x/mv_ddr_plat.c
+++ b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
@@ -12,6 +12,11 @@
 #define DDR_INTERFACES_NUM		1
 #define DDR_INTERFACE_OCTETS_NUM	5
 
+/* These were defined in ATF area that was stripped out */
+#define MV_STATUS	int
+#define MV_U32		u32
+#define MV_U8		u8
+
 /*
  * 1. L2 filter should be set at binary header to 0xD000000,
  *    to avoid conflict with internal register IO.
@@ -38,6 +43,24 @@
 #define TSEN_STATUS_TEMP_OUT_OFFSET	0
 #define TSEN_STATUS_TEMP_OUT_MASK	(0x3ff << TSEN_STATUS_TEMP_OUT_OFFSET)
 
+#if defined(CONFIG_DDR4)
+static struct dlb_config ddr3_dlb_config_table[] = {
+	{DLB_CTRL_REG, 0x2000005f},
+	{DLB_BUS_OPT_WT_REG, 0x00880000},
+	{DLB_AGING_REG, 0x3f7f007f},
+	{DLB_EVICTION_CTRL_REG, 0x0000129f},
+	{DLB_EVICTION_TIMERS_REG, 0x00ff0000},
+	{DLB_WTS_DIFF_CS_REG, 0x04030803},
+	{DLB_WTS_DIFF_BG_REG, 0x00000A02},
+	{DLB_WTS_SAME_BG_REG, 0x08000901},
+	{DLB_WTS_CMDS_REG,  0x00020005},
+	{DLB_WTS_ATTR_PRIO_REG, 0x00060f10},
+	{DLB_QUEUE_MAP_REG, 0x00000543},
+	{DLB_SPLIT_REG, 0x0000000f},
+	{DLB_USER_CMD_REG, 0x00000000},
+	{0x0, 0x0}
+};
+#else /* !CONFIG_DDR4 */
 static struct dlb_config ddr3_dlb_config_table[] = {
 	{DLB_CTRL_REG, 0x2000005c},
 	{DLB_BUS_OPT_WT_REG, 0x00880000},
@@ -54,6 +77,7 @@ static struct dlb_config ddr3_dlb_config_table[] = {
 	{DLB_USER_CMD_REG, 0x00000000},
 	{0x0, 0x0}
 };
+#endif /* CONFIG_DDR4 */
 
 static struct dlb_config *sys_env_dlb_config_ptr_get(void)
 {
@@ -62,12 +86,18 @@ static struct dlb_config *sys_env_dlb_config_ptr_get(void)
 
 static u8 a38x_bw_per_freq[MV_DDR_FREQ_LAST] = {
 	0x3,			/* MV_DDR_FREQ_100 */
+#if !defined(CONFIG_DDR4)
 	0x4,			/* MV_DDR_FREQ_400 */
 	0x4,			/* MV_DDR_FREQ_533 */
+#endif /* CONFIG_DDR4 */
 	0x5,			/* MV_DDR_FREQ_667 */
 	0x5,			/* MV_DDR_FREQ_800 */
 	0x5,			/* MV_DDR_FREQ_933 */
 	0x5,			/* MV_DDR_FREQ_1066 */
+#if defined(CONFIG_DDR4)
+	0x5,			/*MV_DDR_FREQ_900*/
+	0x5,			/*MV_DDR_FREQ_1000*/
+#else /* CONFIG_DDR4 */
 	0x3,			/* MV_DDR_FREQ_311 */
 	0x3,			/* MV_DDR_FREQ_333 */
 	0x4,			/* MV_DDR_FREQ_467 */
@@ -77,16 +107,23 @@ static u8 a38x_bw_per_freq[MV_DDR_FREQ_LAST] = {
 	0x5,			/* MV_DDR_FREQ_900 */
 	0x3,			/* MV_DDR_FREQ_360 */
 	0x5			/* MV_DDR_FREQ_1000 */
+#endif /* CONFIG_DDR4 */
 };
 
 static u8 a38x_rate_per_freq[MV_DDR_FREQ_LAST] = {
 	0x1,			/* MV_DDR_FREQ_100 */
+#if !defined(CONFIG_DDR4)
 	0x2,			/* MV_DDR_FREQ_400 */
 	0x2,			/* MV_DDR_FREQ_533 */
+#endif /* CONFIG_DDR4 */
 	0x2,			/* MV_DDR_FREQ_667 */
 	0x2,			/* MV_DDR_FREQ_800 */
 	0x3,			/* MV_DDR_FREQ_933 */
 	0x3,			/* MV_DDR_FREQ_1066 */
+#ifdef CONFIG_DDR4
+	0x2,			/*MV_DDR_FREQ_900*/
+	0x2,			/*MV_DDR_FREQ_1000*/
+#else /* CONFIG_DDR4 */
 	0x1,			/* MV_DDR_FREQ_311 */
 	0x1,			/* MV_DDR_FREQ_333 */
 	0x2,			/* MV_DDR_FREQ_467 */
@@ -96,6 +133,7 @@ static u8 a38x_rate_per_freq[MV_DDR_FREQ_LAST] = {
 	0x2,			/* MV_DDR_FREQ_900 */
 	0x1,			/* MV_DDR_FREQ_360 */
 	0x2			/* MV_DDR_FREQ_1000 */
+#endif /* CONFIG_DDR4 */
 };
 
 static u16 a38x_vco_freq_per_sar_ref_clk_25_mhz[] = {
@@ -166,6 +204,54 @@ static u16 a38x_vco_freq_per_sar_ref_clk_40_mhz[] = {
 	1800			/* 30 - 0x1E */
 };
 
+#if defined(CONFIG_DDR4)
+u16 odt_slope[] = {
+	21443,
+	1452,
+	482,
+	240,
+	141,
+	90,
+	67,
+	52
+};
+
+u16 odt_intercept[] = {
+	1517,
+	328,
+	186,
+	131,
+	100,
+	80,
+	69,
+	61
+};
+
+/* Map of scratch PHY registers used to store stability value */
+u32 dmin_phy_reg_table[MAX_BUS_NUM * MAX_CS_NUM][2] = {
+	/* subphy, addr */
+	{0, 0xc0},	/* cs 0, subphy 0 */
+	{0, 0xc1},	/* cs 0, subphy 1 */
+	{0, 0xc2},	/* cs 0, subphy 2 */
+	{0, 0xc3},	/* cs 0, subphy 3 */
+	{0, 0xc4},	/* cs 0, subphy 4 */
+	{1, 0xc0},	/* cs 1, subphy 0 */
+	{1, 0xc1},	/* cs 1, subphy 1 */
+	{1, 0xc2},	/* cs 1, subphy 2 */
+	{1, 0xc3},	/* cs 1, subphy 3 */
+	{1, 0xc4},	/* cs 1, subphy 4 */
+	{2, 0xc0},	/* cs 2, subphy 0 */
+	{2, 0xc1},	/* cs 2, subphy 1 */
+	{2, 0xc2},	/* cs 2, subphy 2 */
+	{2, 0xc3},	/* cs 2, subphy 3 */
+	{2, 0xc4},	/* cs 2, subphy 4 */
+	{0, 0xc5},	/* cs 3, subphy 0 */
+	{1, 0xc5},	/* cs 3, subphy 1 */
+	{2, 0xc5},	/* cs 3, subphy 2 */
+	{0, 0xc6},	/* cs 3, subphy 3 */
+	{1, 0xc6}	/* cs 3, subphy 4 */
+};
+#endif /* CONFIG_DDR4 */
 
 static u32 dq_bit_map_2_phy_pin[] = {
 	1, 0, 2, 6, 9, 8, 3, 7,	/* 0 */
@@ -397,6 +483,7 @@ static int mv_ddr_sar_freq_get(int dev_num, enum mv_ddr_freq *freq)
 	if (((ref_clk_satr >> DEVICE_SAMPLE_AT_RESET2_REG_REFCLK_OFFSET) & 0x1) ==
 	    DEVICE_SAMPLE_AT_RESET2_REG_REFCLK_25MHZ) {
 		switch (reg) {
+#if !defined(CONFIG_DDR4)
 		case 0x1:
 			DEBUG_TRAINING_ACCESS(DEBUG_LEVEL_ERROR,
 					      ("Warning: Unsupported freq mode for 333Mhz configured(%d)\n",
@@ -424,6 +511,7 @@ static int mv_ddr_sar_freq_get(int dev_num, enum mv_ddr_freq *freq)
 		case 0x6:
 			*freq = MV_DDR_FREQ_600;
 			break;
+#endif /* CONFIG_DDR4 */
 		case 0x11:
 		case 0x14:
 			DEBUG_TRAINING_ACCESS(DEBUG_LEVEL_ERROR,
@@ -448,21 +536,32 @@ static int mv_ddr_sar_freq_get(int dev_num, enum mv_ddr_freq *freq)
 		case 0x12:
 			*freq = MV_DDR_FREQ_900;
 			break;
+#if defined(CONFIG_DDR4)
+		case 0x13:
+			*freq = MV_DDR_FREQ_1000;
+			DEBUG_TRAINING_ACCESS(DEBUG_LEVEL_ERROR,
+					      ("Warning: Unsupported freq mode for 1000Mhz configured(%d)\n",
+					      reg));
+			break;
+#else /* CONFIG_DDR4 */
 		case 0x13:
 			*freq = MV_DDR_FREQ_933;
 			break;
+#endif /* CONFIG_DDR4 */
 		default:
 			*freq = 0;
 			return MV_NOT_SUPPORTED;
 		}
 	} else { /* REFCLK 40MHz case */
 		switch (reg) {
+#if !defined(CONFIG_DDR4)
 		case 0x3:
 			*freq = MV_DDR_FREQ_400;
 			break;
 		case 0x5:
 			*freq = MV_DDR_FREQ_533;
 			break;
+#endif /* CONFIG_DDR4 */
 		case 0xb:
 			*freq = MV_DDR_FREQ_800;
 			break;
@@ -478,6 +577,7 @@ static int mv_ddr_sar_freq_get(int dev_num, enum mv_ddr_freq *freq)
 	return MV_OK;
 }
 
+#if !defined(CONFIG_DDR4)
 static int ddr3_tip_a38x_get_medium_freq(int dev_num, enum mv_ddr_freq *freq)
 {
 	u32 reg, ref_clk_satr;
@@ -554,6 +654,7 @@ static int ddr3_tip_a38x_get_medium_freq(int dev_num, enum mv_ddr_freq *freq)
 
 	return MV_OK;
 }
+#endif /* CONFIG_DDR4 */
 
 static int ddr3_tip_a38x_get_device_info(u8 dev_num, struct ddr3_device_info *info_ptr)
 {
@@ -667,7 +768,9 @@ static int mv_ddr_sw_db_init(u32 dev_num, u32 board_id)
 	dfs_low_freq = DFS_LOW_FREQ_VALUE;
 	calibration_update_control = 1;
 
+#if !defined(CONFIG_DDR4)
 	ddr3_tip_a38x_get_medium_freq(dev_num, &medium_freq);
+#endif /* CONFIG_DDR4 */
 
 	return MV_OK;
 }
@@ -675,6 +778,29 @@ static int mv_ddr_sw_db_init(u32 dev_num, u32 board_id)
 static int mv_ddr_training_mask_set(void)
 {
 	struct mv_ddr_topology_map *tm = mv_ddr_topology_map_get();
+#if defined(CONFIG_DDR4)
+	mask_tune_func = (SET_LOW_FREQ_MASK_BIT |
+			  LOAD_PATTERN_MASK_BIT |
+			  SET_TARGET_FREQ_MASK_BIT |
+			  WRITE_LEVELING_TF_MASK_BIT |
+			  READ_LEVELING_TF_MASK_BIT |
+			  RECEIVER_CALIBRATION_MASK_BIT |
+			  WL_PHASE_CORRECTION_MASK_BIT |
+			  DQ_VREF_CALIBRATION_MASK_BIT);
+	/* Temporarily disable the DQ_MAPPING stage */
+	/*		  DQ_MAPPING_MASK_BIT */
+	rl_mid_freq_wa = 0;
+
+	/* In case A382, Vref calibration workaround isn't required */
+	if (((reg_read(DEV_ID_REG) & 0xFFFF0000) >> 16) == 0x6811) {
+		printf("vref_calibration_wa is disabled\n");
+		vref_calibration_wa = 0;
+	}
+
+	if (DDR3_IS_16BIT_DRAM_MODE(tm->bus_act_mask) == 1)
+		mask_tune_func &= ~WL_PHASE_CORRECTION_MASK_BIT;
+
+#else /* CONFIG_DDR4 */
 	enum mv_ddr_freq ddr_freq = tm->interface_params[0].memory_freq;
 
 	mask_tune_func = (SET_LOW_FREQ_MASK_BIT |
@@ -711,6 +837,7 @@ static int mv_ddr_training_mask_set(void)
 		mask_tune_func &= ~PBS_TX_MASK_BIT;
 		mask_tune_func &= ~PBS_RX_MASK_BIT;
 	}
+#endif /* CONFIG_DDR4 */
 
 	return MV_OK;
 }
@@ -767,6 +894,7 @@ static int ddr3_tip_a38x_set_divider(u8 dev_num, u32 if_id,
 
 		/* Set KNL values */
 		switch (frequency) {
+#ifndef CONFIG_DDR4 /* CONFIG_DDR3 */
 		case MV_DDR_FREQ_467:
 			async_val = 0x806f012;
 			break;
@@ -776,15 +904,18 @@ static int ddr3_tip_a38x_set_divider(u8 dev_num, u32 if_id,
 		case MV_DDR_FREQ_600:
 			async_val = 0x805f00a;
 			break;
+#endif
 		case MV_DDR_FREQ_667:
 			async_val = 0x809f012;
 			break;
 		case MV_DDR_FREQ_800:
 			async_val = 0x807f00a;
 			break;
+#ifndef CONFIG_DDR4 /* CONFIG_DDR3 */
 		case MV_DDR_FREQ_850:
 			async_val = 0x80cb012;
 			break;
+#endif
 		case MV_DDR_FREQ_900:
 			async_val = 0x80d7012;
 			break;
@@ -1293,6 +1424,12 @@ static int ddr3_new_tip_dlb_config(void)
 		i++;
 	}
 
+#if defined(CONFIG_DDR4)
+	reg = reg_read(DUNIT_CTRL_HIGH_REG);
+	reg &= ~(CPU_INTERJECTION_ENA_MASK << CPU_INTERJECTION_ENA_OFFS);
+	reg |= CPU_INTERJECTION_ENA_SPLIT_DIS << CPU_INTERJECTION_ENA_OFFS;
+	reg_write(DUNIT_CTRL_HIGH_REG, reg);
+#endif /* CONFIG_DDR4 */
 
 	/* Enable DLB */
 	reg = reg_read(DLB_CTRL_REG);
@@ -1432,10 +1569,122 @@ int ddr3_tip_configure_phy(u32 dev_num)
 		ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE,
 		DDR_PHY_DATA, 0x90, 0x6002));
 
+#if defined(CONFIG_DDR4)
+	mv_ddr4_phy_config(dev_num);
+#endif /* CONFIG_DDR4 */
 
 	return MV_OK;
 }
 
+#if defined(CONFIG_DDR4)
+/* function: ddr4TipCalibrationValidate
+ * this function validates the calibration values
+ * the function is per soc due to the different processes the calibration values are different
+ */
+MV_STATUS mv_ddr4_calibration_validate(MV_U32 dev_num)
+{
+	MV_STATUS status = MV_OK;
+	MV_U8 if_id = 0;
+	MV_U32 read_data[MAX_INTERFACE_NUM];
+	MV_U32 cal_n = 0, cal_p = 0;
+
+	/*
+	 * Pad calibration control enable: during training set the calibration to be internal
+	 * at the end of the training it should be fixed to external to be configured by the mc6
+	 * FIXME: set the calibration to external in the end of the training
+	 */
+
+	/* pad calibration control enable */
+	CHECK_STATUS(ddr3_tip_if_write
+			(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, MAIN_PADS_CAL_MACH_CTRL_REG,
+			DYN_PADS_CAL_ENABLE_ENA << DYN_PADS_CAL_ENABLE_OFFS |
+			CAL_UPDATE_CTRL_INT << CAL_UPDATE_CTRL_OFFS,
+			DYN_PADS_CAL_ENABLE_MASK << DYN_PADS_CAL_ENABLE_OFFS |
+			CAL_UPDATE_CTRL_MASK << CAL_UPDATE_CTRL_OFFS));
+
+	/* Polling initial calibration is done*/
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id,
+				CAL_MACH_RDY << CAL_MACH_STATUS_OFFS,
+				CAL_MACH_STATUS_MASK << CAL_MACH_STATUS_OFFS,
+				MAIN_PADS_CAL_MACH_CTRL_REG, MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("ddr4TipCalibrationAdjust: DDR4 calibration poll failed(0)\n"));
+
+	/* Polling that calibration propagate to io */
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x3FFFFFF, 0x3FFFFFF, PHY_LOCK_STATUS_REG,
+				MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("ddr4TipCalibrationAdjust: DDR4 calibration poll failed(1)\n"));
+
+	/* TODO - debug why polling not enough*/
+	mdelay(10);
+
+	/* pad calibration control disable */
+	CHECK_STATUS(ddr3_tip_if_write
+			(0, ACCESS_TYPE_MULTICAST, PARAM_NOT_CARE, MAIN_PADS_CAL_MACH_CTRL_REG,
+			DYN_PADS_CAL_ENABLE_DIS << DYN_PADS_CAL_ENABLE_OFFS |
+			CAL_UPDATE_CTRL_INT << CAL_UPDATE_CTRL_OFFS,
+			DYN_PADS_CAL_ENABLE_MASK << DYN_PADS_CAL_ENABLE_OFFS |
+			CAL_UPDATE_CTRL_MASK << CAL_UPDATE_CTRL_OFFS));
+
+	/* Polling initial calibration is done */
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id,
+				CAL_MACH_RDY << CAL_MACH_STATUS_OFFS,
+				CAL_MACH_STATUS_MASK << CAL_MACH_STATUS_OFFS,
+				MAIN_PADS_CAL_MACH_CTRL_REG, MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("ddr4TipCalibrationAdjust: DDR4 calibration poll failed(0)\n"));
+
+	/* Polling that calibration propagate to io */
+	if (ddr3_tip_if_polling(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x3FFFFFF, 0x3FFFFFF, PHY_LOCK_STATUS_REG,
+				MAX_POLLING_ITERATIONS) != MV_OK)
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR, ("ddr4TipCalibrationAdjust: DDR4 calibration poll failed(1)\n"));
+
+	/* TODO - debug why polling not enough */
+	mdelay(10);
+
+	/* Read Cal value and set to manual val */
+	CHECK_STATUS(ddr3_tip_if_read(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x1DC8, read_data, MASK_ALL_BITS));
+	cal_n = (read_data[if_id] & ((0x3F) << 10)) >> 10;
+	cal_p = (read_data[if_id] & ((0x3F) << 4)) >> 4;
+	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+			  ("ddr4TipCalibrationValidate::DDR4 SSTL calib val - Pcal = 0x%x , Ncal = 0x%x\n",
+			   cal_p, cal_n));
+	if ((cal_n >= 56) || (cal_n <= 6) || (cal_p >= 59) || (cal_p <= 7)) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("%s: Error:DDR4 SSTL calib val - Pcal = 0x%x, Ncal = 0x%x are out of range\n",
+				  __func__, cal_p, cal_n));
+		status = MV_FAIL;
+	}
+
+	/* 14C8 - Vertical */
+	CHECK_STATUS(ddr3_tip_if_read(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x14C8, read_data, MASK_ALL_BITS));
+	cal_n = (read_data[if_id] & ((0x3F) << 10)) >> 10;
+	cal_p = (read_data[if_id] & ((0x3F) << 4)) >> 4;
+	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+			  ("ddr4TipCalibrationValidate::DDR4 POD-V calib val - Pcal = 0x%x , Ncal = 0x%x\n",
+			  cal_p, cal_n));
+	if ((cal_n >= 56) || (cal_n <= 6) || (cal_p >= 59) || (cal_p <= 7)) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("%s: Error:DDR4 POD-V calib val - Pcal = 0x%x , Ncal= 0x%x are out of range\n",
+				  __func__, cal_p, cal_n));
+		status = MV_FAIL;
+	}
+
+	/* 17C8 - Horizontal */
+	CHECK_STATUS(ddr3_tip_if_read(dev_num, ACCESS_TYPE_UNICAST, if_id, 0x17C8, read_data, MASK_ALL_BITS));
+	cal_n = (read_data[if_id] & ((0x3F) << 10)) >> 10;
+	cal_p = (read_data[if_id] & ((0x3F) << 4)) >> 4;
+	DEBUG_TRAINING_IP(DEBUG_LEVEL_INFO,
+			  ("ddr4TipCalibrationValidate::DDR4 POD-H calib val - Pcal = 0x%x , Ncal = 0x%x\n",
+			  cal_p, cal_n));
+	if ((cal_n >= 56) || (cal_n <= 6) || (cal_p >= 59) || (cal_p <= 7)) {
+		DEBUG_TRAINING_IP(DEBUG_LEVEL_ERROR,
+				  ("%s: Error:DDR4 POD-H calib val - Pcal = 0x%x, Ncal = 0x%x are out of range\n",
+				  __func__, cal_p, cal_n));
+		status = MV_FAIL;
+	}
+
+	return status;
+}
+#endif /* CONFIG_DDR4 */
 
 int mv_ddr_manual_cal_do(void)
 {
diff --git a/drivers/ddr/marvell/a38x/mv_ddr_plat.h b/drivers/ddr/marvell/a38x/mv_ddr_plat.h
index 44998847c2..01894f652c 100644
--- a/drivers/ddr/marvell/a38x/mv_ddr_plat.h
+++ b/drivers/ddr/marvell/a38x/mv_ddr_plat.h
@@ -39,8 +39,19 @@
 #define TUNE_TRAINING_PARAMS_ODT_CONFIG_1CS	0x10000
 #define TUNE_TRAINING_PARAMS_RTT_NOM		0x44
 
+#if defined(CONFIG_DDR4)
+#define TUNE_TRAINING_PARAMS_P_ODT_DATA_DDR4	0x1A
+#define TUNE_TRAINING_PARAMS_DIC_DDR4		0x0
+#define TUNE_TRAINING_PARAMS_ODT_CONFIG_DDR4	0	/* 0x330012 */
+#define TUNE_TRAINING_PARAMS_RTT_NOM_DDR4	0	/* 0x400, RZQ/3 = 0x600 */
+#define TUNE_TRAINING_PARAMS_RTT_WR_1CS		0x200	/*RZQ/1 = 0x400*/
+#define TUNE_TRAINING_PARAMS_RTT_WR_2CS		0x200	/*RZQ/1 = 0x400*/
+#define TUNE_TRAINING_PARAMS_RTT_PARK_1CS	0
+#define TUNE_TRAINING_PARAMS_RTT_PARK_2CS	0
+#else /* CONFIG_DDR4 */
 #define TUNE_TRAINING_PARAMS_RTT_WR_1CS		0x0   /*off*/
 #define TUNE_TRAINING_PARAMS_RTT_WR_2CS		0x0   /*off*/
+#endif /* CONFIG_DDR4 */
 
 #define MARVELL_BOARD				MARVELL_BOARD_ID_BASE
 
diff --git a/drivers/ddr/marvell/a38x/mv_ddr_regs.h b/drivers/ddr/marvell/a38x/mv_ddr_regs.h
index cf2a6c92e8..a19000dbdd 100644
--- a/drivers/ddr/marvell/a38x/mv_ddr_regs.h
+++ b/drivers/ddr/marvell/a38x/mv_ddr_regs.h
@@ -373,6 +373,65 @@ enum {
 #define MRS2_CMD				0x8
 #define MRS3_CMD				0x9
 
+#if defined(CONFIG_DDR4)
+/* DDR4 MRS */
+#define MRS4_CMD				0x10
+#define MRS5_CMD				0x11
+#define MRS6_CMD				0x12
+
+/* DDR4 Registers */
+#define DDR4_MR0_REG				0x1900
+#define DDR4_MR1_REG				0x1904
+#define DDR4_MR2_REG				0x1908
+#define DDR4_MR3_REG				0x190c
+#define DDR4_MPR_PS_OFFS			0
+#define DDR4_MPR_PS_MASK			0x3
+enum mv_ddr_mpr_ps { /* DDR4 MPR Page Selection */
+	DDR4_MPR_PAGE0,
+	DDR4_MPR_PAGE1,
+	DDR4_MPR_PAGE2,
+	DDR4_MPR_PAGE3
+};
+#define DDR4_MPR_OP_OFFS			2
+#define DDR4_MPR_OP_MASK			0x1
+enum mv_ddr_mpr_op { /* DDR4 MPR Operation */
+	DDR4_MPR_OP_DIS, /* normal operation */
+	DDR4_MPR_OP_ENA  /* dataflow from mpr */
+};
+#define DDR4_MPR_RF_OFFS			11
+#define DDR4_MPR_RF_MASK			0x3
+enum mv_ddr_mpr_rd_frmt { /* DDR4 MPR Read Format */
+	DDR4_MPR_RF_SERIAL,
+	DDR4_MPR_RF_PARALLEL,
+	DDR4_MPR_RF_STAGGERED,
+	DDR4_MPR_RF_RSVD_TEMP
+
+};
+
+#define DDR4_MR4_REG				0x1910
+#define DDR4_RPT_OFFS				10
+#define DDR4_RPT_MASK				0x1
+enum { /* read preamble training mode */
+	DDR4_RPT_DIS,
+	DDR4_RPT_ENA
+};
+
+#define DDR4_MR5_REG				0x1914
+#define DDR4_MR6_REG				0x1918
+#define DDR4_MPR_WR_REG				0x19d0
+#define DDR4_MPR_LOC_OFFS			8
+#define DDR4_MPR_LOC_MASK			0x3
+/*
+ * MPR Location for MPR write and read accesses
+ * MPR Location 0..3 within the selected page (page selection in MR3 [1:0] bits)
+ */
+enum {
+	DDR4_MPR_LOC0,
+	DDR4_MPR_LOC1,
+	DDR4_MPR_LOC2,
+	DDR4_MPR_LOC3
+};
+#endif /* CONFIG_DDR4 */
 
 #define DRAM_PINS_MUX_REG			0x19d4
 #define CTRL_PINS_MUX_OFFS			0
diff --git a/drivers/ddr/marvell/a38x/mv_ddr_topology.h b/drivers/ddr/marvell/a38x/mv_ddr_topology.h
index 1cb69ad085..715c1468bc 100644
--- a/drivers/ddr/marvell/a38x/mv_ddr_topology.h
+++ b/drivers/ddr/marvell/a38x/mv_ddr_topology.h
@@ -8,6 +8,77 @@
 
 #define MAX_CS_NUM	4
 
+#if defined(CONFIG_DDR4)
+enum mv_ddr_speed_bin {
+	SPEED_BIN_DDR_1600J,
+	SPEED_BIN_DDR_1600K,
+	SPEED_BIN_DDR_1600L,
+	SPEED_BIN_DDR_1866L,
+	SPEED_BIN_DDR_1866M,
+	SPEED_BIN_DDR_1866N,
+	SPEED_BIN_DDR_2133N,
+	SPEED_BIN_DDR_2133P,
+	SPEED_BIN_DDR_2133R,
+	SPEED_BIN_DDR_2400P,
+	SPEED_BIN_DDR_2400R,
+	SPEED_BIN_DDR_2400T,
+	SPEED_BIN_DDR_2400U,
+	SPEED_BIN_DDR_2666T,
+	SPEED_BIN_DDR_2666U,
+	SPEED_BIN_DDR_2666V,
+	SPEED_BIN_DDR_2666W,
+	SPEED_BIN_DDR_2933V,
+	SPEED_BIN_DDR_2933W,
+	SPEED_BIN_DDR_2933Y,
+	SPEED_BIN_DDR_2933AA,
+	SPEED_BIN_DDR_3200W,
+	SPEED_BIN_DDR_3200AA,
+	SPEED_BIN_DDR_3200AC
+};
+
+enum mv_ddr_freq {
+	MV_DDR_FREQ_LOW_FREQ,
+	MV_DDR_FREQ_650,
+	MV_DDR_FREQ_667,
+	MV_DDR_FREQ_800,
+	MV_DDR_FREQ_933,
+	MV_DDR_FREQ_1066,
+	MV_DDR_FREQ_900,
+	MV_DDR_FREQ_1000,
+	MV_DDR_FREQ_1050,
+	MV_DDR_FREQ_1200,
+	MV_DDR_FREQ_1333,
+	MV_DDR_FREQ_1466,
+	MV_DDR_FREQ_1600,
+	MV_DDR_FREQ_LAST,
+	MV_DDR_FREQ_SAR
+};
+
+enum mv_ddr_speed_bin_timing {
+	SPEED_BIN_TRCD,
+	SPEED_BIN_TRP,
+	SPEED_BIN_TRAS,
+	SPEED_BIN_TRC,
+	SPEED_BIN_TRRD0_5K,
+	SPEED_BIN_TRRD1K,
+	SPEED_BIN_TRRD2K,
+	SPEED_BIN_TRRDL0_5K,
+	SPEED_BIN_TRRDL1K,
+	SPEED_BIN_TRRDL2K,
+	SPEED_BIN_TPD,
+	SPEED_BIN_TFAW0_5K,
+	SPEED_BIN_TFAW1K,
+	SPEED_BIN_TFAW2K,
+	SPEED_BIN_TWTR,
+	SPEED_BIN_TWTRL,
+	SPEED_BIN_TRTP,
+	SPEED_BIN_TWR,
+	SPEED_BIN_TMOD,
+	SPEED_BIN_TXPDLL,
+	SPEED_BIN_TXSDLL,
+	SPEED_BIN_TCCDL
+};
+#else /* CONFIG_DDR3 */
 enum mv_ddr_speed_bin {
 	SPEED_BIN_DDR_800D,
 	SPEED_BIN_DDR_800E,
@@ -74,6 +145,7 @@ enum mv_ddr_speed_bin_timing {
 	SPEED_BIN_TXPDLL,
 	SPEED_BIN_TXSDLL
 };
+#endif /* CONFIG_DDR4 */
 
 /* ddr bus masks */
 #define BUS_MASK_32BIT			0xf
-- 
2.30.2


* Re: [PATCH v2] ddr: marvell: a38x: Add support for DDR4 from Marvell mv-ddr-marvell repository
  2023-01-17  5:34 [PATCH v2] ddr: marvell: a38x: Add support for DDR4 from Marvell mv-ddr-marvell repository Tony Dinh
@ 2023-01-17  8:35 ` Pali Rohár
  2023-01-17 21:02   ` Tony Dinh
  0 siblings, 1 reply; 8+ messages in thread
From: Pali Rohár @ 2023-01-17  8:35 UTC (permalink / raw)
  To: Tony Dinh
  Cc: U-Boot Mailing List, Stefan Roese, Marek Behún,
	Chris Packham, Jaehoon Chung, Mark Kettenis, Simon Glass,
	Michael Trimarchi, Tom Rini, Marek Behún

Hello! Thank you for the update. It is much better.

On Monday 16 January 2023 21:34:39 Tony Dinh wrote:
>     This syncs drivers/ddr/marvell/a38x/ with the master branch of repository
>     https://github.com/MarvellEmbeddedProcessors/mv-ddr-marvell.git
> 
>     up to the commit "mv_ddr: a3700: Use the right size for memset to not overflow"
>     d5acc10c287e40cc2feeb28710b92e45c93c702c
> 
>     This patch was created by following steps:
> 
>     1. Replace all a38x files in U-Boot tree by files from upstream github
>        Marvell mv-ddr-marvell repository.
> 
>     2. Run following command to omit portions not relevant for a38x, ddr3, and ddr4:
> 
>         files=drivers/ddr/marvell/a38x/*
>         sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
>         unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
>             -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
>             -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
>             -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DTRUE $files

Do not forget to also update the commit message.

And btw, the commit message has some leading spaces on each line, which
is probably not intended.

>     3. Manually change license to SPDX-License-Identifier
>        (upstream license in  upstream github repository contains long license
>        texts and U-Boot is using just SPDX-License-Identifier.
> 
>     After applying this patch, a38x ddr3 ddr4 code in upstream Marvell github
>     repository and in U-Boot would be fully identical. So in future applying
>     above steps could be used to sync code again.
> 
>     The only change in this patch are:
>     - Removal of common board_topology_map code using ifdefs in mv_ddr_brd.c
>     - Some fixes with include files.
>     - Some basic type defines (original from ATF headers) in mv_ddr_plat.c
> 
>     Reference:
>     "ddr: marvell: a38x: Sync code with Marvell mv-ddr-marvell repository"
>     https://source.denx.de/u-boot/u-boot/-/commit/107c3391b95bcc2ba09a876da4fa0c31b6c1e460
> 
> Signed-off-by: Tony Dinh <mibodhi@gmail.com>
> ---
> 
> Changes in v2:
> - Modified the filter script to explicitly include ARMADA_38X code
> and exclude ARMADA_39X code; also remove 64BIT code. Reran it on
> drivers/ddr/marvell/a38x/
> - Updated script
> files=drivers/ddr/marvell/a38x/*
> sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files

You do not need this sed anymore. CONFIG_ARMADA_39X is explicitly
removed and CONFIG_ARMADA_38X is already handled by unifdef.

> unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
>                 -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
>                 -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
>                 -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DCONFIG_ARMADA_38X -UCONFIG_ARMADA_39X \
>                                 -UCONFIG_64BIT $files
> - Remove more dead code files
> - Correct SPDX license header
> 
>  drivers/ddr/marvell/a38x/Makefile             |    8 +
>  drivers/ddr/marvell/a38x/ddr3_debug.c         |  120 +
>  drivers/ddr/marvell/a38x/ddr3_init.c          |   25 +
>  drivers/ddr/marvell/a38x/ddr3_init.h          |   14 +
>  drivers/ddr/marvell/a38x/ddr3_logging_def.h   |   27 +
>  drivers/ddr/marvell/a38x/ddr3_training.c      |  131 +
>  drivers/ddr/marvell/a38x/ddr3_training_bist.c |   12 +
>  .../a38x/ddr3_training_centralization.c       |    4 +
>  drivers/ddr/marvell/a38x/ddr3_training_db.c   |  212 ++
>  drivers/ddr/marvell/a38x/ddr3_training_ip.h   |   17 +
>  .../ddr/marvell/a38x/ddr3_training_ip_db.h    |   61 +
>  .../marvell/a38x/ddr3_training_ip_engine.c    |  145 +
>  .../ddr/marvell/a38x/ddr3_training_ip_flow.h  |    5 +
>  .../ddr/marvell/a38x/ddr3_training_leveling.c |  135 +
>  drivers/ddr/marvell/a38x/dram_if.h            |   13 -
>  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c |  674 +++++
>  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h |   59 +
>  drivers/ddr/marvell/a38x/mv_ddr4_training.c   |  565 ++++
>  drivers/ddr/marvell/a38x/mv_ddr4_training.h   |   32 +
>  .../a38x/mv_ddr4_training_calibration.c       | 2336 +++++++++++++++++
>  .../a38x/mv_ddr4_training_calibration.h       |   26 +
>  .../ddr/marvell/a38x/mv_ddr4_training_db.c    |  545 ++++
>  .../marvell/a38x/mv_ddr4_training_leveling.c  |  441 ++++
>  .../marvell/a38x/mv_ddr4_training_leveling.h  |   11 +
>  drivers/ddr/marvell/a38x/mv_ddr_plat.c        |  249 ++
>  drivers/ddr/marvell/a38x/mv_ddr_plat.h        |   11 +
>  drivers/ddr/marvell/a38x/mv_ddr_regs.h        |   59 +
>  drivers/ddr/marvell/a38x/mv_ddr_topology.h    |   72 +
>  28 files changed, 5996 insertions(+), 13 deletions(-)
>  delete mode 100644 drivers/ddr/marvell/a38x/dram_if.h

I see that you are removing an existing file. If it is needed neither
for DDR3 nor for DDR4, then please remove it in a separate commit or
patch, so that we do not mix different things into one commit.

>  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c
>  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h
>  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.c
>  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.h
>  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.c
>  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.h
>  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_db.c
>  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.c
>  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.h
...
> diff --git a/drivers/ddr/marvell/a38x/mv_ddr_plat.c b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> index 7c7bce73a3..16d177b42f 100644
> --- a/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> +++ b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> @@ -12,6 +12,11 @@
>  #define DDR_INTERFACES_NUM		1
>  #define DDR_INTERFACE_OCTETS_NUM	5
>  
> +/* These were defined in ATF area that was stripped out */
> +#define MV_STATUS	int
> +#define MV_U32		u32
> +#define MV_U8		u8
> +

Is this something new that you added? I do not see it in the
Marvell code.


* Re: [PATCH v2] ddr: marvell: a38x: Add support for DDR4 from Marvell mv-ddr-marvell repository
  2023-01-17  8:35 ` Pali Rohár
@ 2023-01-17 21:02   ` Tony Dinh
  2023-01-17 21:25     ` Pali Rohár
  2023-01-18 11:01     ` Stefan Roese
  0 siblings, 2 replies; 8+ messages in thread
From: Tony Dinh @ 2023-01-17 21:02 UTC (permalink / raw)
  To: Pali Rohár
  Cc: U-Boot Mailing List, Stefan Roese, Marek Behún,
	Chris Packham, Jaehoon Chung, Mark Kettenis, Simon Glass,
	Michael Trimarchi, Tom Rini, Marek Behún

Hi Pali,

On Tue, Jan 17, 2023 at 12:35 AM Pali Rohár <pali@kernel.org> wrote:
>
> Hello! Thank you for update. It is much better.
>
> On Monday 16 January 2023 21:34:39 Tony Dinh wrote:
> >     This syncs drivers/ddr/marvell/a38x/ with the master branch of repository
> >     https://github.com/MarvellEmbeddedProcessors/mv-ddr-marvell.git
> >
> >     up to the commit "mv_ddr: a3700: Use the right size for memset to not overflow"
> >     d5acc10c287e40cc2feeb28710b92e45c93c702c
> >
> >     This patch was created by following steps:
> >
> >     1. Replace all a38x files in U-Boot tree by files from upstream github
> >        Marvell mv-ddr-marvell repository.
> >
> >     2. Run following command to omit portions not relevant for a38x, ddr3, and ddr4:
> >
> >         files=drivers/ddr/marvell/a38x/*
> >         sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
> >         unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
> >             -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
> >             -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
> >             -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DTRUE $files
>
> Do not forget to also update commit message.

Yes, patman extracts and creates the patch description from the commit.

>
> And btw, commit messages has on each line some leading spaces which is
> not probably intended.

That was intentional to make the commit description (and patch
description) more readable. Is it not recommended?

>
> >     3. Manually change license to SPDX-License-Identifier
> >        (upstream license in  upstream github repository contains long license
> >        texts and U-Boot is using just SPDX-License-Identifier.
> >
> >     After applying this patch, a38x ddr3 ddr4 code in upstream Marvell github
> >     repository and in U-Boot would be fully identical. So in future applying
> >     above steps could be used to sync code again.
> >
> >     The only change in this patch are:
> >     - Removal of common board_topology_map code using ifdefs in mv_ddr_brd.c
> >     - Some fixes with include files.
> >     - Some basic type defines (original from ATF headers) in mv_ddr_plat.c
> >
> >     Reference:
> >     "ddr: marvell: a38x: Sync code with Marvell mv-ddr-marvell repository"
> >     https://source.denx.de/u-boot/u-boot/-/commit/107c3391b95bcc2ba09a876da4fa0c31b6c1e460
> >
> > Signed-off-by: Tony Dinh <mibodhi@gmail.com>
> > ---
> >
> > Changes in v2:
> > - Modified the filter scrip to explicitly include ARMADA_38X code
> > and exclude ARMADA_39X code; also remove 64BIT code. Reran it on
> > drivers/ddr/marvell/a38x/
> > - Updated script
> > files=drivers/ddr/marvell/a38x/*
> > sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
>
> You do not need this sed anymore. CONFIG_ARMADA_39X is explicitly
> removed and CONFIG_ARMADA_38X already handled by unifdef.

Thanks, I was not sure if unifdef works in that "OR" condition. I will
update the commit message.

>
> > unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
> >                 -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
> >                 -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
> >                 -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DCONFIG_ARMADA_38X -UCONFIG_ARMADA_39X \
> >                                 -UCONFIG_64BIT $files
> > - Remove more dead code files
> > - Correct SPDX license header
> >
> >  drivers/ddr/marvell/a38x/Makefile             |    8 +
> >  drivers/ddr/marvell/a38x/ddr3_debug.c         |  120 +
> >  drivers/ddr/marvell/a38x/ddr3_init.c          |   25 +
> >  drivers/ddr/marvell/a38x/ddr3_init.h          |   14 +
> >  drivers/ddr/marvell/a38x/ddr3_logging_def.h   |   27 +
> >  drivers/ddr/marvell/a38x/ddr3_training.c      |  131 +
> >  drivers/ddr/marvell/a38x/ddr3_training_bist.c |   12 +
> >  .../a38x/ddr3_training_centralization.c       |    4 +
> >  drivers/ddr/marvell/a38x/ddr3_training_db.c   |  212 ++
> >  drivers/ddr/marvell/a38x/ddr3_training_ip.h   |   17 +
> >  .../ddr/marvell/a38x/ddr3_training_ip_db.h    |   61 +
> >  .../marvell/a38x/ddr3_training_ip_engine.c    |  145 +
> >  .../ddr/marvell/a38x/ddr3_training_ip_flow.h  |    5 +
> >  .../ddr/marvell/a38x/ddr3_training_leveling.c |  135 +
> >  drivers/ddr/marvell/a38x/dram_if.h            |   13 -
> >  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c |  674 +++++
> >  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h |   59 +
> >  drivers/ddr/marvell/a38x/mv_ddr4_training.c   |  565 ++++
> >  drivers/ddr/marvell/a38x/mv_ddr4_training.h   |   32 +
> >  .../a38x/mv_ddr4_training_calibration.c       | 2336 +++++++++++++++++
> >  .../a38x/mv_ddr4_training_calibration.h       |   26 +
> >  .../ddr/marvell/a38x/mv_ddr4_training_db.c    |  545 ++++
> >  .../marvell/a38x/mv_ddr4_training_leveling.c  |  441 ++++
> >  .../marvell/a38x/mv_ddr4_training_leveling.h  |   11 +
> >  drivers/ddr/marvell/a38x/mv_ddr_plat.c        |  249 ++
> >  drivers/ddr/marvell/a38x/mv_ddr_plat.h        |   11 +
> >  drivers/ddr/marvell/a38x/mv_ddr_regs.h        |   59 +
> >  drivers/ddr/marvell/a38x/mv_ddr_topology.h    |   72 +
> >  28 files changed, 5996 insertions(+), 13 deletions(-)
> >  delete mode 100644 drivers/ddr/marvell/a38x/dram_if.h
>
> I see that you are removing some existing file. If it is not needed
> neither for DDR3 nor for DDR4 then please remove it in separate commit
> or patch. So we do not mix different things into one commit.

Instead of making a different commit, will it work if we list the
files being removed in this commit message? It is part of removing
dead code.

>
> >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c
> >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h
> >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.c
> >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.h
> >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.c
> >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.h
> >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_db.c
> >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.c
> >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.h
> ...
> > diff --git a/drivers/ddr/marvell/a38x/mv_ddr_plat.c b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > index 7c7bce73a3..16d177b42f 100644
> > --- a/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > +++ b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > @@ -12,6 +12,11 @@
> >  #define DDR_INTERFACES_NUM           1
> >  #define DDR_INTERFACE_OCTETS_NUM     5
> >
> > +/* These were defined in ATF area that was stripped out */
> > +#define MV_STATUS    int
> > +#define MV_U32               u32
> > +#define MV_U8                u8
> > +
>
> This is something new which you added? Because I do not see it in
> Marvell code.

Yes, those were in the original code after the initial copying from
the Marvell repo.

# grep -E '(MV_U32|MV_STATUS|MV_U8)' *.[ch] a38x/*.[ch]

ddr_init.c:MV_U32 ddr_init(void)
mv_ddr_atf_wrapper.h:#define MV_STATUS int
mv_ddr_atf_wrapper.h:#define MV_U8 u8
mv_ddr_atf_wrapper.h:#define MV_U32 u32
a38x/mv_ddr_plat.c:MV_STATUS mv_ddr4_calibration_validate(MV_U32 dev_num)
a38x/mv_ddr_plat.c: MV_STATUS status = MV_OK;
a38x/mv_ddr_plat.c: MV_U8 if_id = 0;
a38x/mv_ddr_plat.c: MV_U32 read_data[MAX_INTERFACE_NUM];
a38x/mv_ddr_plat.c: MV_U32 cal_n = 0, cal_p = 0;

Those 3 are defined in mv_ddr_atf_wrapper.h and used in mv_ddr_plat.c
(after we ran the filter script, this file is the only place that
needs those 3 defines). Since we removed the ATF code, we need to
define them here in mv_ddr_plat.c. Is this OK or do you have any
suggestions for a better approach?

Thanks,
Tony


* Re: [PATCH v2] ddr: marvell: a38x: Add support for DDR4 from Marvell mv-ddr-marvell repository
  2023-01-17 21:02   ` Tony Dinh
@ 2023-01-17 21:25     ` Pali Rohár
  2023-01-18  1:18       ` Tony Dinh
  2023-01-18 11:01     ` Stefan Roese
  1 sibling, 1 reply; 8+ messages in thread
From: Pali Rohár @ 2023-01-17 21:25 UTC (permalink / raw)
  To: Tony Dinh
  Cc: U-Boot Mailing List, Stefan Roese, Marek Behún,
	Chris Packham, Jaehoon Chung, Mark Kettenis, Simon Glass,
	Michael Trimarchi, Tom Rini, Marek Behún

Hello!

On Tuesday 17 January 2023 13:02:46 Tony Dinh wrote:
> Hi Pali,
> 
> On Tue, Jan 17, 2023 at 12:35 AM Pali Rohár <pali@kernel.org> wrote:
> >
> > Hello! Thank you for update. It is much better.
> >
> > On Monday 16 January 2023 21:34:39 Tony Dinh wrote:
> > >     This syncs drivers/ddr/marvell/a38x/ with the master branch of repository
> > >     https://github.com/MarvellEmbeddedProcessors/mv-ddr-marvell.git
> > >
> > >     up to the commit "mv_ddr: a3700: Use the right size for memset to not overflow"
> > >     d5acc10c287e40cc2feeb28710b92e45c93c702c
> > >
> > >     This patch was created by following steps:
> > >
> > >     1. Replace all a38x files in U-Boot tree by files from upstream github
> > >        Marvell mv-ddr-marvell repository.
> > >
> > >     2. Run following command to omit portions not relevant for a38x, ddr3, and ddr4:
> > >
> > >         files=drivers/ddr/marvell/a38x/*
> > >         sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
> > >         unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
> > >             -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
> > >             -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
> > >             -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DTRUE $files
> >
> > Do not forget to also update commit message.
> 
> Yes, patman extracts and creates the patch description from the commit.

I mentioned it only because you forgot to update the undef for a39x.

> >
> > And btw, commit messages has on each line some leading spaces which is
> > not probably intended.
> 
> That was intentional to make the commit description (and patch
> description) more readable. Is it not recommended?

I'm not sure if we are talking about the same thing. When I read your
patch I saw that every line, even the first one ("This syncs drivers/..."),
has 4 spaces before it. I'm not sure whether this is just my email
client or whether there is some reason for it. Look at the indentation
of the "Signed-off-by:" line and the "Reference:" line. Should those two
lines not be at the same indentation level? Or did I misunderstand? :D

I agree that indentation inside the 1. 2. 3. parts is recommended, as
it makes the text more readable.

> >
> > >     3. Manually change license to SPDX-License-Identifier
> > >        (upstream license in  upstream github repository contains long license
> > >        texts and U-Boot is using just SPDX-License-Identifier.
> > >
> > >     After applying this patch, a38x ddr3 ddr4 code in upstream Marvell github
> > >     repository and in U-Boot would be fully identical. So in future applying
> > >     above steps could be used to sync code again.
> > >
> > >     The only change in this patch are:
> > >     - Removal of common board_topology_map code using ifdefs in mv_ddr_brd.c
> > >     - Some fixes with include files.
> > >     - Some basic type defines (original from ATF headers) in mv_ddr_plat.c
> > >
> > >     Reference:
> > >     "ddr: marvell: a38x: Sync code with Marvell mv-ddr-marvell repository"
> > >     https://source.denx.de/u-boot/u-boot/-/commit/107c3391b95bcc2ba09a876da4fa0c31b6c1e460
> > >
> > > Signed-off-by: Tony Dinh <mibodhi@gmail.com>
> > > ---
> > >
> > > Changes in v2:
> > > - Modified the filter scrip to explicitly include ARMADA_38X code
> > > and exclude ARMADA_39X code; also remove 64BIT code. Reran it on
> > > drivers/ddr/marvell/a38x/
> > > - Updated script
> > > files=drivers/ddr/marvell/a38x/*
> > > sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
> >
> > You do not need this sed anymore. CONFIG_ARMADA_39X is explicitly
> > removed and CONFIG_ARMADA_38X already handled by unifdef.
> 
> Thanks, I was not sure if unifdef works in that "OR" condition. I will
> update the commit message.

It should work if at least one of the options in the OR condition is
specified with -D on the command line. But if you are unsure, it is
better to test it (that should be quick and easy).
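
As an illustration of that rule, here is a minimal C sketch of the
three-valued OR that the unifdef manual describes for ||: the result is
true as soon as one operand is definitely true, even if the other is
unknown. This is only a model, not unifdef's source; the names tri,
tri_or and tri_or_selfcheck are made up for this example.

```c
#include <assert.h>

/*
 * State of defined(X) as seen by a tool like unifdef:
 * -DX gives TRI_TRUE, -UX gives TRI_FALSE, and a macro that is not
 * mentioned on the command line is TRI_UNKNOWN.
 */
enum tri { TRI_FALSE, TRI_TRUE, TRI_UNKNOWN };

/* "a || b" with unknown operands: true wins, unknown beats false. */
enum tri tri_or(enum tri a, enum tri b)
{
	if (a == TRI_TRUE || b == TRI_TRUE)
		return TRI_TRUE;
	if (a == TRI_FALSE && b == TRI_FALSE)
		return TRI_FALSE;
	return TRI_UNKNOWN;
}

/* Cases relevant to this thread, checked with assert(). */
void tri_or_selfcheck(void)
{
	/* -DCONFIG_ARMADA_38X -UCONFIG_ARMADA_39X: branch is kept */
	assert(tri_or(TRI_TRUE, TRI_FALSE) == TRI_TRUE);
	/* only -DCONFIG_ARMADA_38X given: still decidable */
	assert(tri_or(TRI_TRUE, TRI_UNKNOWN) == TRI_TRUE);
	/* only -UCONFIG_ARMADA_39X given: the #if line is left as-is */
	assert(tri_or(TRI_FALSE, TRI_UNKNOWN) == TRI_UNKNOWN);
	/* both undefined: branch is removed */
	assert(tri_or(TRI_FALSE, TRI_FALSE) == TRI_FALSE);
}
```

With -DCONFIG_ARMADA_38X -UCONFIG_ARMADA_39X both operands are known, so
the "#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)" line
resolves to true without the sed step.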

> >
> > > unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
> > >                 -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
> > >                 -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
> > >                 -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DCONFIG_ARMADA_38X -UCONFIG_ARMADA_39X \
> > >                                 -UCONFIG_64BIT $files
> > > - Remove more dead code files
> > > - Correct SPDX license header
> > >
> > >  drivers/ddr/marvell/a38x/Makefile             |    8 +
> > >  drivers/ddr/marvell/a38x/ddr3_debug.c         |  120 +
> > >  drivers/ddr/marvell/a38x/ddr3_init.c          |   25 +
> > >  drivers/ddr/marvell/a38x/ddr3_init.h          |   14 +
> > >  drivers/ddr/marvell/a38x/ddr3_logging_def.h   |   27 +
> > >  drivers/ddr/marvell/a38x/ddr3_training.c      |  131 +
> > >  drivers/ddr/marvell/a38x/ddr3_training_bist.c |   12 +
> > >  .../a38x/ddr3_training_centralization.c       |    4 +
> > >  drivers/ddr/marvell/a38x/ddr3_training_db.c   |  212 ++
> > >  drivers/ddr/marvell/a38x/ddr3_training_ip.h   |   17 +
> > >  .../ddr/marvell/a38x/ddr3_training_ip_db.h    |   61 +
> > >  .../marvell/a38x/ddr3_training_ip_engine.c    |  145 +
> > >  .../ddr/marvell/a38x/ddr3_training_ip_flow.h  |    5 +
> > >  .../ddr/marvell/a38x/ddr3_training_leveling.c |  135 +
> > >  drivers/ddr/marvell/a38x/dram_if.h            |   13 -
> > >  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c |  674 +++++
> > >  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h |   59 +
> > >  drivers/ddr/marvell/a38x/mv_ddr4_training.c   |  565 ++++
> > >  drivers/ddr/marvell/a38x/mv_ddr4_training.h   |   32 +
> > >  .../a38x/mv_ddr4_training_calibration.c       | 2336 +++++++++++++++++
> > >  .../a38x/mv_ddr4_training_calibration.h       |   26 +
> > >  .../ddr/marvell/a38x/mv_ddr4_training_db.c    |  545 ++++
> > >  .../marvell/a38x/mv_ddr4_training_leveling.c  |  441 ++++
> > >  .../marvell/a38x/mv_ddr4_training_leveling.h  |   11 +
> > >  drivers/ddr/marvell/a38x/mv_ddr_plat.c        |  249 ++
> > >  drivers/ddr/marvell/a38x/mv_ddr_plat.h        |   11 +
> > >  drivers/ddr/marvell/a38x/mv_ddr_regs.h        |   59 +
> > >  drivers/ddr/marvell/a38x/mv_ddr_topology.h    |   72 +
> > >  28 files changed, 5996 insertions(+), 13 deletions(-)
> > >  delete mode 100644 drivers/ddr/marvell/a38x/dram_if.h
> >
> > I see that you are removing some existing file. If it is not needed
> > neither for DDR3 nor for DDR4 then please remove it in separate commit
> > or patch. So we do not mix different things into one commit.
> 
> Instead of making a different commit, will it work if we list the
> files being removed in this commit message? It is part of removing
> dead code.

I do not know. I always try to put different things into different
commits. The reason is that if the commit ever needs to be reverted,
the unrelated cleanup does not have to be reverted with it :-)

> >
> > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c
> > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h
> > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.c
> > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.h
> > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.c
> > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.h
> > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_db.c
> > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.c
> > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.h
> > ...
> > > diff --git a/drivers/ddr/marvell/a38x/mv_ddr_plat.c b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > index 7c7bce73a3..16d177b42f 100644
> > > --- a/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > +++ b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > @@ -12,6 +12,11 @@
> > >  #define DDR_INTERFACES_NUM           1
> > >  #define DDR_INTERFACE_OCTETS_NUM     5
> > >
> > > +/* These were defined in ATF area that was stripped out */
> > > +#define MV_STATUS    int
> > > +#define MV_U32               u32
> > > +#define MV_U8                u8
> > > +
> >
> > This is something new which you added? Because I do not see it in
> > Marvell code.
> 
> Yes, those were in the original code after the initial copying from
> the Marvell repo.
> 
> # grep -E '(MV_U32|MV_STATUS|MV_U8)' *.[ch] a38x/*.[ch]
> 
> ddr_init.c:MV_U32 ddr_init(void)
> mv_ddr_atf_wrapper.h:#define MV_STATUS int
> mv_ddr_atf_wrapper.h:#define MV_U8 u8
> mv_ddr_atf_wrapper.h:#define MV_U32 u32
> a38x/mv_ddr_plat.c:MV_STATUS mv_ddr4_calibration_validate(MV_U32 dev_num)
> a38x/mv_ddr_plat.c: MV_STATUS status = MV_OK;
> a38x/mv_ddr_plat.c: MV_U8 if_id = 0;
> a38x/mv_ddr_plat.c: MV_U32 read_data[MAX_INTERFACE_NUM];
> a38x/mv_ddr_plat.c: MV_U32 cal_n = 0, cal_p = 0;
> 
> Those 3 are defined in mv_ddr_atf_wrapper.h and used in mv_ddr_plat.c
> (after we ran the filter script, this file is the only place that
> needs those 3 defines). Since we removed the ATF code, we need to
> define them here in mv_ddr_plat.c. Is this OK or do you have any
> suggestions for a better approach?

Hmm... You found another bug in Marvell code:

$ git grep mv_ddr4_calibration_validate
a38x/mv_ddr_plat.c:MV_STATUS mv_ddr4_calibration_validate(MV_U32 dev_num)
apn806/mv_ddr_plat.c:int mv_ddr4_calibration_validate(u32 dev_num)
mv_ddr4_training.c:             return mv_ddr4_calibration_validate(dev_num);
mv_ddr4_training.h:int mv_ddr4_calibration_validate(u32 dev_num);

The function mv_ddr4_calibration_validate() should return int (as
declared in the header file) and not MV_STATUS. So rather fix the return
type of the function to match what is in the header file. There is also
a mismatch in its argument type: u32 vs MV_U32!

Next, MV_U8 is used in only one place (a3700 moreover defines it locally):

$ git grep MV_U8
a3700/mv_ddr_a3700_wrapper.h:#define MV_U8              u8
a38x/mv_ddr_plat.c:     MV_U8 if_id = 0;
mv_ddr_atf_wrapper.h:#define MV_U8              u8

So replace MV_U8 directly with u8, and do the same for MV_U32 in the a38x code.
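
A minimal sketch of how the start of the a38x function could look after
those replacements. The typedefs and the values of MV_OK and
MAX_INTERFACE_NUM are assumptions that only keep the sketch
self-contained (in U-Boot they come from the existing headers), and the
real function body is elided.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the U-Boot fixed-width types (assumption for this sketch). */
typedef uint32_t u32;
typedef uint8_t u8;

static_assert(sizeof(u32) == 4 && sizeof(u8) == 1, "unexpected type sizes");

#define MV_OK			0	/* assumed value of the success code */
#define MAX_INTERFACE_NUM	1	/* placeholder, real value is in the driver headers */

/* Prototype as declared in mv_ddr4_training.h */
int mv_ddr4_calibration_validate(u32 dev_num);

/* Definition now matches the prototype: int and u32, no MV_* wrappers. */
int mv_ddr4_calibration_validate(u32 dev_num)
{
	int status = MV_OK;
	u8 if_id = 0;
	u32 read_data[MAX_INTERFACE_NUM] = { 0 };
	u32 cal_n = 0, cal_p = 0;

	/* The real code reads and checks the calibration values here. */
	(void)dev_num;
	(void)if_id;
	(void)read_data;
	(void)cal_n;
	(void)cal_p;

	return status;
}
```

With this, the three MV_STATUS, MV_U32 and MV_U8 defines in
mv_ddr_plat.c are no longer needed.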

And ideally, send a pull request to the Marvell repo with these fixes
(and also the floating point one), so that the code can be synced
easily again in the future.


* Re: [PATCH v2] ddr: marvell: a38x: Add support for DDR4 from Marvell mv-ddr-marvell repository
  2023-01-17 21:25     ` Pali Rohár
@ 2023-01-18  1:18       ` Tony Dinh
  2023-01-18 18:30         ` Pali Rohár
  0 siblings, 1 reply; 8+ messages in thread
From: Tony Dinh @ 2023-01-18  1:18 UTC (permalink / raw)
  To: Pali Rohár
  Cc: U-Boot Mailing List, Stefan Roese, Marek Behún,
	Chris Packham, Jaehoon Chung, Mark Kettenis, Simon Glass,
	Michael Trimarchi, Tom Rini, Marek Behún

Hi Pali,

On Tue, Jan 17, 2023 at 1:25 PM Pali Rohár <pali@kernel.org> wrote:
>
> Hello!
>
> On Tuesday 17 January 2023 13:02:46 Tony Dinh wrote:
> > Hi Pali,
> >
> > On Tue, Jan 17, 2023 at 12:35 AM Pali Rohár <pali@kernel.org> wrote:
> > >
> > > Hello! Thank you for update. It is much better.
> > >
> > > On Monday 16 January 2023 21:34:39 Tony Dinh wrote:
> > > >     This syncs drivers/ddr/marvell/a38x/ with the master branch of repository
> > > >     https://github.com/MarvellEmbeddedProcessors/mv-ddr-marvell.git
> > > >
> > > >     up to the commit "mv_ddr: a3700: Use the right size for memset to not overflow"
> > > >     d5acc10c287e40cc2feeb28710b92e45c93c702c
> > > >
> > > >     This patch was created by following steps:
> > > >
> > > >     1. Replace all a38x files in U-Boot tree by files from upstream github
> > > >        Marvell mv-ddr-marvell repository.
> > > >
> > > >     2. Run following command to omit portions not relevant for a38x, ddr3, and ddr4:
> > > >
> > > >         files=drivers/ddr/marvell/a38x/*
> > > >         sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
> > > >         unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
> > > >             -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
> > > >             -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
> > > >             -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DTRUE $files
> > >
> > > Do not forget to also update commit message.
> >
> > Yes, patman extracts and creates the patch description from the commit.
>
> My reaction was just because you forgot to update undef for a39x.

I see. Yes, I did miss that! I only described the changes in the v2 description.

> > >
> > > And btw, commit messages has on each line some leading spaces which is
> > > not probably intended.
> >
> > That was intentional to make the commit description (and patch
> > description) more readable. Is it not recommended?
>
> I'm not sure if we are talking about the same thing. When I read this
> your patch I saw that every time, even the first one "This sync drivers/..."
> has 4 spaces before word "This". And I'm not sure if this is just my
> email client or not and there is some reason for it. Look at indentation
> of line "Signed-off-by:" and line "Reference:". Should not be those two
> lines at same indentation level? Or I did not understand it? :D
>
> I agree that adding indentation inside of 1. 2. 3. parts is fully
> recommended as it makes text more readable.

When I run git log I also see an extra 4 or 8 spaces on each line :) so
I am not sure what we are seeing here. But yes, it seems some of the
indentation is inconsistent. I will fix that.

>
> > >
> > > >     3. Manually change license to SPDX-License-Identifier
> > > >        (upstream license in  upstream github repository contains long license
> > > >        texts and U-Boot is using just SPDX-License-Identifier.
> > > >
> > > >     After applying this patch, a38x ddr3 ddr4 code in upstream Marvell github
> > > >     repository and in U-Boot would be fully identical. So in future applying
> > > >     above steps could be used to sync code again.
> > > >
> > > >     The only change in this patch are:
> > > >     - Removal of common board_topology_map code using ifdefs in mv_ddr_brd.c
> > > >     - Some fixes with include files.
> > > >     - Some basic type defines (original from ATF headers) in mv_ddr_plat.c
> > > >
> > > >     Reference:
> > > >     "ddr: marvell: a38x: Sync code with Marvell mv-ddr-marvell repository"
> > > >     https://source.denx.de/u-boot/u-boot/-/commit/107c3391b95bcc2ba09a876da4fa0c31b6c1e460
> > > >
> > > > Signed-off-by: Tony Dinh <mibodhi@gmail.com>
> > > > ---
> > > >
> > > > Changes in v2:
> > > > - Modified the filter scrip to explicitly include ARMADA_38X code
> > > > and exclude ARMADA_39X code; also remove 64BIT code. Reran it on
> > > > drivers/ddr/marvell/a38x/
> > > > - Updated script
> > > > files=drivers/ddr/marvell/a38x/*
> > > > sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
> > >
> > > You do not need this sed anymore. CONFIG_ARMADA_39X is explicitly
> > > removed and CONFIG_ARMADA_38X already handled by unifdef.
> >
> > Thanks, I was not sure if unifdef works in that "OR" condition. I will
> > update the commit message.
>
> It should work if at least one of the option in OR condition is
> specified with -D on command line. But if you are unsure then it is
> better to test it (should be quite easy and fast).
>
> > >
> > > > unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
> > > >                 -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
> > > >                 -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
> > > >                 -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DCONFIG_ARMADA_38X -UCONFIG_ARMADA_39X \
> > > >                                 -UCONFIG_64BIT $files
> > > > - Remove more dead code files
> > > > - Correct SPDX license header
> > > >
> > > >  drivers/ddr/marvell/a38x/Makefile             |    8 +
> > > >  drivers/ddr/marvell/a38x/ddr3_debug.c         |  120 +
> > > >  drivers/ddr/marvell/a38x/ddr3_init.c          |   25 +
> > > >  drivers/ddr/marvell/a38x/ddr3_init.h          |   14 +
> > > >  drivers/ddr/marvell/a38x/ddr3_logging_def.h   |   27 +
> > > >  drivers/ddr/marvell/a38x/ddr3_training.c      |  131 +
> > > >  drivers/ddr/marvell/a38x/ddr3_training_bist.c |   12 +
> > > >  .../a38x/ddr3_training_centralization.c       |    4 +
> > > >  drivers/ddr/marvell/a38x/ddr3_training_db.c   |  212 ++
> > > >  drivers/ddr/marvell/a38x/ddr3_training_ip.h   |   17 +
> > > >  .../ddr/marvell/a38x/ddr3_training_ip_db.h    |   61 +
> > > >  .../marvell/a38x/ddr3_training_ip_engine.c    |  145 +
> > > >  .../ddr/marvell/a38x/ddr3_training_ip_flow.h  |    5 +
> > > >  .../ddr/marvell/a38x/ddr3_training_leveling.c |  135 +
> > > >  drivers/ddr/marvell/a38x/dram_if.h            |   13 -
> > > >  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c |  674 +++++
> > > >  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h |   59 +
> > > >  drivers/ddr/marvell/a38x/mv_ddr4_training.c   |  565 ++++
> > > >  drivers/ddr/marvell/a38x/mv_ddr4_training.h   |   32 +
> > > >  .../a38x/mv_ddr4_training_calibration.c       | 2336 +++++++++++++++++
> > > >  .../a38x/mv_ddr4_training_calibration.h       |   26 +
> > > >  .../ddr/marvell/a38x/mv_ddr4_training_db.c    |  545 ++++
> > > >  .../marvell/a38x/mv_ddr4_training_leveling.c  |  441 ++++
> > > >  .../marvell/a38x/mv_ddr4_training_leveling.h  |   11 +
> > > >  drivers/ddr/marvell/a38x/mv_ddr_plat.c        |  249 ++
> > > >  drivers/ddr/marvell/a38x/mv_ddr_plat.h        |   11 +
> > > >  drivers/ddr/marvell/a38x/mv_ddr_regs.h        |   59 +
> > > >  drivers/ddr/marvell/a38x/mv_ddr_topology.h    |   72 +
> > > >  28 files changed, 5996 insertions(+), 13 deletions(-)
> > > >  delete mode 100644 drivers/ddr/marvell/a38x/dram_if.h
> > >
> > > I see that you are removing some existing file. If it is not needed
> > > neither for DDR3 nor for DDR4 then please remove it in separate commit
> > > or patch. So we do not mix different things into one commit.
> >
> > Instead of making a different commit, will it work if we list the
> > files being removed in this commit message? It is part of removing
> > dead code.
>
> I do not know. I'm always trying to put different thing into different
> commits. Reason is that if in some case it would be needed to revert
> commit then unrelated cleanup does not need to be reverted :-)

There is only one existing file removed (dram_if.h). The others are new
files that become dead code after we run the filter script. Do you
think those should also be kept in this patch, and then we'll have a
cleanup patch later? The alternative is that I just list the
new-but-deleted files in the commit description so they can be kept
track of (a future code sync is easier if we know which files are
candidates for removal).

>
> > >
> > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c
> > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h
> > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.c
> > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.h
> > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.c
> > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.h
> > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_db.c
> > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.c
> > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.h
> > > ...
> > > > diff --git a/drivers/ddr/marvell/a38x/mv_ddr_plat.c b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > > index 7c7bce73a3..16d177b42f 100644
> > > > --- a/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > > +++ b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > > @@ -12,6 +12,11 @@
> > > >  #define DDR_INTERFACES_NUM           1
> > > >  #define DDR_INTERFACE_OCTETS_NUM     5
> > > >
> > > > +/* These were defined in ATF area that was stripped out */
> > > > +#define MV_STATUS    int
> > > > +#define MV_U32               u32
> > > > +#define MV_U8                u8
> > > > +
> > >
> > > This is something new which you added? Because I do not see it in
> > > Marvell code.
> >
> > Yes, those were in the original code after the initial copying from
> > the Marvell repo.
> >
> > # grep -E '(MV_U32|MV_STATUS|MV_U8)' *.[ch] a38x/*.[ch]
> >
> > ddr_init.c:MV_U32 ddr_init(void)
> > mv_ddr_atf_wrapper.h:#define MV_STATUS int
> > mv_ddr_atf_wrapper.h:#define MV_U8 u8
> > mv_ddr_atf_wrapper.h:#define MV_U32 u32
> > a38x/mv_ddr_plat.c:MV_STATUS mv_ddr4_calibration_validate(MV_U32 dev_num)
> > a38x/mv_ddr_plat.c: MV_STATUS status = MV_OK;
> > a38x/mv_ddr_plat.c: MV_U8 if_id = 0;
> > a38x/mv_ddr_plat.c: MV_U32 read_data[MAX_INTERFACE_NUM];
> > a38x/mv_ddr_plat.c: MV_U32 cal_n = 0, cal_p = 0;
> >
> > Those 3 are defined in mv_ddr_atf_wrapper.h and used in mv_ddr_plat.c
> > (after we ran the filter script, this file is the only place that
> > needs those 3 defines). Since we removed the ATF code, we need to
> > define them here in mv_ddr_plat.c. Is this OK or do you have any
> > suggestions for a better approach?
>
> Hmm... You found another bug in Marvell code:
>
> $ git grep mv_ddr4_calibration_validate
> a38x/mv_ddr_plat.c:MV_STATUS mv_ddr4_calibration_validate(MV_U32 dev_num)
> apn806/mv_ddr_plat.c:int mv_ddr4_calibration_validate(u32 dev_num)
> mv_ddr4_training.c:             return mv_ddr4_calibration_validate(dev_num);
> mv_ddr4_training.h:int mv_ddr4_calibration_validate(u32 dev_num);
>
> That function mv_ddr4_calibration_validate() should return int type (as
> defined in header file) and not MV_STATUS type. So rather fix return
> type of the function to match what is in header file. Also there is
> mismatch with its argument u32 vs MV_U32!
>
> Next, MV_U8 is used only at one place (a37xx defines it moreover locally):
>
> $ git grep MV_U8
> a3700/mv_ddr_a3700_wrapper.h:#define MV_U8              u8
> a38x/mv_ddr_plat.c:     MV_U8 if_id = 0;
> mv_ddr_atf_wrapper.h:#define MV_U8              u8
>
> So replace MV_U8 directly by u8. And same for MV_U32 for a38x code.

Cool.

> And ideally, send a pull request to Marvell repo with these fixes (and
> also with floating point), so code can be synced easily also again in
> future.

Would you do that when you have time after DDR4 gets merged?

Thanks,
Tony

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] ddr: marvell: a38x: Add support for DDR4 from Marvell mv-ddr-marvell repository
  2023-01-17 21:02   ` Tony Dinh
  2023-01-17 21:25     ` Pali Rohár
@ 2023-01-18 11:01     ` Stefan Roese
  1 sibling, 0 replies; 8+ messages in thread
From: Stefan Roese @ 2023-01-18 11:01 UTC (permalink / raw)
  To: Tony Dinh, Pali Rohár
  Cc: U-Boot Mailing List, Marek Behún, Chris Packham,
	Jaehoon Chung, Mark Kettenis, Simon Glass, Michael Trimarchi,
	Tom Rini, Marek Behún

Hi Tony,

On 1/17/23 22:02, Tony Dinh wrote:
> Hi Pali,
> 
> On Tue, Jan 17, 2023 at 12:35 AM Pali Rohár <pali@kernel.org> wrote:
>>
>> Hello! Thank you for update. It is much better.
>>
>> On Monday 16 January 2023 21:34:39 Tony Dinh wrote:
>>>      This syncs drivers/ddr/marvell/a38x/ with the master branch of repository
>>>      https://github.com/MarvellEmbeddedProcessors/mv-ddr-marvell.git
>>>
>>>      up to the commit "mv_ddr: a3700: Use the right size for memset to not overflow"
>>>      d5acc10c287e40cc2feeb28710b92e45c93c702c
>>>
>>>      This patch was created by following steps:
>>>
>>>      1. Replace all a38x files in U-Boot tree by files from upstream github
>>>         Marvell mv-ddr-marvell repository.
>>>
>>>      2. Run following command to omit portions not relevant for a38x, ddr3, and ddr4:
>>>
>>>          files=drivers/ddr/marvell/a38x/*
>>>          sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
>>>          unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
>>>              -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
>>>              -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
>>>              -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DTRUE $files
>>
>> Do not forget to also update commit message.
> 
> Yes, patman extracts and creates the patch description from the commit.
> 
>>
>> And btw, commit messages has on each line some leading spaces which is
>> not probably intended.
> 
> That was intentional to make the commit description (and patch
> description) more readable. Is it not recommended?

Readability is just fine IMHO.

>>
>>>      3. Manually change license to SPDX-License-Identifier
>>>         (upstream license in  upstream github repository contains long license
>>>         texts and U-Boot is using just SPDX-License-Identifier.
>>>
>>>      After applying this patch, a38x ddr3 ddr4 code in upstream Marvell github
>>>      repository and in U-Boot would be fully identical. So in future applying
>>>      above steps could be used to sync code again.
>>>
>>>      The only change in this patch are:
>>>      - Removal of common board_topology_map code using ifdefs in mv_ddr_brd.c
>>>      - Some fixes with include files.
>>>      - Some basic type defines (original from ATF headers) in mv_ddr_plat.c
>>>
>>>      Reference:
>>>      "ddr: marvell: a38x: Sync code with Marvell mv-ddr-marvell repository"
>>>      https://source.denx.de/u-boot/u-boot/-/commit/107c3391b95bcc2ba09a876da4fa0c31b6c1e460
>>>
>>> Signed-off-by: Tony Dinh <mibodhi@gmail.com>
>>> ---
>>>
>>> Changes in v2:
>>> - Modified the filter script to explicitly include ARMADA_38X code
>>> and exclude ARMADA_39X code; also remove 64BIT code. Reran it on
>>> drivers/ddr/marvell/a38x/
>>> - Updated script
>>> files=drivers/ddr/marvell/a38x/*
>>> sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
>>
>> You do not need this sed anymore. CONFIG_ARMADA_39X is explicitly
>> removed and CONFIG_ARMADA_38X already handled by unifdef.
> 
> Thanks, I was not sure if unifdef works in that "OR" condition. I will
> update the commit message.
> 
>>
>>> unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
>>>                  -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
>>>                  -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
>>>                  -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DCONFIG_ARMADA_38X -UCONFIG_ARMADA_39X \
>>>                                  -UCONFIG_64BIT $files
>>> - Remove more dead code files
>>> - Correct SPDX license header
>>>
>>>   drivers/ddr/marvell/a38x/Makefile             |    8 +
>>>   drivers/ddr/marvell/a38x/ddr3_debug.c         |  120 +
>>>   drivers/ddr/marvell/a38x/ddr3_init.c          |   25 +
>>>   drivers/ddr/marvell/a38x/ddr3_init.h          |   14 +
>>>   drivers/ddr/marvell/a38x/ddr3_logging_def.h   |   27 +
>>>   drivers/ddr/marvell/a38x/ddr3_training.c      |  131 +
>>>   drivers/ddr/marvell/a38x/ddr3_training_bist.c |   12 +
>>>   .../a38x/ddr3_training_centralization.c       |    4 +
>>>   drivers/ddr/marvell/a38x/ddr3_training_db.c   |  212 ++
>>>   drivers/ddr/marvell/a38x/ddr3_training_ip.h   |   17 +
>>>   .../ddr/marvell/a38x/ddr3_training_ip_db.h    |   61 +
>>>   .../marvell/a38x/ddr3_training_ip_engine.c    |  145 +
>>>   .../ddr/marvell/a38x/ddr3_training_ip_flow.h  |    5 +
>>>   .../ddr/marvell/a38x/ddr3_training_leveling.c |  135 +
>>>   drivers/ddr/marvell/a38x/dram_if.h            |   13 -
>>>   drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c |  674 +++++
>>>   drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h |   59 +
>>>   drivers/ddr/marvell/a38x/mv_ddr4_training.c   |  565 ++++
>>>   drivers/ddr/marvell/a38x/mv_ddr4_training.h   |   32 +
>>>   .../a38x/mv_ddr4_training_calibration.c       | 2336 +++++++++++++++++
>>>   .../a38x/mv_ddr4_training_calibration.h       |   26 +
>>>   .../ddr/marvell/a38x/mv_ddr4_training_db.c    |  545 ++++
>>>   .../marvell/a38x/mv_ddr4_training_leveling.c  |  441 ++++
>>>   .../marvell/a38x/mv_ddr4_training_leveling.h  |   11 +
>>>   drivers/ddr/marvell/a38x/mv_ddr_plat.c        |  249 ++
>>>   drivers/ddr/marvell/a38x/mv_ddr_plat.h        |   11 +
>>>   drivers/ddr/marvell/a38x/mv_ddr_regs.h        |   59 +
>>>   drivers/ddr/marvell/a38x/mv_ddr_topology.h    |   72 +
>>>   28 files changed, 5996 insertions(+), 13 deletions(-)
>>>   delete mode 100644 drivers/ddr/marvell/a38x/dram_if.h
>>
>> I see that you are removing some existing file. If it is not needed
>> neither for DDR3 nor for DDR4 then please remove it in separate commit
>> or patch. So we do not mix different things into one commit.
> 
> Instead of making a different commit, will it work if we list the
> files being removed in this commit message? It is part of removing
> dead code.

I agree with Pali on this. Please keep logically unrelated changes in
separate commits if possible.

Thanks,
Stefan


Best regards,
Stefan Roese

-- 
DENX Software Engineering GmbH,      Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-51 Fax: (+49)-8142-66989-80 Email: sr@denx.de

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] ddr: marvell: a38x: Add support for DDR4 from Marvell mv-ddr-marvell repository
  2023-01-18  1:18       ` Tony Dinh
@ 2023-01-18 18:30         ` Pali Rohár
  2023-01-18 21:13           ` Tony Dinh
  0 siblings, 1 reply; 8+ messages in thread
From: Pali Rohár @ 2023-01-18 18:30 UTC (permalink / raw)
  To: Tony Dinh
  Cc: U-Boot Mailing List, Stefan Roese, Marek Behún,
	Chris Packham, Jaehoon Chung, Mark Kettenis, Simon Glass,
	Michael Trimarchi, Tom Rini, Marek Behún

On Tuesday 17 January 2023 17:18:43 Tony Dinh wrote:
> Hi Pali,
> 
> On Tue, Jan 17, 2023 at 1:25 PM Pali Rohár <pali@kernel.org> wrote:
> >
> > Hello!
> >
> > On Tuesday 17 January 2023 13:02:46 Tony Dinh wrote:
> > > Hi Pali,
> > >
> > > On Tue, Jan 17, 2023 at 12:35 AM Pali Rohár <pali@kernel.org> wrote:
> > > >
> > > > Hello! Thank you for update. It is much better.
> > > >
> > > > On Monday 16 January 2023 21:34:39 Tony Dinh wrote:
> > > > >     This syncs drivers/ddr/marvell/a38x/ with the master branch of repository
> > > > >     https://github.com/MarvellEmbeddedProcessors/mv-ddr-marvell.git
> > > > >
> > > > >     up to the commit "mv_ddr: a3700: Use the right size for memset to not overflow"
> > > > >     d5acc10c287e40cc2feeb28710b92e45c93c702c
> > > > >
> > > > >     This patch was created by following steps:
> > > > >
> > > > >     1. Replace all a38x files in U-Boot tree by files from upstream github
> > > > >        Marvell mv-ddr-marvell repository.
> > > > >
> > > > >     2. Run following command to omit portions not relevant for a38x, ddr3, and ddr4:
> > > > >
> > > > >         files=drivers/ddr/marvell/a38x/*
> > > > >         sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
> > > > >         unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
> > > > >             -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
> > > > >             -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
> > > > >             -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DTRUE $files
> > > >
> > > > Do not forget to also update commit message.
> > >
> > > Yes, patman extracts and creates the patch description from the commit.
> >
> > My reaction was just because you forgot to update undef for a39x.
> 
> I see. Yes, I did miss that! I only described the changes in v2 description.
> 
> > > >
> > > > And btw, commit messages has on each line some leading spaces which is
> > > > not probably intended.
> > >
> > > That was intentional to make the commit description (and patch
> > > description) more readable. Is it not recommended?
> >
> > I'm not sure if we are talking about the same thing. When I read this
> > your patch I saw that every time, even the first one "This sync drivers/..."
> > has 4 spaces before word "This". And I'm not sure if this is just my
> > email client or not and there is some reason for it. Look at indentation
> > of line "Signed-off-by:" and line "Reference:". Should not be those two
> > lines at same indentation level? Or I did not understand it? :D
> >
> > I agree that adding indentation inside of 1. 2. 3. parts is fully
> > recommended as it makes text more readable.
> 
> When I do git log I also see an extra 4 or 8 spaces on each line:) so
> not sure what we are seeing here. But yes it seems some of the
> indentation is inconsistent. Will fix that.
> 
> >
> > > >
> > > > >     3. Manually change license to SPDX-License-Identifier
> > > > >        (upstream license in  upstream github repository contains long license
> > > > >        texts and U-Boot is using just SPDX-License-Identifier.
> > > > >
> > > > >     After applying this patch, a38x ddr3 ddr4 code in upstream Marvell github
> > > > >     repository and in U-Boot would be fully identical. So in future applying
> > > > >     above steps could be used to sync code again.
> > > > >
> > > > >     The only change in this patch are:
> > > > >     - Removal of common board_topology_map code using ifdefs in mv_ddr_brd.c
> > > > >     - Some fixes with include files.
> > > > >     - Some basic type defines (original from ATF headers) in mv_ddr_plat.c
> > > > >
> > > > >     Reference:
> > > > >     "ddr: marvell: a38x: Sync code with Marvell mv-ddr-marvell repository"
> > > > >     https://source.denx.de/u-boot/u-boot/-/commit/107c3391b95bcc2ba09a876da4fa0c31b6c1e460
> > > > >
> > > > > Signed-off-by: Tony Dinh <mibodhi@gmail.com>
> > > > > ---
> > > > >
> > > > > Changes in v2:
> > > > > > - Modified the filter script to explicitly include ARMADA_38X code
> > > > > and exclude ARMADA_39X code; also remove 64BIT code. Reran it on
> > > > > drivers/ddr/marvell/a38x/
> > > > > - Updated script
> > > > > files=drivers/ddr/marvell/a38x/*
> > > > > sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
> > > >
> > > > You do not need this sed anymore. CONFIG_ARMADA_39X is explicitly
> > > > removed and CONFIG_ARMADA_38X already handled by unifdef.
> > >
> > > Thanks, I was not sure if unifdef works in that "OR" condition. I will
> > > update the commit message.
> >
> > It should work if at least one of the option in OR condition is
> > specified with -D on command line. But if you are unsure then it is
> > better to test it (should be quite easy and fast).
> >
> > > >
> > > > > unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
> > > > >                 -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
> > > > >                 -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
> > > > >                 -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DCONFIG_ARMADA_38X -UCONFIG_ARMADA_39X \
> > > > >                                 -UCONFIG_64BIT $files
> > > > > - Remove more dead code files
> > > > > - Correct SPDX license header
> > > > >
> > > > >  drivers/ddr/marvell/a38x/Makefile             |    8 +
> > > > >  drivers/ddr/marvell/a38x/ddr3_debug.c         |  120 +
> > > > >  drivers/ddr/marvell/a38x/ddr3_init.c          |   25 +
> > > > >  drivers/ddr/marvell/a38x/ddr3_init.h          |   14 +
> > > > >  drivers/ddr/marvell/a38x/ddr3_logging_def.h   |   27 +
> > > > >  drivers/ddr/marvell/a38x/ddr3_training.c      |  131 +
> > > > >  drivers/ddr/marvell/a38x/ddr3_training_bist.c |   12 +
> > > > >  .../a38x/ddr3_training_centralization.c       |    4 +
> > > > >  drivers/ddr/marvell/a38x/ddr3_training_db.c   |  212 ++
> > > > >  drivers/ddr/marvell/a38x/ddr3_training_ip.h   |   17 +
> > > > >  .../ddr/marvell/a38x/ddr3_training_ip_db.h    |   61 +
> > > > >  .../marvell/a38x/ddr3_training_ip_engine.c    |  145 +
> > > > >  .../ddr/marvell/a38x/ddr3_training_ip_flow.h  |    5 +
> > > > >  .../ddr/marvell/a38x/ddr3_training_leveling.c |  135 +
> > > > >  drivers/ddr/marvell/a38x/dram_if.h            |   13 -
> > > > >  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c |  674 +++++
> > > > >  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h |   59 +
> > > > >  drivers/ddr/marvell/a38x/mv_ddr4_training.c   |  565 ++++
> > > > >  drivers/ddr/marvell/a38x/mv_ddr4_training.h   |   32 +
> > > > >  .../a38x/mv_ddr4_training_calibration.c       | 2336 +++++++++++++++++
> > > > >  .../a38x/mv_ddr4_training_calibration.h       |   26 +
> > > > >  .../ddr/marvell/a38x/mv_ddr4_training_db.c    |  545 ++++
> > > > >  .../marvell/a38x/mv_ddr4_training_leveling.c  |  441 ++++
> > > > >  .../marvell/a38x/mv_ddr4_training_leveling.h  |   11 +
> > > > >  drivers/ddr/marvell/a38x/mv_ddr_plat.c        |  249 ++
> > > > >  drivers/ddr/marvell/a38x/mv_ddr_plat.h        |   11 +
> > > > >  drivers/ddr/marvell/a38x/mv_ddr_regs.h        |   59 +
> > > > >  drivers/ddr/marvell/a38x/mv_ddr_topology.h    |   72 +
> > > > >  28 files changed, 5996 insertions(+), 13 deletions(-)
> > > > >  delete mode 100644 drivers/ddr/marvell/a38x/dram_if.h
> > > >
> > > > I see that you are removing some existing file. If it is not needed
> > > > neither for DDR3 nor for DDR4 then please remove it in separate commit
> > > > or patch. So we do not mix different things into one commit.
> > >
> > > Instead of making a different commit, will it work if we list the
> > > files being removed in this commit message? It is part of removing
> > > dead code.
> >
> > I do not know. I'm always trying to put different thing into different
> > commits. Reason is that if in some case it would be needed to revert
> > commit then unrelated cleanup does not need to be reverted :-)
> 
> There is only one existing file removed (dram_if.h).

It looks like this file is already unneeded, so it can be removed from
the U-Boot repository in a separate patch.

> The other are new
> files that become dead code after we run the filter script. Do you
> think those should also be kept in this patch, and then we'll have a
> cleanup patch later? The alternative is I can just list the
> new-but-deleted files in the commit description so they can be kept
> tracked of (easier for the future code sync if we know what should be
> candidates for removal).
> 
> >
> > > >
> > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c
> > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h
> > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.c
> > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.h
> > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.c
> > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.h
> > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_db.c
> > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.c
> > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.h
> > > > ...
> > > > > diff --git a/drivers/ddr/marvell/a38x/mv_ddr_plat.c b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > > > index 7c7bce73a3..16d177b42f 100644
> > > > > --- a/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > > > +++ b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > > > @@ -12,6 +12,11 @@
> > > > >  #define DDR_INTERFACES_NUM           1
> > > > >  #define DDR_INTERFACE_OCTETS_NUM     5
> > > > >
> > > > > +/* These were defined in ATF area that was stripped out */
> > > > > +#define MV_STATUS    int
> > > > > +#define MV_U32               u32
> > > > > +#define MV_U8                u8
> > > > > +
> > > >
> > > > This is something new which you added? Because I do not see it in
> > > > Marvell code.
> > >
> > > Yes, those were in the original code after the initial copying from
> > > the Marvell repo.
> > >
> > > # grep -E '(MV_U32|MV_STATUS|MV_U8)' *.[ch] a38x/*.[ch]
> > >
> > > ddr_init.c:MV_U32 ddr_init(void)
> > > mv_ddr_atf_wrapper.h:#define MV_STATUS int
> > > mv_ddr_atf_wrapper.h:#define MV_U8 u8
> > > mv_ddr_atf_wrapper.h:#define MV_U32 u32
> > > a38x/mv_ddr_plat.c:MV_STATUS mv_ddr4_calibration_validate(MV_U32 dev_num)
> > > a38x/mv_ddr_plat.c: MV_STATUS status = MV_OK;
> > > a38x/mv_ddr_plat.c: MV_U8 if_id = 0;
> > > a38x/mv_ddr_plat.c: MV_U32 read_data[MAX_INTERFACE_NUM];
> > > a38x/mv_ddr_plat.c: MV_U32 cal_n = 0, cal_p = 0;
> > >
> > > Those 3 are defined in mv_ddr_atf_wrapper.h and used in mv_ddr_plat.c
> > > (after we ran the filter script, this file is the only place that
> > > needs those 3 defines). Since we removed the ATF code, we need to
> > > define them here in mv_ddr_plat.c. Is this OK or do you have any
> > > suggestions for a better approach?
> >
> > Hmm... You found another bug in Marvell code:
> >
> > $ git grep mv_ddr4_calibration_validate
> > a38x/mv_ddr_plat.c:MV_STATUS mv_ddr4_calibration_validate(MV_U32 dev_num)
> > apn806/mv_ddr_plat.c:int mv_ddr4_calibration_validate(u32 dev_num)
> > mv_ddr4_training.c:             return mv_ddr4_calibration_validate(dev_num);
> > mv_ddr4_training.h:int mv_ddr4_calibration_validate(u32 dev_num);
> >
> > That function mv_ddr4_calibration_validate() should return int type (as
> > defined in header file) and not MV_STATUS type. So rather fix return
> > type of the function to match what is in header file. Also there is
> > > mismatch with its argument u32 vs MV_U32!
> >
> > Next, MV_U8 is used only at one place (a37xx defines it moreover locally):
> >
> > $ git grep MV_U8
> > a3700/mv_ddr_a3700_wrapper.h:#define MV_U8              u8
> > a38x/mv_ddr_plat.c:     MV_U8 if_id = 0;
> > mv_ddr_atf_wrapper.h:#define MV_U8              u8
> >
> > So replace MV_U8 directly by u8. And same for MV_U32 for a38x code.
> 
> Cool.
> 
> > And ideally, send a pull request to Marvell repo with these fixes (and
> > also with floating point), so code can be synced easily also again in
> > future.
> 
> Would you do that when you have time after DDR4 gets merged?
> 
> Thanks,
> Tony

I could, but I do not have exact changes / patches for that.
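
For reference, the type cleanup suggested above could look roughly like
the sketch below. This is not the actual Marvell function: the body is a
stand-in, and the u8/u32 typedefs, the MV_OK value and the
MAX_INTERFACE_NUM value are assumptions added only so the fragment
builds outside U-Boot. What it illustrates is the signature, which
matches the "int mv_ddr4_calibration_validate(u32 dev_num);" prototype
from mv_ddr4_training.h, with plain types in place of the MV_STATUS,
MV_U32 and MV_U8 macros.

```c
#include <stdint.h>

/* Assumption: stand-ins for U-Boot's u8/u32 so the sketch builds standalone. */
typedef uint8_t u8;
typedef uint32_t u32;

/* Assumption: MV_OK is taken to be 0 here; the real value comes from the driver headers. */
#define MV_OK 0
/* Assumption: placeholder value; the real one comes from the driver headers. */
#define MAX_INTERFACE_NUM 1

/* Prototype as shown for mv_ddr4_training.h in the git grep output. */
int mv_ddr4_calibration_validate(u32 dev_num);

/* Definition using plain types, so no MV_* type macros are required. */
int mv_ddr4_calibration_validate(u32 dev_num)
{
	int status = MV_OK;                       /* was: MV_STATUS status = MV_OK; */
	u8 if_id = 0;                             /* was: MV_U8 if_id = 0; */
	u32 read_data[MAX_INTERFACE_NUM] = { 0 }; /* was: MV_U32 read_data[...]; */
	u32 cal_n = 0, cal_p = 0;                 /* was: MV_U32 cal_n = 0, cal_p = 0; */

	/* Hypothetical body: the real function reads and checks calibration values. */
	(void)dev_num;
	(void)if_id;
	(void)read_data;
	(void)cal_n;
	(void)cal_p;

	return status;
}
```

With the definition and the header prototype agreeing on "int" and
"u32", the three #define lines added to mv_ddr_plat.c in this patch
would no longer be necessary for a38x.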

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] ddr: marvell: a38x: Add support for DDR4 from Marvell mv-ddr-marvell repository
  2023-01-18 18:30         ` Pali Rohár
@ 2023-01-18 21:13           ` Tony Dinh
  0 siblings, 0 replies; 8+ messages in thread
From: Tony Dinh @ 2023-01-18 21:13 UTC (permalink / raw)
  To: Pali Rohár
  Cc: U-Boot Mailing List, Stefan Roese, Marek Behún,
	Chris Packham, Jaehoon Chung, Mark Kettenis, Simon Glass,
	Michael Trimarchi, Tom Rini, Marek Behún

On Wed, Jan 18, 2023 at 10:30 AM Pali Rohár <pali@kernel.org> wrote:
>
> On Tuesday 17 January 2023 17:18:43 Tony Dinh wrote:
> > Hi Pali,
> >
> > On Tue, Jan 17, 2023 at 1:25 PM Pali Rohár <pali@kernel.org> wrote:
> > >
> > > Hello!
> > >
> > > On Tuesday 17 January 2023 13:02:46 Tony Dinh wrote:
> > > > Hi Pali,
> > > >
> > > > On Tue, Jan 17, 2023 at 12:35 AM Pali Rohár <pali@kernel.org> wrote:
> > > > >
> > > > > Hello! Thank you for update. It is much better.
> > > > >
> > > > > On Monday 16 January 2023 21:34:39 Tony Dinh wrote:
> > > > > >     This syncs drivers/ddr/marvell/a38x/ with the master branch of repository
> > > > > >     https://github.com/MarvellEmbeddedProcessors/mv-ddr-marvell.git
> > > > > >
> > > > > >     up to the commit "mv_ddr: a3700: Use the right size for memset to not overflow"
> > > > > >     d5acc10c287e40cc2feeb28710b92e45c93c702c
> > > > > >
> > > > > >     This patch was created by following steps:
> > > > > >
> > > > > >     1. Replace all a38x files in U-Boot tree by files from upstream github
> > > > > >        Marvell mv-ddr-marvell repository.
> > > > > >
> > > > > >     2. Run following command to omit portions not relevant for a38x, ddr3, and ddr4:
> > > > > >
> > > > > >         files=drivers/ddr/marvell/a38x/*
> > > > > >         sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
> > > > > >         unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
> > > > > >             -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
> > > > > >             -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
> > > > > >             -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DTRUE $files
> > > > >
> > > > > Do not forget to also update commit message.
> > > >
> > > > Yes, patman extracts and creates the patch description from the commit.
> > >
> > > My reaction was just because you forgot to update undef for a39x.
> >
> > I see. Yes, I did miss that! I only described the changes in v2 description.
> >
> > > > >
> > > > > And btw, commit messages has on each line some leading spaces which is
> > > > > not probably intended.
> > > >
> > > > That was intentional to make the commit description (and patch
> > > > description) more readable. Is it not recommended?
> > >
> > > I'm not sure if we are talking about the same thing. When I read this
> > > your patch I saw that every time, even the first one "This sync drivers/..."
> > > has 4 spaces before word "This". And I'm not sure if this is just my
> > > email client or not and there is some reason for it. Look at indentation
> > > of line "Signed-off-by:" and line "Reference:". Should not be those two
> > > lines at same indentation level? Or I did not understand it? :D
> > >
> > > I agree that adding indentation inside of 1. 2. 3. parts is fully
> > > recommended as it makes text more readable.
> >
> > When I do git log I also see an extra 4 or 8 spaces on each line:) so
> > not sure what we are seeing here. But yes it seems some of the
> > indentation is inconsistent. Will fix that.
> >
> > >
> > > > >
> > > > > >     3. Manually change license to SPDX-License-Identifier
> > > > > >        (upstream license in  upstream github repository contains long license
> > > > > >        texts and U-Boot is using just SPDX-License-Identifier.
> > > > > >
> > > > > >     After applying this patch, a38x ddr3 ddr4 code in upstream Marvell github
> > > > > >     repository and in U-Boot would be fully identical. So in future applying
> > > > > >     above steps could be used to sync code again.
> > > > > >
> > > > > >     The only change in this patch are:
> > > > > >     - Removal of common board_topology_map code using ifdefs in mv_ddr_brd.c
> > > > > >     - Some fixes with include files.
> > > > > >     - Some basic type defines (original from ATF headers) in mv_ddr_plat.c
> > > > > >
> > > > > >     Reference:
> > > > > >     "ddr: marvell: a38x: Sync code with Marvell mv-ddr-marvell repository"
> > > > > >     https://source.denx.de/u-boot/u-boot/-/commit/107c3391b95bcc2ba09a876da4fa0c31b6c1e460
> > > > > >
> > > > > > Signed-off-by: Tony Dinh <mibodhi@gmail.com>
> > > > > > ---
> > > > > >
> > > > > > Changes in v2:
> > > > > > > - Modified the filter script to explicitly include ARMADA_38X code
> > > > > > and exclude ARMADA_39X code; also remove 64BIT code. Reran it on
> > > > > > drivers/ddr/marvell/a38x/
> > > > > > - Updated script
> > > > > > files=drivers/ddr/marvell/a38x/*
> > > > > > sed 's/#if defined(CONFIG_ARMADA_38X) || defined(CONFIG_ARMADA_39X)/#ifdef TRUE/' -i $files
> > > > >
> > > > > You do not need this sed anymore. CONFIG_ARMADA_39X is explicitly
> > > > > removed and CONFIG_ARMADA_38X already handled by unifdef.
> > > >
> > > > Thanks, I was not sure if unifdef works in that "OR" condition. I will
> > > > update the commit message.
> > >
> > > It should work if at least one of the option in OR condition is
> > > specified with -D on command line. But if you are unsure then it is
> > > better to test it (should be quite easy and fast).
> > >
> > > > >
> > > > > > unifdef -m -UMV_DDR -UMV_DDR_ATF -UCONFIG_APN806 \
> > > > > >                 -UCONFIG_MC_STATIC -UCONFIG_MC_STATIC_PRINT -UCONFIG_PHY_STATIC \
> > > > > >                 -UCONFIG_PHY_STATIC_PRINT -UCONFIG_CUSTOMER_BOARD_SUPPORT \
> > > > > >                 -UCONFIG_A3700 -UA3900 -UA80X0 -UA70X0 -DCONFIG_ARMADA_38X -UCONFIG_ARMADA_39X \
> > > > > >                                 -UCONFIG_64BIT $files
> > > > > > - Remove more dead code files
> > > > > > - Correct SPDX license header
> > > > > >
> > > > > >  drivers/ddr/marvell/a38x/Makefile             |    8 +
> > > > > >  drivers/ddr/marvell/a38x/ddr3_debug.c         |  120 +
> > > > > >  drivers/ddr/marvell/a38x/ddr3_init.c          |   25 +
> > > > > >  drivers/ddr/marvell/a38x/ddr3_init.h          |   14 +
> > > > > >  drivers/ddr/marvell/a38x/ddr3_logging_def.h   |   27 +
> > > > > >  drivers/ddr/marvell/a38x/ddr3_training.c      |  131 +
> > > > > >  drivers/ddr/marvell/a38x/ddr3_training_bist.c |   12 +
> > > > > >  .../a38x/ddr3_training_centralization.c       |    4 +
> > > > > >  drivers/ddr/marvell/a38x/ddr3_training_db.c   |  212 ++
> > > > > >  drivers/ddr/marvell/a38x/ddr3_training_ip.h   |   17 +
> > > > > >  .../ddr/marvell/a38x/ddr3_training_ip_db.h    |   61 +
> > > > > >  .../marvell/a38x/ddr3_training_ip_engine.c    |  145 +
> > > > > >  .../ddr/marvell/a38x/ddr3_training_ip_flow.h  |    5 +
> > > > > >  .../ddr/marvell/a38x/ddr3_training_leveling.c |  135 +
> > > > > >  drivers/ddr/marvell/a38x/dram_if.h            |   13 -
> > > > > >  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c |  674 +++++
> > > > > >  drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h |   59 +
> > > > > >  drivers/ddr/marvell/a38x/mv_ddr4_training.c   |  565 ++++
> > > > > >  drivers/ddr/marvell/a38x/mv_ddr4_training.h   |   32 +
> > > > > >  .../a38x/mv_ddr4_training_calibration.c       | 2336 +++++++++++++++++
> > > > > >  .../a38x/mv_ddr4_training_calibration.h       |   26 +
> > > > > >  .../ddr/marvell/a38x/mv_ddr4_training_db.c    |  545 ++++
> > > > > >  .../marvell/a38x/mv_ddr4_training_leveling.c  |  441 ++++
> > > > > >  .../marvell/a38x/mv_ddr4_training_leveling.h  |   11 +
> > > > > >  drivers/ddr/marvell/a38x/mv_ddr_plat.c        |  249 ++
> > > > > >  drivers/ddr/marvell/a38x/mv_ddr_plat.h        |   11 +
> > > > > >  drivers/ddr/marvell/a38x/mv_ddr_regs.h        |   59 +
> > > > > >  drivers/ddr/marvell/a38x/mv_ddr_topology.h    |   72 +
> > > > > >  28 files changed, 5996 insertions(+), 13 deletions(-)
> > > > > >  delete mode 100644 drivers/ddr/marvell/a38x/dram_if.h
> > > > >
> > > > > I see that you are removing some existing file. If it is not needed
> > > > > neither for DDR3 nor for DDR4 then please remove it in separate commit
> > > > > or patch. So we do not mix different things into one commit.
> > > >
> > > > Instead of making a different commit, will it work if we list the
> > > > files being removed in this commit message? It is part of removing
> > > > dead code.
> > >
> > > I do not know. I'm always trying to put different thing into different
> > > commits. Reason is that if in some case it would be needed to revert
> > > commit then unrelated cleanup does not need to be reverted :-)
> >
> > There is only one existing file removed (dram_if.h).
>
> It looks like this file is already unneeded, so it can be removed from
> the U-Boot repository in a separate patch.

Thanks Stefan and Pali for the review. I will send in the v3 patch and
later a separate patch to remove the file.

All the best,
Tony

>
> > The others are new
> > files that become dead code after we run the filter script. Do you
> > think those should also be kept in this patch, with a cleanup patch
> > to follow later? The alternative is that I just list the
> > new-but-deleted files in the commit description so they can be kept
> > track of (a future code sync is easier if we know which files are
> > candidates for removal).
> >
> > >
> > > > >
> > > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.c
> > > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_mpr_pda_if.h
> > > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.c
> > > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training.h
> > > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.c
> > > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_calibration.h
> > > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_db.c
> > > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.c
> > > > > >  create mode 100644 drivers/ddr/marvell/a38x/mv_ddr4_training_leveling.h
> > > > > ...
> > > > > > diff --git a/drivers/ddr/marvell/a38x/mv_ddr_plat.c b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > > > > index 7c7bce73a3..16d177b42f 100644
> > > > > > --- a/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > > > > +++ b/drivers/ddr/marvell/a38x/mv_ddr_plat.c
> > > > > > @@ -12,6 +12,11 @@
> > > > > >  #define DDR_INTERFACES_NUM           1
> > > > > >  #define DDR_INTERFACE_OCTETS_NUM     5
> > > > > >
> > > > > > +/* These were defined in ATF area that was stripped out */
> > > > > > +#define MV_STATUS    int
> > > > > > +#define MV_U32               u32
> > > > > > +#define MV_U8                u8
> > > > > > +
> > > > >
> > > > > This is something new which you added? Because I do not see it in
> > > > > Marvell code.
> > > >
> > > > Yes, those were in the original code after the initial copying from
> > > > the Marvell repo.
> > > >
> > > > # grep -E '(MV_U32|MV_STATUS|MV_U8)' *.[ch] a38x/*.[ch]
> > > >
> > > > ddr_init.c:MV_U32 ddr_init(void)
> > > > mv_ddr_atf_wrapper.h:#define MV_STATUS int
> > > > mv_ddr_atf_wrapper.h:#define MV_U8 u8
> > > > mv_ddr_atf_wrapper.h:#define MV_U32 u32
> > > > a38x/mv_ddr_plat.c:MV_STATUS mv_ddr4_calibration_validate(MV_U32 dev_num)
> > > > a38x/mv_ddr_plat.c: MV_STATUS status = MV_OK;
> > > > a38x/mv_ddr_plat.c: MV_U8 if_id = 0;
> > > > a38x/mv_ddr_plat.c: MV_U32 read_data[MAX_INTERFACE_NUM];
> > > > a38x/mv_ddr_plat.c: MV_U32 cal_n = 0, cal_p = 0;
> > > >
> > > > Those 3 are defined in mv_ddr_atf_wrapper.h and used in mv_ddr_plat.c
> > > > (after we ran the filter script, this file is the only place that
> > > > needs those 3 defines). Since we removed the ATF code, we need to
> > > > define them here in mv_ddr_plat.c. Is this OK or do you have any
> > > > suggestions for a better approach?
> > >
> > > Hmm... You found another bug in Marvell code:
> > >
> > > $ git grep mv_ddr4_calibration_validate
> > > a38x/mv_ddr_plat.c:MV_STATUS mv_ddr4_calibration_validate(MV_U32 dev_num)
> > > apn806/mv_ddr_plat.c:int mv_ddr4_calibration_validate(u32 dev_num)
> > > mv_ddr4_training.c:             return mv_ddr4_calibration_validate(dev_num);
> > > mv_ddr4_training.h:int mv_ddr4_calibration_validate(u32 dev_num);
> > >
> > > The function mv_ddr4_calibration_validate() should return int (as
> > > declared in the header file), not MV_STATUS. So rather fix the return
> > > type of the function to match the header file. There is also a
> > > mismatch in its argument type: u32 vs MV_U32!
> > >
> > > Next, MV_U8 is used in only one place (a37xx additionally defines it locally):
> > >
> > > $ git grep MV_U8
> > > a3700/mv_ddr_a3700_wrapper.h:#define MV_U8              u8
> > > a38x/mv_ddr_plat.c:     MV_U8 if_id = 0;
> > > mv_ddr_atf_wrapper.h:#define MV_U8              u8
> > >
> > > So replace MV_U8 directly by u8. And same for MV_U32 for a38x code.
> >
> > Cool.
> >
> > > And ideally, send a pull request to the Marvell repo with these fixes
> > > (and also with the floating point one), so the code can easily be
> > > synced again in the future.
> >
> > Would you do that when you have time after DDR4 gets merged?
> >
> > Thanks,
> > Tony
>
> I could, but I do not have exact changes / patches for that.
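
For reference, a minimal sketch of what the fix suggested above might look
like in a38x/mv_ddr_plat.c: the definition uses int and u32 to match the
prototype in mv_ddr4_training.h, and the MV_U8/MV_U32 locals become plain
u8/u32. This is not the actual patch. The typedefs, the MV_OK and
MAX_INTERFACE_NUM values, and the empty body are assumptions made only so
the fragment compiles on its own; the real calibration checks are left out.

```c
#include <stdint.h>

/* Stand-ins for U-Boot's fixed-width types (assumption for this sketch). */
typedef uint8_t u8;
typedef uint32_t u32;

/* Hypothetical values, present only so the sketch is self-contained. */
#define MV_OK			0
#define MAX_INTERFACE_NUM	1

/* Prototype as quoted from mv_ddr4_training.h in the grep output. */
int mv_ddr4_calibration_validate(u32 dev_num);

/* Definition now matches the prototype: int/u32, not MV_STATUS/MV_U32. */
int mv_ddr4_calibration_validate(u32 dev_num)
{
	int status = MV_OK;			/* was: MV_STATUS status */
	u8 if_id = 0;				/* was: MV_U8 if_id */
	u32 read_data[MAX_INTERFACE_NUM] = { 0 };	/* was: MV_U32 */
	u32 cal_n = 0, cal_p = 0;		/* was: MV_U32 */

	/* Real calibration checks omitted; silence unused warnings. */
	(void)dev_num;
	(void)if_id;
	(void)read_data;
	(void)cal_n;
	(void)cal_p;

	return status;
}
```

With the types spelled this way, the three MV_* defines added at the top of
mv_ddr_plat.c would no longer be needed.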


end of thread, other threads:[~2023-01-18 21:13 UTC | newest]

Thread overview: 8+ messages
2023-01-17  5:34 [PATCH v2] ddr: marvell: a38x: Add support for DDR4 from Marvell mv-ddr-marvell repository Tony Dinh
2023-01-17  8:35 ` Pali Rohár
2023-01-17 21:02   ` Tony Dinh
2023-01-17 21:25     ` Pali Rohár
2023-01-18  1:18       ` Tony Dinh
2023-01-18 18:30         ` Pali Rohár
2023-01-18 21:13           ` Tony Dinh
2023-01-18 11:01     ` Stefan Roese
