All of lore.kernel.org
 help / color / mirror / Atom feed
* [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
@ 2016-03-31 15:26 Matt Redfearn
  2016-03-31 15:26 ` [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver " Matt Redfearn
  2016-04-14 12:45 ` [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings " Ulf Hansson
  0 siblings, 2 replies; 32+ messages in thread
From: Matt Redfearn @ 2016-03-31 15:26 UTC (permalink / raw)
  To: ulf.hansson
  Cc: linux-mmc, Aleksey Makarov, Chandrakala Chavva, David Daney,
	Aleksey Makarov, Leonid Rosenboim, Peter Swain, Aaron Williams,
	Matt Redfearn

From: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>

Add Device Tree binding document for Octeon MMC controller. The driver
is in a following patch.

The MMC controller can be connected to up to 4 "slots" which may be
eMMC, MMC or SD card devices. As there is a single controller, each
available slot is represented as a child node of the controller.

This is a similar concept to the atmel-mci driver.

Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Chandrakala Chavva <cchavva@caviumnetworks.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com>
Signed-off-by: Leonid Rosenboim <lrosenboim@caviumnetworks.com>
Signed-off-by: Peter Swain <pswain@cavium.com>
Signed-off-by: Aaron Williams <aaron.williams@cavium.com>
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Acked-by: Rob Herring <robh@kernel.org>
---
v7: No changes

v6:
- Split up patch
- Moved device tree fixup to platform code
---
 .../devicetree/bindings/mmc/octeon-mmc.txt         | 79 ++++++++++++++++++++++
 1 file changed, 79 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mmc/octeon-mmc.txt

diff --git a/Documentation/devicetree/bindings/mmc/octeon-mmc.txt b/Documentation/devicetree/bindings/mmc/octeon-mmc.txt
new file mode 100644
index 000000000000..d2c576d9ad65
--- /dev/null
+++ b/Documentation/devicetree/bindings/mmc/octeon-mmc.txt
@@ -0,0 +1,79 @@
+* OCTEON SD/MMC Host Controller
+
+This controller is present on some members of the Cavium OCTEON SoC
+family, provide an interface for eMMC, MMC and SD devices.  There is a
+single controller that may have several "slots" connected.  These
+slots appear as children of the main controller node.
+The DMA engine is an integral part of the controller block.
+
+1) MMC node
+
+Required properties:
+- compatible : Should be "cavium,octeon-6130-mmc" or "cavium,octeon-7890-mmc"
+- reg : Two entries:
+	1) The base address of the MMC controller register bank.
+	2) The base address of the MMC DMA engine register bank.
+- interrupts :
+	For "cavium,octeon-6130-mmc": two entries:
+	1) The MMC controller interrupt line.
+	2) The MMC DMA engine interrupt line.
+	For "cavium,octeon-7890-mmc": nine entries:
+	1) The next block transfer of a multiblock transfer has completed (BUF_DONE)
+	2) Operation completed successfully (CMD_DONE).
+	3) DMA transfer completed successfully (DMA_DONE).
+	4) Operation encountered an error (CMD_ERR).
+	5) DMA transfer encountered an error (DMA_ERR).
+	6) Switch operation completed successfully (SWITCH_DONE).
+	7) Switch operation encountered an error (SWITCH_ERR).
+	8) Internal DMA engine request completion interrupt (DONE).
+	9) Internal DMA FIFO underflow (FIFO).
+- #address-cells : Must be <1>
+- #size-cells : Must be <0>
+
+The node contains child nodes for each slot that the platform uses.
+
+Example:
+mmc@1180000002000 {
+	compatible = "cavium,octeon-6130-mmc";
+	reg = <0x11800 0x00002000 0x0 0x100>,
+		<0x11800 0x00000168 0x0 0x20>;
+	#address-cells = <1>;
+	#size-cells = <0>;
+	/* EMM irq, DMA irq */
+	interrupts = <1 19>, <0 63>;
+
+	[ child node definitions...]
+};
+
+
+2) Slot nodes
+Properties in mmc.txt apply to each slot node that the platform uses.
+
+Required properties:
+- reg : The slot number.
+
+Optional properties:
+- cavium,cmd-clk-skew : the amount of delay (in pS) past the clock edge
+	to sample the command pin.
+- cavium,dat-clk-skew : the amount of delay (in pS) past the clock edge
+	to sample the data pin.
+
+Example:
+	mmc@1180000002000 {
+		compatible = "cavium,octeon-6130-mmc";
+		reg = <0x11800 0x00002000 0x0 0x100>,
+		      <0x11800 0x00000168 0x0 0x20>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+		/* EMM irq, DMA irq */
+		interrupts = <1 19>, <0 63>;
+
+		mmc-slot@0 {
+			reg = <0>;
+			max-frequency = <20000000>;
+			bus-width = <8>;
+			vmmc-supply = <&reg_vmmc3>;
+			cd-gpios = <&gpio 9 0>;
+			wp-gpios = <&gpio 10 0>;
+		};
+	};
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-03-31 15:26 [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller Matt Redfearn
@ 2016-03-31 15:26 ` Matt Redfearn
  2016-04-19 20:46   ` Arnd Bergmann
  2016-04-14 12:45 ` [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings " Ulf Hansson
  1 sibling, 1 reply; 32+ messages in thread
From: Matt Redfearn @ 2016-03-31 15:26 UTC (permalink / raw)
  To: ulf.hansson
  Cc: linux-mmc, Aleksey Makarov, Chandrakala Chavva, David Daney,
	Aleksey Makarov, Leonid Rosenboim, Peter Swain, Aaron Williams,
	Matt Redfearn

From: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>

The OCTEON MMC controller is currently found on cn61XX and cn71XX
devices.  Device parameters are configured from device tree data.

eMMC, MMC and SD devices are supported.

Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Chandrakala Chavva <cchavva@caviumnetworks.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com>
Signed-off-by: Leonid Rosenboim <lrosenboim@caviumnetworks.com>
Signed-off-by: Peter Swain <pswain@cavium.com>
Signed-off-by: Aaron Williams <aaron.williams@cavium.com>
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
---
v7:
- Revert to legacy device tree bindings handled in driver
- Tone down the warning printed when legacy bindings in use

v6:
- Split up patch
- Moved device tree fixup to platform code

v5:
Incoroprate comments from review
http://patchwork.linux-mips.org/patch/9558/
- Use standard <bus-width> property instead of <cavium,bus-max-width>.
- Use standard <max-frequency> property instead of <spi-max-frequency>.
- Add octeon_mmc_of_parse_legacy function to deal with the above
  properties, since many devices have shipped with those properties
  embedded in firmware.
- Allow the <vmmc-supply> binding in addition to the legacy
  <gpios-power>.
- Remove the secondary driver for each slot.
- Use core gpio cd/wp handling

Tested on Rhino labs UTM8, Cavium CN7130.

For reference, the binding in the shipped devices is:
mmc: mmc@1180000002000 {
	compatible = "cavium,octeon-6130-mmc";
	reg = <0x11800 0x00002000 0x0 0x100>,
		<0x11800 0x00000168 0x0 0x20>;
	#address-cells = <1>;
	#size-cells = <0>;
	/* EMM irq, DMA irq */
	interrupts = <1 19>, <0 63>;

	/* The board only has a single MMC slot */
	mmc-slot@2 {
		compatible = "cavium,octeon-6130-mmc-slot";
		reg = <2>;
		voltage-ranges = <3300 3300>;
		spi-max-frequency = <26000000>;
		/* Power on GPIO 8, active high */
		/* power-gpios = <&gpio 8 0>; */
		power-gpios = <&gpio 8 1>;

	/*      spi-max-frequency = <52000000>; */
		/* bus width can be 1, 4 or 8 */
		cavium,bus-max-width = <8>;
	};
	mmc-slot@0 {
		compatible = "cavium,octeon-6130-mmc-slot";
		reg = <0>;
		voltage-ranges = <3300 3300>;
		spi-max-frequency = <26000000>;
		/* non-removable; */
		bus-width = <8>;
		/* bus width can be 1, 4 or 8 */
		cavium,bus-max-width = <8>;
	};
};

v3:
https://lkml.kernel.org/g/<1425567033-31236-1-git-send-email-aleksey.makarov@auriga.com>

Changes in v4:
- The sparse error discovered by Aaro Koskinen has been fixed
- Other sparse warnings have been silenced

Changes in v3:
- Rebased to v4.0-rc2
- Use gpiod_*() functions instead of legacy gpio
- Cosmetic changes

Changes in v2: All the fixes suggested by Mark Rutland were implemented:
- Device tree parsing has been fixed
- Device tree docs have been fixed
- Comment about errata workaroud has been added
---
 drivers/mmc/host/Kconfig      |   10 +
 drivers/mmc/host/Makefile     |    1 +
 drivers/mmc/host/octeon_mmc.c | 1408 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1419 insertions(+)
 create mode 100644 drivers/mmc/host/octeon_mmc.c

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 04feea8354cb..f007afb39bd3 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -331,6 +331,16 @@ config MMC_SDHCI_IPROC
 
 	  If unsure, say N.
 
+config MMC_OCTEON
+	tristate "Cavium OCTEON Multimedia Card Interface support"
+	depends on CAVIUM_OCTEON_SOC
+	help
+	  This selects Cavium OCTEON Multimedia card Interface.
+	  If you have an OCTEON board with a Multimedia Card slot,
+	  say Y or M here.
+
+	  If unsure, say N.
+
 config MMC_MOXART
 	tristate "MOXART SD/MMC Host Controller support"
 	depends on ARCH_MOXART && MMC
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index af918d261ff9..c43bd7dcaa4b 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -21,6 +21,7 @@ obj-$(CONFIG_MMC_SDHCI_SPEAR)	+= sdhci-spear.o
 obj-$(CONFIG_MMC_WBSD)		+= wbsd.o
 obj-$(CONFIG_MMC_AU1X)		+= au1xmmc.o
 obj-$(CONFIG_MMC_MTK)		+= mtk-sd.o
+obj-$(CONFIG_MMC_OCTEON)	+= octeon_mmc.o
 obj-$(CONFIG_MMC_OMAP)		+= omap.o
 obj-$(CONFIG_MMC_OMAP_HS)	+= omap_hsmmc.o
 obj-$(CONFIG_MMC_ATMELMCI)	+= atmel-mci.o
diff --git a/drivers/mmc/host/octeon_mmc.c b/drivers/mmc/host/octeon_mmc.c
new file mode 100644
index 000000000000..2535455b1070
--- /dev/null
+++ b/drivers/mmc/host/octeon_mmc.c
@@ -0,0 +1,1408 @@
+/*
+ * Driver for MMC and SSD cards for Cavium OCTEON SOCs.
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 2012-2015 Cavium Inc.
+ */
+#include <linux/platform_device.h>
+#include <linux/of_platform.h>
+#include <linux/scatterlist.h>
+#include <linux/interrupt.h>
+#include <linux/blkdev.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/delay.h>
+#include <linux/init.h>
+#include <linux/clk.h>
+#include <linux/err.h>
+#include <linux/io.h>
+#include <linux/of.h>
+
+#include <linux/mmc/card.h>
+#include <linux/mmc/host.h>
+#include <linux/mmc/mmc.h>
+#include <linux/mmc/sd.h>
+#include <linux/mmc/slot-gpio.h>
+#include <net/irda/parameters.h>
+#include <linux/gpio/consumer.h>
+#include <linux/regulator/consumer.h>
+
+#include <asm/byteorder.h>
+#include <asm/octeon/octeon.h>
+#include <asm/octeon/cvmx-mio-defs.h>
+
+#define DRV_NAME	"octeon_mmc"
+
+#define OCTEON_MAX_MMC			4
+
+#define OCT_MIO_NDF_DMA_CFG		0x00
+#define OCT_MIO_EMM_DMA_ADR		0x08
+
+#define OCT_MIO_EMM_CFG			0x00
+#define OCT_MIO_EMM_SWITCH		0x48
+#define OCT_MIO_EMM_DMA			0x50
+#define OCT_MIO_EMM_CMD			0x58
+#define OCT_MIO_EMM_RSP_STS		0x60
+#define OCT_MIO_EMM_RSP_LO		0x68
+#define OCT_MIO_EMM_RSP_HI		0x70
+#define OCT_MIO_EMM_INT			0x78
+#define OCT_MIO_EMM_INT_EN		0x80
+#define OCT_MIO_EMM_WDOG		0x88
+#define OCT_MIO_EMM_SAMPLE		0x90
+#define OCT_MIO_EMM_STS_MASK		0x98
+#define OCT_MIO_EMM_RCA			0xa0
+#define OCT_MIO_EMM_BUF_IDX		0xe0
+#define OCT_MIO_EMM_BUF_DAT		0xe8
+
+#define CVMX_MIO_BOOT_CTL CVMX_ADD_IO_SEG(0x00011800000000D0ull)
+
+struct octeon_mmc_host {
+	u64	base;
+	u64	ndf_base;
+	u64	emm_cfg;
+	u64	n_minus_one;  /* OCTEON II workaround location */
+	int	last_slot;
+
+	struct semaphore mmc_serializer;
+	struct mmc_request	*current_req;
+	unsigned int		linear_buf_size;
+	void			*linear_buf;
+	struct sg_mapping_iter smi;
+	int sg_idx;
+	bool dma_active;
+
+	struct platform_device	*pdev;
+	struct gpio_desc *global_pwr_gpiod;
+	bool dma_err_pending;
+	bool need_bootbus_lock;
+	bool big_dma_addr;
+	bool need_irq_handler_lock;
+	spinlock_t irq_handler_lock;
+
+	struct octeon_mmc_slot	*slot[OCTEON_MAX_MMC];
+};
+
+struct octeon_mmc_slot {
+	struct mmc_host         *mmc;	/* slot-level mmc_core object */
+	struct octeon_mmc_host	*host;	/* common hw for all 4 slots */
+
+	unsigned int		clock;
+	unsigned int		sclock;
+
+	u64			cached_switch;
+	u64			cached_rca;
+
+	unsigned int		cmd_cnt; /* sample delay */
+	unsigned int		dat_cnt; /* sample delay */
+
+	int			bus_id;
+
+	/* Legacy property - in future mmc->supply.vmmc should be used */
+	struct gpio_desc	*pwr_gpiod;
+};
+
+static int bb_size = 1 << 18;
+module_param(bb_size, int, S_IRUGO);
+MODULE_PARM_DESC(bb_size,
+		 "Size of DMA linearizing buffer (max transfer size).");
+
+static int ddr = 2;
+module_param(ddr, int, S_IRUGO);
+MODULE_PARM_DESC(ddr,
+		 "enable DoubleDataRate clocking: 0=no, 1=always, 2=at spi-max-frequency/2");
+
+#if 0
+#define octeon_mmc_dbg trace_printk
+#else
+static inline void octeon_mmc_dbg(const char *s, ...) { }
+#endif
+
+static void octeon_mmc_acquire_bus(struct octeon_mmc_host *host)
+{
+	if (host->need_bootbus_lock) {
+		down(&octeon_bootbus_sem);
+		/* On cn70XX switch the mmc unit onto the bus. */
+		if (OCTEON_IS_MODEL(OCTEON_CN70XX))
+			cvmx_write_csr(CVMX_MIO_BOOT_CTL, 0);
+	} else {
+		down(&host->mmc_serializer);
+	}
+}
+
+static void octeon_mmc_release_bus(struct octeon_mmc_host *host)
+{
+	if (host->need_bootbus_lock)
+		up(&octeon_bootbus_sem);
+	else
+		up(&host->mmc_serializer);
+}
+
+struct octeon_mmc_cr_type {
+	u8 ctype;
+	u8 rtype;
+};
+
+/*
+ * The OCTEON MMC host hardware assumes that all commands have fixed
+ * command and response types.  These are correct if MMC devices are
+ * being used.  However, non-MMC devices like SD use command and
+ * response types that are unexpected by the host hardware.
+ *
+ * The command and response types can be overridden by supplying an
+ * XOR value that is applied to the type.  We calculate the XOR value
+ * from the values in this table and the flags passed from the MMC
+ * core.
+ */
+static struct octeon_mmc_cr_type octeon_mmc_cr_types[] = {
+	{0, 0},		/* CMD0 */
+	{0, 3},		/* CMD1 */
+	{0, 2},		/* CMD2 */
+	{0, 1},		/* CMD3 */
+	{0, 0},		/* CMD4 */
+	{0, 1},		/* CMD5 */
+	{0, 1},		/* CMD6 */
+	{0, 1},		/* CMD7 */
+	{1, 1},		/* CMD8 */
+	{0, 2},		/* CMD9 */
+	{0, 2},		/* CMD10 */
+	{1, 1},		/* CMD11 */
+	{0, 1},		/* CMD12 */
+	{0, 1},		/* CMD13 */
+	{1, 1},		/* CMD14 */
+	{0, 0},		/* CMD15 */
+	{0, 1},		/* CMD16 */
+	{1, 1},		/* CMD17 */
+	{1, 1},		/* CMD18 */
+	{3, 1},		/* CMD19 */
+	{2, 1},		/* CMD20 */
+	{0, 0},		/* CMD21 */
+	{0, 0},		/* CMD22 */
+	{0, 1},		/* CMD23 */
+	{2, 1},		/* CMD24 */
+	{2, 1},		/* CMD25 */
+	{2, 1},		/* CMD26 */
+	{2, 1},		/* CMD27 */
+	{0, 1},		/* CMD28 */
+	{0, 1},		/* CMD29 */
+	{1, 1},		/* CMD30 */
+	{1, 1},		/* CMD31 */
+	{0, 0},		/* CMD32 */
+	{0, 0},		/* CMD33 */
+	{0, 0},		/* CMD34 */
+	{0, 1},		/* CMD35 */
+	{0, 1},		/* CMD36 */
+	{0, 0},		/* CMD37 */
+	{0, 1},		/* CMD38 */
+	{0, 4},		/* CMD39 */
+	{0, 5},		/* CMD40 */
+	{0, 0},		/* CMD41 */
+	{2, 1},		/* CMD42 */
+	{0, 0},		/* CMD43 */
+	{0, 0},		/* CMD44 */
+	{0, 0},		/* CMD45 */
+	{0, 0},		/* CMD46 */
+	{0, 0},		/* CMD47 */
+	{0, 0},		/* CMD48 */
+	{0, 0},		/* CMD49 */
+	{0, 0},		/* CMD50 */
+	{0, 0},		/* CMD51 */
+	{0, 0},		/* CMD52 */
+	{0, 0},		/* CMD53 */
+	{0, 0},		/* CMD54 */
+	{0, 1},		/* CMD55 */
+	{0xff, 0xff},	/* CMD56 */
+	{0, 0},		/* CMD57 */
+	{0, 0},		/* CMD58 */
+	{0, 0},		/* CMD59 */
+	{0, 0},		/* CMD60 */
+	{0, 0},		/* CMD61 */
+	{0, 0},		/* CMD62 */
+	{0, 0}		/* CMD63 */
+};
+
+struct octeon_mmc_cr_mods {
+	u8 ctype_xor;
+	u8 rtype_xor;
+};
+
+/*
+ * The functions below are used for the EMMC-17978 workaround.
+ *
+ * Due to an imperfection in the design of the MMC bus hardware,
+ * the 2nd to last cache block of a DMA read must be locked into the L2 Cache.
+ * Otherwise, data corruption may occur.
+ */
+
+static inline void *phys_to_ptr(u64 address)
+{
+	return (void *)(address | (1ull<<63)); /* XKPHYS */
+}
+
+/**
+ * Lock a single line into L2. The line is zeroed before locking
+ * to make sure no dram accesses are made.
+ *
+ * @addr   Physical address to lock
+ */
+static void l2c_lock_line(u64 addr)
+{
+	char *addr_ptr = phys_to_ptr(addr);
+
+	asm volatile (
+		"cache 31, %[line]"	/* Unlock the line */
+		:: [line] "m" (*addr_ptr));
+}
+
+/**
+ * Locks a memory region in the L2 cache
+ *
+ * @start - start address to begin locking
+ * @len - length in bytes to lock
+ */
+static void l2c_lock_mem_region(u64 start, u64 len)
+{
+	u64 end;
+
+	/* Round start/end to cache line boundaries */
+	end = ALIGN(start + len - 1, CVMX_CACHE_LINE_SIZE);
+	start = ALIGN(start, CVMX_CACHE_LINE_SIZE);
+
+	while (start <= end) {
+		l2c_lock_line(start);
+		start += CVMX_CACHE_LINE_SIZE;
+	}
+	asm volatile("sync");
+}
+
+/**
+ * Unlock a single line in the L2 cache.
+ *
+ * @addr	Physical address to unlock
+ *
+ * Return Zero on success
+ */
+static void l2c_unlock_line(u64 addr)
+{
+	char *addr_ptr = phys_to_ptr(addr);
+
+	asm volatile (
+		"cache 23, %[line]"	/* Unlock the line */
+		:: [line] "m" (*addr_ptr));
+}
+
+/**
+ * Unlock a memory region in the L2 cache
+ *
+ * @start - start address to unlock
+ * @len - length to unlock in bytes
+ */
+static void l2c_unlock_mem_region(u64 start, u64 len)
+{
+	u64 end;
+
+	/* Round start/end to cache line boundaries */
+	end = ALIGN(start + len - 1, CVMX_CACHE_LINE_SIZE);
+	start = ALIGN(start, CVMX_CACHE_LINE_SIZE);
+
+	while (start <= end) {
+		l2c_unlock_line(start);
+		start += CVMX_CACHE_LINE_SIZE;
+	}
+}
+
+static struct octeon_mmc_cr_mods octeon_mmc_get_cr_mods(struct mmc_command *cmd)
+{
+	struct octeon_mmc_cr_type *cr;
+	u8 desired_ctype, hardware_ctype;
+	u8 desired_rtype, hardware_rtype;
+	struct octeon_mmc_cr_mods r;
+
+	desired_ctype = desired_rtype = 0;
+
+	cr = octeon_mmc_cr_types + (cmd->opcode & 0x3f);
+	hardware_ctype = cr->ctype;
+	hardware_rtype = cr->rtype;
+	if (cmd->opcode == MMC_GEN_CMD)
+		hardware_ctype = (cmd->arg & 1) ? 1 : 2;
+
+	switch (mmc_cmd_type(cmd)) {
+	case MMC_CMD_ADTC:
+		desired_ctype = (cmd->data->flags & MMC_DATA_WRITE) ? 2 : 1;
+		break;
+	case MMC_CMD_AC:
+	case MMC_CMD_BC:
+	case MMC_CMD_BCR:
+		desired_ctype = 0;
+		break;
+	}
+
+	switch (mmc_resp_type(cmd)) {
+	case MMC_RSP_NONE:
+		desired_rtype = 0;
+		break;
+	case MMC_RSP_R1:/* MMC_RSP_R5, MMC_RSP_R6, MMC_RSP_R7 */
+	case MMC_RSP_R1B:
+		desired_rtype = 1;
+		break;
+	case MMC_RSP_R2:
+		desired_rtype = 2;
+		break;
+	case MMC_RSP_R3: /* MMC_RSP_R4 */
+		desired_rtype = 3;
+		break;
+	}
+	r.ctype_xor = desired_ctype ^ hardware_ctype;
+	r.rtype_xor = desired_rtype ^ hardware_rtype;
+	return r;
+}
+
+static bool octeon_mmc_switch_val_changed(struct octeon_mmc_slot *slot,
+					  u64 new_val)
+{
+	/* Match BUS_ID, HS_TIMING, BUS_WIDTH, POWER_CLASS, CLK_HI, CLK_LO */
+	u64 m = 0x3001070fffffffffull;
+
+	return (slot->cached_switch & m) != (new_val & m);
+}
+
+static unsigned int octeon_mmc_timeout_to_wdog(struct octeon_mmc_slot *slot,
+					       unsigned int ns)
+{
+	u64 bt = (u64)slot->clock * (u64)ns;
+
+	return (unsigned int)(bt / 1000000000);
+}
+
+static irqreturn_t octeon_mmc_interrupt(int irq, void *dev_id)
+{
+	struct octeon_mmc_host *host = dev_id;
+	union cvmx_mio_emm_int emm_int;
+	struct mmc_request	*req;
+	bool host_done;
+	union cvmx_mio_emm_rsp_sts rsp_sts;
+	unsigned long flags = 0;
+
+	if (host->need_irq_handler_lock)
+		spin_lock_irqsave(&host->irq_handler_lock, flags);
+	else
+		__acquire(&host->irq_handler_lock);
+	emm_int.u64 = cvmx_read_csr(host->base + OCT_MIO_EMM_INT);
+	req = host->current_req;
+	cvmx_write_csr(host->base + OCT_MIO_EMM_INT, emm_int.u64);
+
+	octeon_mmc_dbg("Got interrupt: EMM_INT = 0x%llx\n", emm_int.u64);
+
+	if (!req)
+		goto out;
+
+	rsp_sts.u64 = cvmx_read_csr(host->base + OCT_MIO_EMM_RSP_STS);
+	octeon_mmc_dbg("octeon_mmc_interrupt  MIO_EMM_RSP_STS 0x%llx\n",
+		rsp_sts.u64);
+
+	if (host->dma_err_pending) {
+		host->current_req = NULL;
+		host->dma_err_pending = false;
+		req->done(req);
+		host_done = true;
+		goto no_req_done;
+	}
+
+	if (!host->dma_active && emm_int.s.buf_done && req->data) {
+		unsigned int type = (rsp_sts.u64 >> 7) & 3;
+
+		if (type == 1) {
+			/* Read */
+			int dbuf = rsp_sts.s.dbuf;
+			struct sg_mapping_iter *smi = &host->smi;
+			unsigned int data_len =
+				req->data->blksz * req->data->blocks;
+			unsigned int bytes_xfered;
+			u64 dat = 0;
+			int shift = -1;
+
+			/* Auto inc from offset zero */
+			cvmx_write_csr(host->base + OCT_MIO_EMM_BUF_IDX,
+				(u64)(0x10000 | (dbuf << 6)));
+
+			for (bytes_xfered = 0; bytes_xfered < data_len;) {
+				if (smi->consumed >= smi->length) {
+					if (!sg_miter_next(smi))
+						break;
+					smi->consumed = 0;
+				}
+				if (shift < 0) {
+					dat = cvmx_read_csr(host->base +
+						OCT_MIO_EMM_BUF_DAT);
+					shift = 56;
+				}
+
+				while (smi->consumed < smi->length &&
+					shift >= 0) {
+					((u8 *)(smi->addr))[smi->consumed] =
+						(dat >> shift) & 0xff;
+					bytes_xfered++;
+					smi->consumed++;
+					shift -= 8;
+				}
+			}
+			sg_miter_stop(smi);
+			req->data->bytes_xfered = bytes_xfered;
+			req->data->error = 0;
+		} else if (type == 2) {
+			/* write */
+			req->data->bytes_xfered = req->data->blksz *
+				req->data->blocks;
+			req->data->error = 0;
+		}
+	}
+	host_done = emm_int.s.cmd_done || emm_int.s.dma_done ||
+		emm_int.s.cmd_err || emm_int.s.dma_err;
+	if (host_done && req->done) {
+		if (rsp_sts.s.rsp_bad_sts ||
+		    rsp_sts.s.rsp_crc_err ||
+		    rsp_sts.s.rsp_timeout ||
+		    rsp_sts.s.blk_crc_err ||
+		    rsp_sts.s.blk_timeout ||
+		    rsp_sts.s.dbuf_err) {
+			req->cmd->error = -EILSEQ;
+		} else {
+			req->cmd->error = 0;
+		}
+
+		if (host->dma_active && req->data) {
+			req->data->error = 0;
+			req->data->bytes_xfered = req->data->blocks *
+				req->data->blksz;
+			if (!(req->data->flags & MMC_DATA_WRITE) &&
+				req->data->sg_len > 1) {
+				size_t r = sg_copy_from_buffer(req->data->sg,
+					req->data->sg_len, host->linear_buf,
+					req->data->bytes_xfered);
+				WARN_ON(r != req->data->bytes_xfered);
+			}
+		}
+		if (rsp_sts.s.rsp_val) {
+			u64 rsp_hi;
+			u64 rsp_lo = cvmx_read_csr(
+				host->base + OCT_MIO_EMM_RSP_LO);
+
+			switch (rsp_sts.s.rsp_type) {
+			case 1:
+			case 3:
+				req->cmd->resp[0] = (rsp_lo >> 8) & 0xffffffff;
+				req->cmd->resp[1] = 0;
+				req->cmd->resp[2] = 0;
+				req->cmd->resp[3] = 0;
+				break;
+			case 2:
+				req->cmd->resp[3] = rsp_lo & 0xffffffff;
+				req->cmd->resp[2] = (rsp_lo >> 32) & 0xffffffff;
+				rsp_hi = cvmx_read_csr(host->base +
+					OCT_MIO_EMM_RSP_HI);
+				req->cmd->resp[1] = rsp_hi & 0xffffffff;
+				req->cmd->resp[0] = (rsp_hi >> 32) & 0xffffffff;
+				break;
+			default:
+				octeon_mmc_dbg("octeon_mmc_interrupt unhandled rsp_val %d\n",
+					       rsp_sts.s.rsp_type);
+				break;
+			}
+			octeon_mmc_dbg("octeon_mmc_interrupt  resp %08x %08x %08x %08x\n",
+				       req->cmd->resp[0], req->cmd->resp[1],
+				       req->cmd->resp[2], req->cmd->resp[3]);
+		}
+		if (emm_int.s.dma_err && rsp_sts.s.dma_pend) {
+			/* Try to clean up failed DMA */
+			union cvmx_mio_emm_dma emm_dma;
+
+			emm_dma.u64 =
+				cvmx_read_csr(host->base + OCT_MIO_EMM_DMA);
+			emm_dma.s.dma_val = 1;
+			emm_dma.s.dat_null = 1;
+			emm_dma.s.bus_id = rsp_sts.s.bus_id;
+			cvmx_write_csr(host->base + OCT_MIO_EMM_DMA,
+				       emm_dma.u64);
+			host->dma_err_pending = true;
+			host_done = false;
+			goto no_req_done;
+		}
+
+		host->current_req = NULL;
+		req->done(req);
+	}
+no_req_done:
+	if (host->n_minus_one) {
+		l2c_unlock_mem_region(host->n_minus_one, 512);
+		host->n_minus_one = 0;
+	}
+	if (host_done)
+		octeon_mmc_release_bus(host);
+out:
+	if (host->need_irq_handler_lock)
+		spin_unlock_irqrestore(&host->irq_handler_lock, flags);
+	else
+		__release(&host->irq_handler_lock);
+	return IRQ_RETVAL(emm_int.u64 != 0);
+}
+
+static void octeon_mmc_switch_to(struct octeon_mmc_slot	*slot)
+{
+	struct octeon_mmc_host	*host = slot->host;
+	struct octeon_mmc_slot	*old_slot;
+	union cvmx_mio_emm_switch sw;
+	union cvmx_mio_emm_sample samp;
+
+	if (slot->bus_id == host->last_slot)
+		goto out;
+
+	if (host->last_slot >= 0 && host->slot[host->last_slot]) {
+		old_slot = host->slot[host->last_slot];
+		old_slot->cached_switch =
+			cvmx_read_csr(host->base + OCT_MIO_EMM_SWITCH);
+		old_slot->cached_rca =
+			cvmx_read_csr(host->base + OCT_MIO_EMM_RCA);
+	}
+	cvmx_write_csr(host->base + OCT_MIO_EMM_RCA, slot->cached_rca);
+	sw.u64 = slot->cached_switch;
+	sw.s.bus_id = 0;
+	cvmx_write_csr(host->base + OCT_MIO_EMM_SWITCH, sw.u64);
+	sw.s.bus_id = slot->bus_id;
+	cvmx_write_csr(host->base + OCT_MIO_EMM_SWITCH, sw.u64);
+
+	samp.u64 = 0;
+	samp.s.cmd_cnt = slot->cmd_cnt;
+	samp.s.dat_cnt = slot->dat_cnt;
+	cvmx_write_csr(host->base + OCT_MIO_EMM_SAMPLE, samp.u64);
+out:
+	host->last_slot = slot->bus_id;
+}
+
+static void octeon_mmc_dma_request(struct mmc_host *mmc,
+				   struct mmc_request *mrq)
+{
+	struct octeon_mmc_slot	*slot;
+	struct octeon_mmc_host	*host;
+	struct mmc_command *cmd;
+	struct mmc_data *data;
+	union cvmx_mio_emm_int emm_int;
+	union cvmx_mio_emm_dma emm_dma;
+	union cvmx_mio_ndf_dma_cfg dma_cfg;
+
+	cmd = mrq->cmd;
+	if (mrq->data == NULL || mrq->data->sg == NULL || !mrq->data->sg_len ||
+	    mrq->stop == NULL || mrq->stop->opcode != MMC_STOP_TRANSMISSION) {
+		dev_err(&mmc->card->dev,
+			"Error: octeon_mmc_dma_request no data\n");
+		cmd->error = -EINVAL;
+		if (mrq->done)
+			mrq->done(mrq);
+		return;
+	}
+
+	slot = mmc_priv(mmc);
+	host = slot->host;
+
+	/* Only a single user of the bootbus at a time. */
+	octeon_mmc_acquire_bus(host);
+
+	octeon_mmc_switch_to(slot);
+
+	data = mrq->data;
+
+	if (data->timeout_ns) {
+		cvmx_write_csr(host->base + OCT_MIO_EMM_WDOG,
+			octeon_mmc_timeout_to_wdog(slot, data->timeout_ns));
+		octeon_mmc_dbg("OCT_MIO_EMM_WDOG %llu\n",
+			cvmx_read_csr(host->base + OCT_MIO_EMM_WDOG));
+	}
+
+	WARN_ON(host->current_req);
+	host->current_req = mrq;
+
+	host->sg_idx = 0;
+
+	WARN_ON(data->blksz * data->blocks > host->linear_buf_size);
+
+	if ((data->flags & MMC_DATA_WRITE) && data->sg_len > 1) {
+		size_t r = sg_copy_to_buffer(data->sg, data->sg_len,
+			 host->linear_buf, data->blksz * data->blocks);
+		WARN_ON(data->blksz * data->blocks != r);
+	}
+
+	dma_cfg.u64 = 0;
+	dma_cfg.s.en = 1;
+	dma_cfg.s.rw = (data->flags & MMC_DATA_WRITE) ? 1 : 0;
+#ifdef __LITTLE_ENDIAN
+	dma_cfg.s.endian = 1;
+#endif
+	dma_cfg.s.size = ((data->blksz * data->blocks) / 8) - 1;
+	if (!host->big_dma_addr) {
+		if (data->sg_len > 1)
+			dma_cfg.s.adr = virt_to_phys(host->linear_buf);
+		else
+			dma_cfg.s.adr = sg_phys(data->sg);
+	}
+	cvmx_write_csr(host->ndf_base + OCT_MIO_NDF_DMA_CFG, dma_cfg.u64);
+	octeon_mmc_dbg("MIO_NDF_DMA_CFG: %016llx\n",
+		(unsigned long long)dma_cfg.u64);
+	if (host->big_dma_addr) {
+		u64 addr;
+
+		if (data->sg_len > 1)
+			addr = virt_to_phys(host->linear_buf);
+		else
+			addr = sg_phys(data->sg);
+		cvmx_write_csr(host->ndf_base + OCT_MIO_EMM_DMA_ADR, addr);
+		octeon_mmc_dbg("MIO_EMM_DMA_ADR: %016llx\n",
+			(unsigned long long)addr);
+	}
+
+	emm_dma.u64 = 0;
+	emm_dma.s.bus_id = slot->bus_id;
+	emm_dma.s.dma_val = 1;
+	emm_dma.s.sector = mmc_card_blockaddr(mmc->card) ? 1 : 0;
+	emm_dma.s.rw = (data->flags & MMC_DATA_WRITE) ? 1 : 0;
+	if (mmc_card_mmc(mmc->card) ||
+	    (mmc_card_sd(mmc->card) &&
+		(mmc->card->scr.cmds & SD_SCR_CMD23_SUPPORT)))
+		emm_dma.s.multi = 1;
+	emm_dma.s.block_cnt = data->blocks;
+	emm_dma.s.card_addr = cmd->arg;
+
+	emm_int.u64 = 0;
+	emm_int.s.dma_done = 1;
+	emm_int.s.cmd_err = 1;
+	emm_int.s.dma_err = 1;
+	/* Clear the bit. */
+	cvmx_write_csr(host->base + OCT_MIO_EMM_INT, emm_int.u64);
+	cvmx_write_csr(host->base + OCT_MIO_EMM_INT_EN, emm_int.u64);
+	host->dma_active = true;
+
+	if ((OCTEON_IS_MODEL(OCTEON_CN6XXX) ||
+		OCTEON_IS_MODEL(OCTEON_CNF7XXX)) &&
+	    cmd->opcode == MMC_WRITE_MULTIPLE_BLOCK &&
+	    (data->blksz * data->blocks) > 1024) {
+		host->n_minus_one = dma_cfg.s.adr +
+			(data->blksz * data->blocks) - 1024;
+		l2c_lock_mem_region(host->n_minus_one, 512);
+	}
+
+	if (mmc->card && mmc_card_sd(mmc->card))
+		cvmx_write_csr(host->base + OCT_MIO_EMM_STS_MASK,
+			0x00b00000ull);
+	else
+		cvmx_write_csr(host->base + OCT_MIO_EMM_STS_MASK,
+			0xe4f90080ull);
+	cvmx_write_csr(host->base + OCT_MIO_EMM_DMA, emm_dma.u64);
+	octeon_mmc_dbg("MIO_EMM_DMA: %llx\n", emm_dma.u64);
+}
+
+static void octeon_mmc_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	struct octeon_mmc_slot	*slot;
+	struct octeon_mmc_host	*host;
+	struct mmc_command *cmd;
+	union cvmx_mio_emm_int emm_int;
+	union cvmx_mio_emm_cmd emm_cmd;
+	struct octeon_mmc_cr_mods mods;
+
+	cmd = mrq->cmd;
+
+	if (cmd->opcode == MMC_READ_MULTIPLE_BLOCK ||
+		cmd->opcode == MMC_WRITE_MULTIPLE_BLOCK) {
+		octeon_mmc_dma_request(mmc, mrq);
+		return;
+	}
+
+	mods = octeon_mmc_get_cr_mods(cmd);
+
+	slot = mmc_priv(mmc);
+	host = slot->host;
+
+	/* Only a single user of the bootbus at a time. */
+	octeon_mmc_acquire_bus(host);
+
+	octeon_mmc_switch_to(slot);
+
+	WARN_ON(host->current_req);
+	host->current_req = mrq;
+
+	emm_int.u64 = 0;
+	emm_int.s.cmd_done = 1;
+	emm_int.s.cmd_err = 1;
+	if (cmd->data) {
+		octeon_mmc_dbg("command has data\n");
+		if (cmd->data->flags & MMC_DATA_READ) {
+			sg_miter_start(&host->smi, mrq->data->sg,
+				       mrq->data->sg_len,
+				       SG_MITER_ATOMIC | SG_MITER_TO_SG);
+		} else {
+			struct sg_mapping_iter *smi = &host->smi;
+			unsigned int data_len =
+				mrq->data->blksz * mrq->data->blocks;
+			unsigned int bytes_xfered;
+			u64 dat = 0;
+			int shift = 56;
+			/*
+			 * Copy data to the xmit buffer before
+			 * issuing the command
+			 */
+			sg_miter_start(smi, mrq->data->sg,
+				       mrq->data->sg_len, SG_MITER_FROM_SG);
+			/* Auto inc from offset zero, dbuf zero */
+			cvmx_write_csr(host->base + OCT_MIO_EMM_BUF_IDX,
+					0x10000ull);
+
+			for (bytes_xfered = 0; bytes_xfered < data_len;) {
+				if (smi->consumed >= smi->length) {
+					if (!sg_miter_next(smi))
+						break;
+					smi->consumed = 0;
+				}
+
+				while (smi->consumed < smi->length &&
+					shift >= 0) {
+
+					dat |= (u64)(((u8 *)(smi->addr))
+						[smi->consumed]) << shift;
+					bytes_xfered++;
+					smi->consumed++;
+					shift -= 8;
+				}
+				if (shift < 0) {
+					cvmx_write_csr(host->base +
+						OCT_MIO_EMM_BUF_DAT, dat);
+					shift = 56;
+					dat = 0;
+				}
+			}
+			sg_miter_stop(smi);
+		}
+		if (cmd->data->timeout_ns) {
+			cvmx_write_csr(host->base + OCT_MIO_EMM_WDOG,
+				octeon_mmc_timeout_to_wdog(slot,
+					cmd->data->timeout_ns));
+			octeon_mmc_dbg("OCT_MIO_EMM_WDOG %llu\n",
+				       cvmx_read_csr(host->base +
+						OCT_MIO_EMM_WDOG));
+		}
+	} else {
+		cvmx_write_csr(host->base + OCT_MIO_EMM_WDOG,
+			       ((u64)slot->clock * 850ull) / 1000ull);
+		octeon_mmc_dbg("OCT_MIO_EMM_WDOG %llu\n",
+			       cvmx_read_csr(host->base + OCT_MIO_EMM_WDOG));
+	}
+	/* Clear the bit. */
+	cvmx_write_csr(host->base + OCT_MIO_EMM_INT, emm_int.u64);
+	cvmx_write_csr(host->base + OCT_MIO_EMM_INT_EN, emm_int.u64);
+	host->dma_active = false;
+
+	emm_cmd.u64 = 0;
+	emm_cmd.s.cmd_val = 1;
+	emm_cmd.s.ctype_xor = mods.ctype_xor;
+	emm_cmd.s.rtype_xor = mods.rtype_xor;
+	if (mmc_cmd_type(cmd) == MMC_CMD_ADTC)
+		emm_cmd.s.offset = 64 -
+			((cmd->data->blksz * cmd->data->blocks) / 8);
+	emm_cmd.s.bus_id = slot->bus_id;
+	emm_cmd.s.cmd_idx = cmd->opcode;
+	emm_cmd.s.arg = cmd->arg;
+	cvmx_write_csr(host->base + OCT_MIO_EMM_STS_MASK, 0);
+	cvmx_write_csr(host->base + OCT_MIO_EMM_CMD, emm_cmd.u64);
+	octeon_mmc_dbg("MIO_EMM_CMD: %llx\n", emm_cmd.u64);
+}
+
+static void octeon_mmc_reset_bus(struct octeon_mmc_slot *slot)
+{
+	union cvmx_mio_emm_cfg emm_cfg;
+	union cvmx_mio_emm_switch emm_switch;
+	u64 wdog = 0;
+
+	emm_cfg.u64 = cvmx_read_csr(slot->host->base + OCT_MIO_EMM_CFG);
+	emm_switch.u64 = cvmx_read_csr(slot->host->base + OCT_MIO_EMM_SWITCH);
+	wdog = cvmx_read_csr(slot->host->base + OCT_MIO_EMM_WDOG);
+
+	emm_switch.s.switch_exe = 0;
+	emm_switch.s.switch_err0 = 0;
+	emm_switch.s.switch_err1 = 0;
+	emm_switch.s.switch_err2 = 0;
+	emm_switch.s.bus_id = 0;
+	cvmx_write_csr(slot->host->base + OCT_MIO_EMM_SWITCH, emm_switch.u64);
+	emm_switch.s.bus_id = slot->bus_id;
+	cvmx_write_csr(slot->host->base + OCT_MIO_EMM_SWITCH, emm_switch.u64);
+
+	slot->cached_switch = emm_switch.u64;
+
+	msleep(20);
+
+	cvmx_write_csr(slot->host->base + OCT_MIO_EMM_WDOG, wdog);
+}
+
+static void octeon_mmc_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
+{
+	struct octeon_mmc_slot	*slot;
+	struct octeon_mmc_host	*host;
+	int bus_width;
+	int clock;
+	bool ddr_clock;
+	int hs_timing;
+	int power_class = 10;
+	int clk_period;
+	int timeout = 2000;
+	union cvmx_mio_emm_switch emm_switch;
+	union cvmx_mio_emm_rsp_sts emm_sts;
+
+	slot = mmc_priv(mmc);
+	host = slot->host;
+
+	/* Only a single user of the bootbus at a time. */
+	octeon_mmc_acquire_bus(host);
+
+	octeon_mmc_switch_to(slot);
+
+	octeon_mmc_dbg("Calling set_ios: slot: clk = 0x%x, bus_width = %d\n",
+		       slot->clock, (mmc->caps & MMC_CAP_8_BIT_DATA) ? 8 :
+		       (mmc->caps & MMC_CAP_4_BIT_DATA) ? 4 : 1);
+	octeon_mmc_dbg("Calling set_ios: ios: clk = 0x%x, vdd = %u, bus_width = %u, power_mode = %u, timing = %u\n",
+		       ios->clock, ios->vdd, ios->bus_width, ios->power_mode,
+		       ios->timing);
+	octeon_mmc_dbg("Calling set_ios: mmc: caps = 0x%x, bus_width = %d\n",
+		       mmc->caps, mmc->ios.bus_width);
+
+	/*
+	 * Reset the chip on each power off
+	 */
+	if (ios->power_mode == MMC_POWER_OFF) {
+		octeon_mmc_reset_bus(slot);
+		if (!IS_ERR(mmc->supply.vmmc))
+			regulator_disable(mmc->supply.vmmc);
+		else /* Legacy power GPIO */
+			gpiod_set_value_cansleep(slot->pwr_gpiod, 0);
+	} else {
+		if (!IS_ERR(mmc->supply.vmmc))
+			regulator_enable(mmc->supply.vmmc);
+		else /* Legacy power GPIO */
+			gpiod_set_value_cansleep(slot->pwr_gpiod, 1);
+	}
+
+	switch (ios->bus_width) {
+	case MMC_BUS_WIDTH_8:
+		bus_width = 2;
+		break;
+	case MMC_BUS_WIDTH_4:
+		bus_width = 1;
+		break;
+	case MMC_BUS_WIDTH_1:
+		bus_width = 0;
+		break;
+	default:
+		octeon_mmc_dbg("unknown bus width %d\n", ios->bus_width);
+		bus_width = 0;
+		break;
+	}
+
+	hs_timing = (ios->timing == MMC_TIMING_MMC_HS);
+	ddr_clock = (bus_width && ios->timing >= MMC_TIMING_UHS_DDR50);
+
+	if (ddr_clock)
+		bus_width |= 4;
+
+	if (ios->clock) {
+		slot->clock = ios->clock;
+
+		clock = slot->clock;
+
+		if (clock > 52000000)
+			clock = 52000000;
+
+		clk_period = (octeon_get_io_clock_rate() + clock - 1) /
+			(2 * clock);
+
+		/* until clock-renengotiate-on-CRC is in */
+		if (ddr_clock && ddr > 1)
+			clk_period *= 2;
+
+		emm_switch.u64 = 0;
+		emm_switch.s.hs_timing = hs_timing;
+		emm_switch.s.bus_width = bus_width;
+		emm_switch.s.power_class = power_class;
+		emm_switch.s.clk_hi = clk_period;
+		emm_switch.s.clk_lo = clk_period;
+
+		if (!octeon_mmc_switch_val_changed(slot, emm_switch.u64)) {
+			octeon_mmc_dbg("No change from 0x%llx mio_emm_switch, returning.\n",
+				       emm_switch.u64);
+			goto out;
+		}
+
+		octeon_mmc_dbg("Writing 0x%llx to mio_emm_wdog\n",
+			       ((u64)clock * 850ull) / 1000ull);
+		cvmx_write_csr(host->base + OCT_MIO_EMM_WDOG,
+			       ((u64)clock * 850ull) / 1000ull);
+		octeon_mmc_dbg("Writing 0x%llx to mio_emm_switch\n",
+				emm_switch.u64);
+
+		cvmx_write_csr(host->base + OCT_MIO_EMM_SWITCH, emm_switch.u64);
+		emm_switch.s.bus_id = slot->bus_id;
+		cvmx_write_csr(host->base + OCT_MIO_EMM_SWITCH, emm_switch.u64);
+		slot->cached_switch = emm_switch.u64;
+
+		do {
+			emm_sts.u64 =
+				cvmx_read_csr(host->base + OCT_MIO_EMM_RSP_STS);
+			if (!emm_sts.s.switch_val)
+				break;
+			udelay(100);
+		} while (timeout-- > 0);
+
+		if (timeout <= 0) {
+			octeon_mmc_dbg("switch command timed out, status=0x%llx\n",
+				       emm_sts.u64);
+			goto out;
+		}
+	}
+out:
+	octeon_mmc_release_bus(host);
+}
+
+static const struct mmc_host_ops octeon_mmc_ops = {
+	.request        = octeon_mmc_request,
+	.set_ios        = octeon_mmc_set_ios,
+	.get_ro		= mmc_gpio_get_ro,
+	.get_cd		= mmc_gpio_get_cd,
+};
+
+static void octeon_mmc_set_clock(struct octeon_mmc_slot *slot,
+				 unsigned int clock)
+{
+	struct mmc_host *mmc = slot->mmc;
+
+	clock = min(clock, mmc->f_max);
+	clock = max(clock, mmc->f_min);
+	slot->clock = clock;
+}
+
+static int octeon_mmc_initlowlevel(struct octeon_mmc_slot *slot)
+{
+	union cvmx_mio_emm_switch emm_switch;
+	struct octeon_mmc_host *host = slot->host;
+
+	host->emm_cfg |= 1ull << slot->bus_id;
+	cvmx_write_csr(slot->host->base + OCT_MIO_EMM_CFG, host->emm_cfg);
+	octeon_mmc_set_clock(slot, 400000);
+
+	/* Program initial clock speed and power */
+	emm_switch.u64 = 0;
+	emm_switch.s.power_class = 10;
+	emm_switch.s.clk_hi = (slot->sclock / slot->clock) / 2;
+	emm_switch.s.clk_lo = (slot->sclock / slot->clock) / 2;
+
+	cvmx_write_csr(host->base + OCT_MIO_EMM_SWITCH, emm_switch.u64);
+	emm_switch.s.bus_id = slot->bus_id;
+	cvmx_write_csr(host->base + OCT_MIO_EMM_SWITCH, emm_switch.u64);
+	slot->cached_switch = emm_switch.u64;
+
+	cvmx_write_csr(host->base + OCT_MIO_EMM_WDOG,
+		       ((u64)slot->clock * 850ull) / 1000ull);
+	cvmx_write_csr(host->base + OCT_MIO_EMM_STS_MASK, 0xe4f90080ull);
+	cvmx_write_csr(host->base + OCT_MIO_EMM_RCA, 1);
+	return 0;
+}
+
+static int octeon_mmc_of_copy_legacy_u32(struct device_node *node,
+					  const char *legacy_name,
+					  const char *new_name)
+{
+	u32 value;
+	int ret;
+
+	ret = of_property_read_u32(node, legacy_name, &value);
+	if (!ret) {
+		/* Found legacy - set generic property */
+		struct property *new_p;
+		u32 *new_v;
+
+		pr_info(FW_WARN "%s: Using legacy DT property '%s'.\n",
+			node->full_name, legacy_name);
+
+		new_p = kzalloc(sizeof(*new_p), GFP_KERNEL);
+		new_v = kzalloc(sizeof(u32), GFP_KERNEL);
+		if (!new_p || !new_v)
+			return -ENOMEM;
+
+		*new_v = value;
+		new_p->name = kstrdup(new_name, GFP_KERNEL);
+		new_p->length = sizeof(u32);
+		new_p->value = new_v;
+
+		of_update_property(node, new_p);
+	}
+	return 0;
+}
+
+/*
+ * This function parses the legacy device tree that may be found in devices
+ * shipped before the driver was upstreamed. Future devices should not require
+ * it as standard bindings should be used
+ */
+static int octeon_mmc_of_parse_legacy(struct device *dev,
+				      struct device_node *node,
+				      struct octeon_mmc_slot *slot)
+{
+	int ret;
+
+	ret = octeon_mmc_of_copy_legacy_u32(node, "cavium,bus-max-width",
+					    "bus-width");
+	if (ret)
+		return ret;
+
+	ret = octeon_mmc_of_copy_legacy_u32(node, "spi-max-frequency",
+					    "max-frequency");
+	if (ret)
+		return ret;
+
+	slot->pwr_gpiod = devm_gpiod_get_optional(dev, "power", GPIOD_OUT_LOW);
+	if (!IS_ERR(slot->pwr_gpiod)) {
+		pr_info(FW_WARN "%s: Using legacy DT property '%s'.\n",
+			node->full_name, "gpios-power");
+	}
+
+	return 0;
+}
+
+static int octeon_mmc_slot_probe(struct platform_device *slot_pdev,
+				 struct octeon_mmc_host *host)
+{
+	struct mmc_host *mmc;
+	struct octeon_mmc_slot *slot;
+	struct device *dev = &slot_pdev->dev;
+	struct device_node *node = slot_pdev->dev.of_node;
+	u32 id, cmd_skew, dat_skew;
+	u64 clock_period;
+	int ret;
+
+	ret = of_property_read_u32(node, "reg", &id);
+	if (ret) {
+		dev_err(dev, "Missing or invalid reg property on %s\n",
+			of_node_full_name(node));
+		return ret;
+	}
+
+	if (id >= OCTEON_MAX_MMC || host->slot[id]) {
+		dev_err(dev, "Invalid reg property on %s\n",
+			of_node_full_name(node));
+		return -EINVAL;
+	}
+
+	mmc = mmc_alloc_host(sizeof(struct octeon_mmc_slot), dev);
+	if (!mmc) {
+		dev_err(dev, "alloc host failed\n");
+		return -ENOMEM;
+	}
+
+	slot = mmc_priv(mmc);
+	slot->mmc = mmc;
+	slot->host = host;
+
+	/* Convert legacy DT entries into things mmc_of_parse can understand */
+	ret = octeon_mmc_of_parse_legacy(dev, node, slot);
+	if (ret)
+		return ret;
+
+	ret = mmc_of_parse(mmc);
+	if (ret) {
+		dev_err(dev, "Failed to parse DT\n");
+		return ret;
+	}
+
+	/* Get regulators and the supported OCR mask */
+	ret = mmc_regulator_get_supply(mmc);
+	if (ret == -EPROBE_DEFER)
+		goto err;
+
+	/* Octeon specific DT properties */
+	ret = of_property_read_u32(node, "cavium,cmd-clk-skew", &cmd_skew);
+	if (ret)
+		cmd_skew = 0;
+
+	ret = of_property_read_u32(node, "cavium,dat-clk-skew", &dat_skew);
+	if (ret)
+		dat_skew = 0;
+
+	/*
+	 * Set up host parameters.
+	 */
+	mmc->ops = &octeon_mmc_ops;
+	mmc->f_min = 400000;
+	if (!mmc->f_max) {
+		mmc->f_max = 52000000;
+		dev_info(dev, "No max-frequency for slot %u, defaulting to %u\n",
+			id, mmc->f_max);
+	}
+
+	mmc->caps |= MMC_CAP_MMC_HIGHSPEED | MMC_CAP_SD_HIGHSPEED |
+		    MMC_CAP_ERASE;
+	mmc->ocr_avail = MMC_VDD_27_28 | MMC_VDD_28_29 | MMC_VDD_29_30 |
+			 MMC_VDD_30_31 | MMC_VDD_31_32 | MMC_VDD_32_33 |
+			 MMC_VDD_33_34 | MMC_VDD_34_35 | MMC_VDD_35_36;
+
+	/* post-sdk23 caps */
+	mmc->caps |=
+		((mmc->f_max >= 12000000) * MMC_CAP_UHS_SDR12) |
+		((mmc->f_max >= 25000000) * MMC_CAP_UHS_SDR25) |
+		((mmc->f_max >= 50000000) * MMC_CAP_UHS_SDR50) |
+		MMC_CAP_CMD23;
+
+	if ((!IS_ERR(mmc->supply.vmmc)) || (slot->pwr_gpiod))
+		mmc->caps |= MMC_CAP_POWER_OFF_CARD;
+
+	/* "1.8v" capability is actually 1.8-or-3.3v */
+	if (ddr)
+		mmc->caps |= MMC_CAP_UHS_DDR50 | MMC_CAP_1_8V_DDR;
+
+	mmc->max_segs = 64;
+	mmc->max_seg_size = host->linear_buf_size;
+	mmc->max_req_size = host->linear_buf_size;
+	mmc->max_blk_size = 512;
+	mmc->max_blk_count = mmc->max_req_size / 512;
+
+	slot->clock = mmc->f_min;
+	slot->sclock = octeon_get_io_clock_rate();
+
+	clock_period = 1000000000000ull / slot->sclock; /* period in pS */
+	slot->cmd_cnt = (cmd_skew + clock_period / 2) / clock_period;
+	slot->dat_cnt = (dat_skew + clock_period / 2) / clock_period;
+
+	slot->bus_id = id;
+	slot->cached_rca = 1;
+
+	/* Only a single user of the bootbus at a time. */
+	octeon_mmc_acquire_bus(host);
+	host->slot[id] = slot;
+
+	octeon_mmc_switch_to(slot);
+	/* Initialize MMC Block. */
+	octeon_mmc_initlowlevel(slot);
+
+	octeon_mmc_release_bus(host);
+
+	ret = mmc_add_host(mmc);
+	if (ret) {
+		dev_err(dev, "mmc_add_host() returned %d\n", ret);
+		goto err;
+	}
+
+	return 0;
+
+err:
+	slot->host->slot[id] = NULL;
+
+	gpiod_set_value_cansleep(slot->pwr_gpiod, 0);
+
+	mmc_free_host(slot->mmc);
+	return ret;
+}
+
+static int octeon_mmc_slot_remove(struct octeon_mmc_slot *slot)
+{
+	mmc_remove_host(slot->mmc);
+
+	slot->host->slot[slot->bus_id] = NULL;
+
+	gpiod_set_value_cansleep(slot->pwr_gpiod, 0);
+
+	mmc_free_host(slot->mmc);
+
+	return 0;
+}
+
+static int octeon_mmc_probe(struct platform_device *pdev)
+{
+	struct octeon_mmc_host *host;
+	struct resource	*res;
+	void __iomem *base;
+	int mmc_irq[9];
+	int i;
+	int ret = 0;
+	struct device_node *node = pdev->dev.of_node;
+	struct device_node *cn;
+	bool cn78xx_style;
+	u64 t;
+
+	host = devm_kzalloc(&pdev->dev, sizeof(*host), GFP_KERNEL);
+	if (!host)
+		return -ENOMEM;
+
+	spin_lock_init(&host->irq_handler_lock);
+	sema_init(&host->mmc_serializer, 1);
+
+	cn78xx_style = of_device_is_compatible(node, "cavium,octeon-7890-mmc");
+	if (cn78xx_style) {
+		host->need_bootbus_lock = false;
+		host->big_dma_addr = true;
+		host->need_irq_handler_lock = true;
+		/*
+		 * First seven are the EMM_INT bits 0..6, then two for
+		 * the EMM_DMA_INT bits
+		 */
+		for (i = 0; i < 9; i++) {
+			mmc_irq[i] = platform_get_irq(pdev, i);
+			if (mmc_irq[i] < 0)
+				return mmc_irq[i];
+		}
+	} else {
+		host->need_bootbus_lock = true;
+		host->big_dma_addr = false;
+		host->need_irq_handler_lock = false;
+		/* First one is EMM second NDF_DMA */
+		for (i = 0; i < 2; i++) {
+			mmc_irq[i] = platform_get_irq(pdev, i);
+			if (mmc_irq[i] < 0)
+				return mmc_irq[i];
+		}
+	}
+	host->last_slot = -1;
+
+	if (bb_size < 512 || bb_size >= (1 << 24))
+		bb_size = 1 << 18;
+	host->linear_buf_size = bb_size;
+	host->linear_buf = devm_kzalloc(&pdev->dev, host->linear_buf_size,
+					GFP_KERNEL);
+
+	if (!host->linear_buf) {
+		dev_err(&pdev->dev, "devm_kzalloc failed\n");
+		return -ENOMEM;
+	}
+
+	host->pdev = pdev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		dev_err(&pdev->dev, "Platform resource[0] is missing\n");
+		return -ENXIO;
+	}
+	base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(base))
+		return PTR_ERR(base);
+	host->base = (__force u64)base;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+	if (!res) {
+		dev_err(&pdev->dev, "Platform resource[1] is missing\n");
+		return -EINVAL;
+	}
+	base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(base))
+		return PTR_ERR(base);
+	host->ndf_base = (__force u64)base;
+	/*
+	 * Clear out any pending interrupts that may be left over from
+	 * bootloader.
+	 */
+	t = cvmx_read_csr(host->base + OCT_MIO_EMM_INT);
+	cvmx_write_csr(host->base + OCT_MIO_EMM_INT, t);
+	if (cn78xx_style) {
+		/* Only CMD_DONE, DMA_DONE, CMD_ERR, DMA_ERR */
+		for (i = 1; i <= 4; i++) {
+			ret = devm_request_irq(&pdev->dev, mmc_irq[i],
+					       octeon_mmc_interrupt,
+					       0, DRV_NAME, host);
+			if (ret < 0) {
+				dev_err(&pdev->dev, "Error: devm_request_irq %d\n",
+					mmc_irq[i]);
+				return ret;
+			}
+		}
+	} else {
+		ret = devm_request_irq(&pdev->dev, mmc_irq[0],
+				       octeon_mmc_interrupt, 0, DRV_NAME, host);
+		if (ret < 0) {
+			dev_err(&pdev->dev, "Error: devm_request_irq %d\n",
+				mmc_irq[0]);
+			return ret;
+		}
+	}
+
+	host->global_pwr_gpiod = devm_gpiod_get_optional(&pdev->dev, "power",
+								GPIOD_OUT_HIGH);
+	if (IS_ERR(host->global_pwr_gpiod)) {
+		dev_err(&host->pdev->dev, "Invalid POWER GPIO\n");
+		return PTR_ERR(host->global_pwr_gpiod);
+	}
+
+	platform_set_drvdata(pdev, host);
+
+	for_each_child_of_node(node, cn) {
+		struct platform_device *slot_pdev;
+
+		slot_pdev = of_platform_device_create(cn, NULL, &pdev->dev);
+		ret = octeon_mmc_slot_probe(slot_pdev, host);
+		if (ret) {
+			dev_err(&host->pdev->dev, "Error populating slots\n");
+			gpiod_set_value_cansleep(host->global_pwr_gpiod, 0);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int octeon_mmc_remove(struct platform_device *pdev)
+{
+	union cvmx_mio_ndf_dma_cfg ndf_dma_cfg;
+	struct octeon_mmc_host *host = platform_get_drvdata(pdev);
+	int i;
+
+	for (i = 0; i < OCTEON_MAX_MMC; i++) {
+		if (host->slot[i])
+			octeon_mmc_slot_remove(host->slot[i]);
+	}
+
+	ndf_dma_cfg.u64 = cvmx_read_csr(host->ndf_base + OCT_MIO_NDF_DMA_CFG);
+	ndf_dma_cfg.s.en = 0;
+	cvmx_write_csr(host->ndf_base + OCT_MIO_NDF_DMA_CFG, ndf_dma_cfg.u64);
+
+	gpiod_set_value_cansleep(host->global_pwr_gpiod, 0);
+
+	return 0;
+}
+
+static const struct of_device_id octeon_mmc_match[] = {
+	{
+		.compatible = "cavium,octeon-6130-mmc",
+	},
+	{
+		.compatible = "cavium,octeon-7890-mmc",
+	},
+	{},
+};
+MODULE_DEVICE_TABLE(of, octeon_mmc_match);
+
+static struct platform_driver octeon_mmc_driver = {
+	.probe		= octeon_mmc_probe,
+	.remove		= octeon_mmc_remove,
+	.driver		= {
+		.name	= DRV_NAME,
+		.of_match_table = octeon_mmc_match,
+	},
+};
+
+static int __init octeon_mmc_init(void)
+{
+	return platform_driver_register(&octeon_mmc_driver);
+}
+
+static void __exit octeon_mmc_cleanup(void)
+{
+	platform_driver_unregister(&octeon_mmc_driver);
+}
+
+module_init(octeon_mmc_init);
+module_exit(octeon_mmc_cleanup);
+
+MODULE_AUTHOR("Cavium Inc. <support@cavium.com>");
+MODULE_DESCRIPTION("low-level driver for Cavium OCTEON MMC/SSD card");
+MODULE_LICENSE("GPL");
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-03-31 15:26 [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller Matt Redfearn
  2016-03-31 15:26 ` [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver " Matt Redfearn
@ 2016-04-14 12:45 ` Ulf Hansson
  2016-04-18  8:53   ` Matt Redfearn
  1 sibling, 1 reply; 32+ messages in thread
From: Ulf Hansson @ 2016-04-14 12:45 UTC (permalink / raw)
  To: Matt Redfearn
  Cc: linux-mmc, Aleksey Makarov, Chandrakala Chavva, David Daney,
	Aleksey Makarov, Leonid Rosenboim, Peter Swain, Aaron Williams,
	Rob Herring, Ralf Baechle

+ Rob and Ralf

On 31 March 2016 at 17:26, Matt Redfearn <matt.redfearn@imgtec.com> wrote:
> From: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
>
> Add Device Tree binding document for Octeon MMC controller. The driver
> is in a following patch.
>
> The MMC controller can be connected to up to 4 "slots" which may be
> eMMC, MMC or SD card devices. As there is a single controller, each
> available slot is represented as a child node of the controller.
>
> This is a similar concept to the atmel-mci driver.
>
> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
> Signed-off-by: Chandrakala Chavva <cchavva@caviumnetworks.com>
> Signed-off-by: David Daney <david.daney@cavium.com>
> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com>
> Signed-off-by: Leonid Rosenboim <lrosenboim@caviumnetworks.com>
> Signed-off-by: Peter Swain <pswain@cavium.com>
> Signed-off-by: Aaron Williams <aaron.williams@cavium.com>
> Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
> Acked-by: Rob Herring <robh@kernel.org>
> ---
> v7: No changes
>
> v6:
> - Split up patch
> - Moved device tree fixup to platform code
> ---
>  .../devicetree/bindings/mmc/octeon-mmc.txt         | 79 ++++++++++++++++++++++
>  1 file changed, 79 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/mmc/octeon-mmc.txt
>
> diff --git a/Documentation/devicetree/bindings/mmc/octeon-mmc.txt b/Documentation/devicetree/bindings/mmc/octeon-mmc.txt
> new file mode 100644
> index 000000000000..d2c576d9ad65
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/mmc/octeon-mmc.txt
> @@ -0,0 +1,79 @@
> +* OCTEON SD/MMC Host Controller
> +
> +This controller is present on some members of the Cavium OCTEON SoC
> +family, provide an interface for eMMC, MMC and SD devices.  There is a
> +single controller that may have several "slots" connected.  These
> +slots appear as children of the main controller node.
> +The DMA engine is an integral part of the controller block.
> +
> +1) MMC node
> +
> +Required properties:
> +- compatible : Should be "cavium,octeon-6130-mmc" or "cavium,octeon-7890-mmc"
> +- reg : Two entries:
> +       1) The base address of the MMC controller register bank.
> +       2) The base address of the MMC DMA engine register bank.
> +- interrupts :
> +       For "cavium,octeon-6130-mmc": two entries:
> +       1) The MMC controller interrupt line.
> +       2) The MMC DMA engine interrupt line.
> +       For "cavium,octeon-7890-mmc": nine entries:
> +       1) The next block transfer of a multiblock transfer has completed (BUF_DONE)
> +       2) Operation completed successfully (CMD_DONE).
> +       3) DMA transfer completed successfully (DMA_DONE).
> +       4) Operation encountered an error (CMD_ERR).
> +       5) DMA transfer encountered an error (DMA_ERR).
> +       6) Switch operation completed successfully (SWITCH_DONE).
> +       7) Switch operation encountered an error (SWITCH_ERR).
> +       8) Internal DMA engine request completion interrupt (DONE).
> +       9) Internal DMA FIFO underflow (FIFO).
> +- #address-cells : Must be <1>
> +- #size-cells : Must be <0>
> +
> +The node contains child nodes for each slot that the platform uses.
> +
> +Example:
> +mmc@1180000002000 {
> +       compatible = "cavium,octeon-6130-mmc";
> +       reg = <0x11800 0x00002000 0x0 0x100>,
> +               <0x11800 0x00000168 0x0 0x20>;
> +       #address-cells = <1>;
> +       #size-cells = <0>;
> +       /* EMM irq, DMA irq */
> +       interrupts = <1 19>, <0 63>;
> +
> +       [ child node definitions...]
> +};
> +
> +
> +2) Slot nodes
> +Properties in mmc.txt apply to each slot node that the platform uses.
> +
> +Required properties:
> +- reg : The slot number.
> +
> +Optional properties:
> +- cavium,cmd-clk-skew : the amount of delay (in pS) past the clock edge
> +       to sample the command pin.
> +- cavium,dat-clk-skew : the amount of delay (in pS) past the clock edge
> +       to sample the data pin.
> +
> +Example:
> +       mmc@1180000002000 {
> +               compatible = "cavium,octeon-6130-mmc";
> +               reg = <0x11800 0x00002000 0x0 0x100>,
> +                     <0x11800 0x00000168 0x0 0x20>;
> +               #address-cells = <1>;
> +               #size-cells = <0>;
> +               /* EMM irq, DMA irq */
> +               interrupts = <1 19>, <0 63>;
> +
> +               mmc-slot@0 {
> +                       reg = <0>;

Let me elaborate on how child nodes are being used by the MMC subsystem today.

1)
When a child node has "reg = <0>", the mmc core treat this as being
represented by a non-removable/embedded card.
We have also added a compatible string, "mmc-card", that help us to
deal with constraints for eMMC cards during initialization. You may
have a look at Documentation/devicetree/bindings/mmc/mmc-card.txt to
know more about this.

Additionally, the dynamically created struct device representing the
card, gets its device node pointer assigned to the child node when
"reg = <0>". In that way, the bus/drivers for the card can use it as
well.

2)
When "reg" is non-zero and it's an SDIO card that has been detected,
we treat the child node as representing the embedded SDIO functional
device. The maximum amount of SDIO functional devices that is
supported is dynamic, as it's encoded inside registers of the SDIO
card. The "reg" number needs to be within this range, which allows the
dynamically created struct device for the SDIO func device, to get its
device node assigned to its corresponding child node.

Typically an SDIO function device may be a WLAN chip, which thus is
being attached to an SDIO interface. This is especially needed when
the WLAN driver requires platform specific resources, like some GPIOs
or wake IRQs. An example is available in
arch/arm/boot/dts/sun7i-a20-wits-pro-a20-dkt.dts, where mmc3 use this.

> +                       max-frequency = <20000000>;
> +                       bus-width = <8>;
> +                       vmmc-supply = <&reg_vmmc3>;
> +                       cd-gpios = <&gpio 9 0>;
> +                       wp-gpios = <&gpio 10 0>;
> +               };
> +       };
> --
> 2.5.0
>

So to summarize, the mmc core has a different view of what child nodes
represents than what is being suggested here. I am not sure it's good
idea to add/change that interpretation.

Perhaps it's better to implement the machine specific patching of the
DTS and then remove the "mmc-slot". I think something along those
lines was proposed in one of the earlier versions as well!?

What do you think?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-14 12:45 ` [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings " Ulf Hansson
@ 2016-04-18  8:53   ` Matt Redfearn
  2016-04-18 11:13     ` Ulf Hansson
  0 siblings, 1 reply; 32+ messages in thread
From: Matt Redfearn @ 2016-04-18  8:53 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Aleksey Makarov, Chandrakala Chavva, David Daney,
	Aleksey Makarov, Leonid Rosenboim, Peter Swain, Aaron Williams,
	Rob Herring, Ralf Baechle

On 14/04/16 13:45, Ulf Hansson wrote:

+ Rob and Ralf

On 31 March 2016 at 17:26, Matt Redfearn <matt.redfearn@imgtec.com><mailto:matt.redfearn@imgtec.com> wrote:


From: Aleksey Makarov <aleksey.makarov@caviumnetworks.com><mailto:aleksey.makarov@caviumnetworks.com>

Add Device Tree binding document for Octeon MMC controller. The driver
is in a following patch.

The MMC controller can be connected to up to 4 "slots" which may be
eMMC, MMC or SD card devices. As there is a single controller, each
available slot is represented as a child node of the controller.

This is a similar concept to the atmel-mci driver.

Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi><mailto:aaro.koskinen@iki.fi>
Signed-off-by: Chandrakala Chavva <cchavva@caviumnetworks.com><mailto:cchavva@caviumnetworks.com>
Signed-off-by: David Daney <david.daney@cavium.com><mailto:david.daney@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com><mailto:aleksey.makarov@auriga.com>
Signed-off-by: Leonid Rosenboim <lrosenboim@caviumnetworks.com><mailto:lrosenboim@caviumnetworks.com>
Signed-off-by: Peter Swain <pswain@cavium.com><mailto:pswain@cavium.com>
Signed-off-by: Aaron Williams <aaron.williams@cavium.com><mailto:aaron.williams@cavium.com>
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com><mailto:matt.redfearn@imgtec.com>
Acked-by: Rob Herring <robh@kernel.org><mailto:robh@kernel.org>
---
v7: No changes

v6:
- Split up patch
- Moved device tree fixup to platform code
---
 .../devicetree/bindings/mmc/octeon-mmc.txt         | 79 ++++++++++++++++++++++
 1 file changed, 79 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mmc/octeon-mmc.txt

diff --git a/Documentation/devicetree/bindings/mmc/octeon-mmc.txt b/Documentation/devicetree/bindings/mmc/octeon-mmc.txt
new file mode 100644
index 000000000000..d2c576d9ad65
--- /dev/null
+++ b/Documentation/devicetree/bindings/mmc/octeon-mmc.txt
@@ -0,0 +1,79 @@
+* OCTEON SD/MMC Host Controller
+
+This controller is present on some members of the Cavium OCTEON SoC
+family, provide an interface for eMMC, MMC and SD devices.  There is a
+single controller that may have several "slots" connected.  These
+slots appear as children of the main controller node.
+The DMA engine is an integral part of the controller block.
+
+1) MMC node
+
+Required properties:
+- compatible : Should be "cavium,octeon-6130-mmc" or "cavium,octeon-7890-mmc"
+- reg : Two entries:
+       1) The base address of the MMC controller register bank.
+       2) The base address of the MMC DMA engine register bank.
+- interrupts :
+       For "cavium,octeon-6130-mmc": two entries:
+       1) The MMC controller interrupt line.
+       2) The MMC DMA engine interrupt line.
+       For "cavium,octeon-7890-mmc": nine entries:
+       1) The next block transfer of a multiblock transfer has completed (BUF_DONE)
+       2) Operation completed successfully (CMD_DONE).
+       3) DMA transfer completed successfully (DMA_DONE).
+       4) Operation encountered an error (CMD_ERR).
+       5) DMA transfer encountered an error (DMA_ERR).
+       6) Switch operation completed successfully (SWITCH_DONE).
+       7) Switch operation encountered an error (SWITCH_ERR).
+       8) Internal DMA engine request completion interrupt (DONE).
+       9) Internal DMA FIFO underflow (FIFO).
+- #address-cells : Must be <1>
+- #size-cells : Must be <0>
+
+The node contains child nodes for each slot that the platform uses.
+
+Example:
+mmc@1180000002000 {
+       compatible = "cavium,octeon-6130-mmc";
+       reg = <0x11800 0x00002000 0x0 0x100>,
+               <0x11800 0x00000168 0x0 0x20>;
+       #address-cells = <1>;
+       #size-cells = <0>;
+       /* EMM irq, DMA irq */
+       interrupts = <1 19>, <0 63>;
+
+       [ child node definitions...]
+};
+
+
+2) Slot nodes
+Properties in mmc.txt apply to each slot node that the platform uses.
+
+Required properties:
+- reg : The slot number.
+
+Optional properties:
+- cavium,cmd-clk-skew : the amount of delay (in pS) past the clock edge
+       to sample the command pin.
+- cavium,dat-clk-skew : the amount of delay (in pS) past the clock edge
+       to sample the data pin.
+
+Example:
+       mmc@1180000002000 {
+               compatible = "cavium,octeon-6130-mmc";
+               reg = <0x11800 0x00002000 0x0 0x100>,
+                     <0x11800 0x00000168 0x0 0x20>;
+               #address-cells = <1>;
+               #size-cells = <0>;
+               /* EMM irq, DMA irq */
+               interrupts = <1 19>, <0 63>;
+
+               mmc-slot@0 {
+                       reg = <0>;


Let me elaborate on how child nodes are being used by the MMC subsystem today.

1)
When a child node has "reg = <0>", the mmc core treat this as being
represented by a non-removable/embedded card.
We have also added a compatible string, "mmc-card", that help us to
deal with constraints for eMMC cards during initialization. You may
have a look at Documentation/devicetree/bindings/mmc/mmc-card.txt to
know more about this.

Additionally, the dynamically created struct device representing the
card, gets its device node pointer assigned to the child node when
"reg = <0>". In that way, the bus/drivers for the card can use it as
well.

2)
When "reg" is non-zero and it's an SDIO card that has been detected,
we treat the child node as representing the embedded SDIO functional
device. The maximum amount of SDIO functional devices that is
supported is dynamic, as it's encoded inside registers of the SDIO
card. The "reg" number needs to be within this range, which allows the
dynamically created struct device for the SDIO func device, to get its
device node assigned to its corresponding child node.

Typically an SDIO function device may be a WLAN chip, which thus is
being attached to an SDIO interface. This is especially needed when
the WLAN driver requires platform specific resources, like some GPIOs
or wake IRQs. An example is available in
arch/arm/boot/dts/sun7i-a20-wits-pro-a20-dkt.dts, where mmc3 use this.



+                       max-frequency = <20000000>;
+                       bus-width = <8>;
+                       vmmc-supply = <&reg_vmmc3>;
+                       cd-gpios = <&gpio 9 0>;
+                       wp-gpios = <&gpio 10 0>;
+               };
+       };
--
2.5.0



So to summarize, the mmc core has a different view of what child nodes
represents than what is being suggested here. I am not sure it's good
idea to add/change that interpretation.

Perhaps it's better to implement the machine specific patching of the
DTS and then remove the "mmc-slot". I think something along those
lines was proposed in one of the earlier versions as well!?

What do you think?

Kind regards
Uffe


Hi Ulf,

Thanks for taking the time to reply and your detailed answer. I see why the bindings currently don't fit into the mmc core model. However, we still have the issue that these bindings are shipping in hardware and we can't change them. As far as I can see, ther major difference between these bindings and what is upstream, is the interpretation of the reg property, and the compatible string of what is connected to the controller. The Octeon MMC controller supports connections to up to 4 devices, either eMMC devices or SD card slots. I think it will be difficult to impossible to programatically interpret the device tree supplied by the bootloader to classify the slots as SD card / eMMC such that the necessary reg / compatible strings can be added.

If we were to remove the slots entirely from the device tree, how would the core deal with having one controller with multiple devices attached? Obviously we need some locking of the shared controller between the child devices.

Thanks,
Matt


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-18  8:53   ` Matt Redfearn
@ 2016-04-18 11:13     ` Ulf Hansson
  2016-04-18 11:37       ` Matt Redfearn
  0 siblings, 1 reply; 32+ messages in thread
From: Ulf Hansson @ 2016-04-18 11:13 UTC (permalink / raw)
  To: Matt Redfearn
  Cc: linux-mmc, Aleksey Makarov, Chandrakala Chavva, David Daney,
	Aleksey Makarov, Leonid Rosenboim, Peter Swain, Aaron Williams,
	Rob Herring, Ralf Baechle

[...]

> +
> +2) Slot nodes
> +Properties in mmc.txt apply to each slot node that the platform uses.
> +
> +Required properties:
> +- reg : The slot number.
> +
> +Optional properties:
> +- cavium,cmd-clk-skew : the amount of delay (in pS) past the clock edge
> +       to sample the command pin.
> +- cavium,dat-clk-skew : the amount of delay (in pS) past the clock edge
> +       to sample the data pin.
> +
> +Example:
> +       mmc@1180000002000 {
> +               compatible = "cavium,octeon-6130-mmc";
> +               reg = <0x11800 0x00002000 0x0 0x100>,
> +                     <0x11800 0x00000168 0x0 0x20>;
> +               #address-cells = <1>;
> +               #size-cells = <0>;
> +               /* EMM irq, DMA irq */
> +               interrupts = <1 19>, <0 63>;
> +
> +               mmc-slot@0 {
> +                       reg = <0>;
>
>
> Let me elaborate on how child nodes are being used by the MMC subsystem today.
>
> 1)
> When a child node has "reg = <0>", the mmc core treat this as being
> represented by a non-removable/embedded card.
> We have also added a compatible string, "mmc-card", that help us to
> deal with constraints for eMMC cards during initialization. You may
> have a look at Documentation/devicetree/bindings/mmc/mmc-card.txt to
> know more about this.
>
> Additionally, the dynamically created struct device representing the
> card, gets its device node pointer assigned to the child node when
> "reg = <0>". In that way, the bus/drivers for the card can use it as
> well.
>
> 2)
> When "reg" is non-zero and it's an SDIO card that has been detected,
> we treat the child node as representing the embedded SDIO functional
> device. The maximum amount of SDIO functional devices that is
> supported is dynamic, as it's encoded inside registers of the SDIO
> card. The "reg" number needs to be within this range, which allows the
> dynamically created struct device for the SDIO func device, to get its
> device node assigned to its corresponding child node.
>
> Typically an SDIO function device may be a WLAN chip, which thus is
> being attached to an SDIO interface. This is especially needed when
> the WLAN driver requires platform specific resources, like some GPIOs
> or wake IRQs. An example is available in
> arch/arm/boot/dts/sun7i-a20-wits-pro-a20-dkt.dts, where mmc3 use this.
>
>
>
> +                       max-frequency = <20000000>;
> +                       bus-width = <8>;
> +                       vmmc-supply = <&reg_vmmc3>;
> +                       cd-gpios = <&gpio 9 0>;
> +                       wp-gpios = <&gpio 10 0>;
> +               };
> +       };
> --
> 2.5.0
>
>
>
> So to summarize, the mmc core has a different view of what child nodes
> represents than what is being suggested here. I am not sure it's good
> idea to add/change that interpretation.
>
> Perhaps it's better to implement the machine specific patching of the
> DTS and then remove the "mmc-slot". I think something along those
> lines was proposed in one of the earlier versions as well!?
>
> What do you think?
>
> Kind regards
> Uffe
>
>
> Hi Ulf,
>
> Thanks for taking the time to reply and your detailed answer. I see why the bindings currently don't fit into the mmc core model. However, we still have the issue that these bindings are shipping in hardware and we can't change them. As far as I can see, ther major difference between these bindings and what is upstream, is the interpretation of the reg property, and the compatible string of what is connected to the controller. The Octeon MMC controller supports connections to up to 4 devices, either eMMC devices or SD card slots. I think it will be difficult to impossible to programatically interpret the device tree supplied by the bootloader to classify the slots as SD card / eMMC such that the necessary reg / compatible strings can be added.

>
> If we were to remove the slots entirely from the device tree, how would the core deal with having one controller with multiple devices attached? Obviously we need some locking of the shared controller between the child devices.

Currently the core doesn't deal with multiple card slots and I doubt
we ever will be adding that. The reason is simply because, in practice
there don't exist such configurations, even if some HWs supports it. I
assume that's the case here as well, right!?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-18 11:13     ` Ulf Hansson
@ 2016-04-18 11:37       ` Matt Redfearn
  2016-04-18 12:08         ` Ulf Hansson
  0 siblings, 1 reply; 32+ messages in thread
From: Matt Redfearn @ 2016-04-18 11:37 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Aleksey Makarov, Chandrakala Chavva, David Daney,
	Aleksey Makarov, Leonid Rosenboim, Peter Swain, Aaron Williams,
	Rob Herring, Ralf Baechle


________________________________________
From: Ulf Hansson [ulf.hansson@linaro.org]
Sent: 18 April 2016 12:13
To: Matt Redfearn
Cc: linux-mmc; Aleksey Makarov; Chandrakala Chavva; David Daney; Aleksey Makarov; Leonid Rosenboim; Peter Swain; Aaron Williams; Rob Herring; Ralf Baechle
Subject: Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller

[...]

> +
> +2) Slot nodes
> +Properties in mmc.txt apply to each slot node that the platform uses.
> +
> +Required properties:
> +- reg : The slot number.
> +
> +Optional properties:
> +- cavium,cmd-clk-skew : the amount of delay (in pS) past the clock edge
> +       to sample the command pin.
> +- cavium,dat-clk-skew : the amount of delay (in pS) past the clock edge
> +       to sample the data pin.
> +
> +Example:
> +       mmc@1180000002000 {
> +               compatible = "cavium,octeon-6130-mmc";
> +               reg = <0x11800 0x00002000 0x0 0x100>,
> +                     <0x11800 0x00000168 0x0 0x20>;
> +               #address-cells = <1>;
> +               #size-cells = <0>;
> +               /* EMM irq, DMA irq */
> +               interrupts = <1 19>, <0 63>;
> +
> +               mmc-slot@0 {
> +                       reg = <0>;
>
>
> Let me elaborate on how child nodes are being used by the MMC subsystem today.
>
> 1)
> When a child node has "reg = <0>", the mmc core treat this as being
> represented by a non-removable/embedded card.
> We have also added a compatible string, "mmc-card", that help us to
> deal with constraints for eMMC cards during initialization. You may
> have a look at Documentation/devicetree/bindings/mmc/mmc-card.txt to
> know more about this.
>
> Additionally, the dynamically created struct device representing the
> card, gets its device node pointer assigned to the child node when
> "reg = <0>". In that way, the bus/drivers for the card can use it as
> well.
>
> 2)
> When "reg" is non-zero and it's an SDIO card that has been detected,
> we treat the child node as representing the embedded SDIO functional
> device. The maximum amount of SDIO functional devices that is
> supported is dynamic, as it's encoded inside registers of the SDIO
> card. The "reg" number needs to be within this range, which allows the
> dynamically created struct device for the SDIO func device, to get its
> device node assigned to its corresponding child node.
>
> Typically an SDIO function device may be a WLAN chip, which thus is
> being attached to an SDIO interface. This is especially needed when
> the WLAN driver requires platform specific resources, like some GPIOs
> or wake IRQs. An example is available in
> arch/arm/boot/dts/sun7i-a20-wits-pro-a20-dkt.dts, where mmc3 use this.
>
>
>
> +                       max-frequency = <20000000>;
> +                       bus-width = <8>;
> +                       vmmc-supply = <&reg_vmmc3>;
> +                       cd-gpios = <&gpio 9 0>;
> +                       wp-gpios = <&gpio 10 0>;
> +               };
> +       };
> --
> 2.5.0
>
>
>
> So to summarize, the mmc core has a different view of what child nodes
> represents than what is being suggested here. I am not sure it's good
> idea to add/change that interpretation.
>
> Perhaps it's better to implement the machine specific patching of the
> DTS and then remove the "mmc-slot". I think something along those
> lines was proposed in one of the earlier versions as well!?
>
> What do you think?
>
> Kind regards
> Uffe
>
>
> Hi Ulf,
>
> Thanks for taking the time to reply and your detailed answer. I see why the bindings currently don't fit into the mmc core model. However, we still have the issue that these bindings are shipping in hardware and we can't change them. As far as I can see, ther major difference between these bindings and what is upstream, is the interpretation of the reg property, and the compatible string of what is connected to the controller. The Octeon MMC controller supports connections to up to 4 devices, either eMMC devices or SD card slots. I think it will be difficult to impossible to programatically interpret the device tree supplied by the bootloader to classify the slots as SD card / eMMC such that the necessary reg / compatible strings can be added.

>
> If we were to remove the slots entirely from the device tree, how would the core deal with having one controller with multiple devices attached? Obviously we need some locking of the shared controller between the child devices.

Currently the core doesn't deal with multiple card slots and I doubt
we ever will be adding that. The reason is simply because, in practice
there don't exist such configurations, even if some HWs supports it. I
assume that's the case here as well, right!?

Kind regards
Uffe

Hi Ulf,
Unfortunately not, for example the Rhinolabs UTM8 board which we have several of here (http://www.rhinolabsinc.com/rhino-sdna7130-networking-appliance/) contains an eMMC device attached to "slot 1" and a microSD card slot on "slot 2". The driver as it stands supports this board by registering an mmc host for each slot, and it is possible to use both devices while shaing the controller resources.

Thanks,
Matt

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-18 11:37       ` Matt Redfearn
@ 2016-04-18 12:08         ` Ulf Hansson
  2016-04-18 12:57           ` Matt Redfearn
  0 siblings, 1 reply; 32+ messages in thread
From: Ulf Hansson @ 2016-04-18 12:08 UTC (permalink / raw)
  To: Matt Redfearn
  Cc: linux-mmc, Aleksey Makarov, Chandrakala Chavva, David Daney,
	Aleksey Makarov, Leonid Rosenboim, Peter Swain, Aaron Williams,
	Rob Herring, Ralf Baechle

[...]

>> Thanks for taking the time to reply and your detailed answer. I see why the bindings currently don't fit into the mmc core model. However, we still have the issue that these bindings are shipping in hardware and we can't change them. As far as I can see, ther major difference between these bindings and what is upstream, is the interpretation of the reg property, and the compatible string of what is connected to the controller. The Octeon MMC controller supports connections to up to 4 devices, either eMMC devices or SD card slots. I think it will be difficult to impossible to programatically interpret the device tree supplied by the bootloader to classify the slots as SD card / eMMC such that the necessary reg / compatible strings can be added.
>
>>
>> If we were to remove the slots entirely from the device tree, how would the core deal with having one controller with multiple devices attached? Obviously we need some locking of the shared controller between the child devices.
>
> Currently the core doesn't deal with multiple card slots and I doubt
> we ever will be adding that. The reason is simply because, in practice
> there don't exist such configurations, even if some HWs supports it. I
> assume that's the case here as well, right!?
>
> Kind regards
> Uffe
>
> Hi Ulf,
> Unfortunately not, for example the Rhinolabs UTM8 board which we have several of here (http://www.rhinolabsinc.com/rhino-sdna7130-networking-appliance/) contains an eMMC device attached to "slot 1" and a microSD card slot on "slot 2". The driver as it stands supports this board by registering an mmc host for each slot, and it is possible to use both devices while shaing the controller resources.

I see! Are you sure you don't need any additional out of tree patch
for the mmc core/block as well?

Moreover, of course these configurations suffers from poor
performance, but I guess that's irrelevant for these systems!?

Perhaps we can add specific DT property which tells about whether the
childnode is a slot? Then you will have to patch your DTB during boot
and add that information.

Or you have another suggestion?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-18 12:08         ` Ulf Hansson
@ 2016-04-18 12:57           ` Matt Redfearn
  2016-04-18 22:59             ` David Daney
  2016-04-19  9:15             ` Ulf Hansson
  0 siblings, 2 replies; 32+ messages in thread
From: Matt Redfearn @ 2016-04-18 12:57 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Aleksey Makarov, Chandrakala Chavva, David Daney,
	Aleksey Makarov, Leonid Rosenboim, Peter Swain, Aaron Williams,
	Rob Herring, Ralf Baechle

HI Ulf,

On 18/04/16 13:08, Ulf Hansson wrote:
> [...]
>
>>> Thanks for taking the time to reply and your detailed answer. I see why the bindings currently don't fit into the mmc core model. However, we still have the issue that these bindings are shipping in hardware and we can't change them. As far as I can see, ther major difference between these bindings and what is upstream, is the interpretation of the reg property, and the compatible string of what is connected to the controller. The Octeon MMC controller supports connections to up to 4 devices, either eMMC devices or SD card slots. I think it will be difficult to impossible to programatically interpret the device tree supplied by the bootloader to classify the slots as SD card / eMMC such that the necessary reg / compatible strings can be added.
>>> If we were to remove the slots entirely from the device tree, how would the core deal with having one controller with multiple devices attached? Obviously we need some locking of the shared controller between the child devices.
>> Currently the core doesn't deal with multiple card slots and I doubt
>> we ever will be adding that. The reason is simply because, in practice
>> there don't exist such configurations, even if some HWs supports it. I
>> assume that's the case here as well, right!?
>>
>> Kind regards
>> Uffe
>>
>> Hi Ulf,
>> Unfortunately not, for example the Rhinolabs UTM8 board which we have several of here (http://www.rhinolabsinc.com/rhino-sdna7130-networking-appliance/) contains an eMMC device attached to "slot 1" and a microSD card slot on "slot 2". The driver as it stands supports this board by registering an mmc host for each slot, and it is possible to use both devices while shaing the controller resources.
> I see! Are you sure you don't need any additional out of tree patch
> for the mmc core/block as well?

No, the driver applies to and works with unpatched upstream mmc core. It 
create one mmc_host struct per slot and registers it though 
mmc_add_host. Sharing the controller resources between slots is handled 
within the driver.

>
> Moreover, of course these configurations suffers from poor
> performance, but I guess that's irrelevant for these systems!?

Indeed, the eMMC and SD are not typically used together, and if they 
are, there's an explainable hardware performance limitation.

>
> Perhaps we can add specific DT property which tells about whether the
> childnode is a slot? Then you will have to patch your DTB during boot
> and add that information.

The DTB in currently shipping devices has a compatible property 
"cavium,octeon-6130-mmc-slot" for each slot. Perhaps that would do? Or, 
we could add a property such as "slot-num" rather than encoding this 
information in the reg property?

Thanks,
Matt

>
> Or you have another suggestion?
>
> Kind regards
> Uffe


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-18 12:57           ` Matt Redfearn
@ 2016-04-18 22:59             ` David Daney
  2016-04-19  9:15             ` Ulf Hansson
  1 sibling, 0 replies; 32+ messages in thread
From: David Daney @ 2016-04-18 22:59 UTC (permalink / raw)
  To: Matt Redfearn
  Cc: Ulf Hansson, linux-mmc, Aleksey Makarov, Chandrakala Chavva,
	David Daney, Aleksey Makarov, Leonid Rosenboim, Peter Swain,
	Aaron Williams, Rob Herring, Ralf Baechle

On 04/18/2016 05:57 AM, Matt Redfearn wrote:
> HI Ulf,
>
> On 18/04/16 13:08, Ulf Hansson wrote:
>> [...]
>>
>>>> Thanks for taking the time to reply and your detailed answer. I see why the bindings currently don't fit into the mmc core model. However, we still have the issue that these bindings are shipping in hardware and we can't change them. As far as I can see, ther major difference between these bindings and what is upstream, is the interpretation of the reg property, and the compatible string of what is connected to the controller. The Octeon MMC controller supports connections to up to 4 devices, either eMMC devices or SD card slots. I think it will be difficult to impossible to programatically interpret the device tree supplied by the bootloader to classify the slots as SD card / eMMC such that the necessary reg / compatible strings can be added.
>>>> If we were to remove the slots entirely from the device tree, how would the core deal with having one controller with multiple devices attached? Obviously we need some locking of the shared controller between the child devices.
>>> Currently the core doesn't deal with multiple card slots and I doubt
>>> we ever will be adding that. The reason is simply because, in practice
>>> there don't exist such configurations, even if some HWs supports it. I
>>> assume that's the case here as well, right!?
>>>
>>> Kind regards
>>> Uffe
>>>
>>> Hi Ulf,
>>> Unfortunately not, for example the Rhinolabs UTM8 board which we have several of here (http://www.rhinolabsinc.com/rhino-sdna7130-networking-appliance/) contains an eMMC device attached to "slot 1" and a microSD card slot on "slot 2". The driver as it stands supports this board by registering an mmc host for each slot, and it is possible to use both devices while shaing the controller resources.
>> I see! Are you sure you don't need any additional out of tree patch
>> for the mmc core/block as well?
>
> No, the driver applies to and works with unpatched upstream mmc core. It
> create one mmc_host struct per slot and registers it though
> mmc_add_host. Sharing the controller resources between slots is handled
> within the driver.
>
>>
>> Moreover, of course these configurations suffers from poor
>> performance, but I guess that's irrelevant for these systems!?
>
> Indeed, the eMMC and SD are not typically used together, and if they
> are, there's an explainable hardware performance limitation.
>

None of this matters as the hardware cannot be reconfigured or changed. 
  For better or worse, we have a single MMC bus controller that is 
multiplexed between four "slots", and commands between the various slots 
must be serialized to ensure there is no intra-slot contention of resources.

It actually gets even better...  On some systems, the MMC "slots" can 
contend with non MMC devices and the scope of the serialization is even 
wider (see the octeon_bootbus_sem in the driver).

>>
>> Perhaps we can add specific DT property which tells about whether the
>> childnode is a slot? Then you will have to patch your DTB during boot
>> and add that information.
>
> The DTB in currently shipping devices has a compatible property
> "cavium,octeon-6130-mmc-slot" for each slot. Perhaps that would do? Or,
> we could add a property such as "slot-num" rather than encoding this
> information in the reg property?

The most useful course forward would be to use the already widely 
deployed device tree binding that is implemented in the posted driver. 
Since the device tree is it supplied by the boot firmware of many 
commercially available devices (Ubiquti ER-8, Rhinolabs UTM8, etc.), 
this would allow the thing to function without jumping through hoops 
with a device-tree du jour at the whim of kernel device tree ABI tweakers.

Since the concept of an MMC "slot" is foreign to the kernel's MMC core, 
there is nothing to be gained by giving it a non-vendor specific 
property name like "slot-num".  By continuing to use 
"cavium,octeon-6130-mmc-slot", we denote that it is vendor specific 
thing for this weird hardware.

David Daney

>
> Thanks,
> Matt
>
>>
>> Or you have another suggestion?
>>
>> Kind regards
>> Uffe
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-18 12:57           ` Matt Redfearn
  2016-04-18 22:59             ` David Daney
@ 2016-04-19  9:15             ` Ulf Hansson
  2016-04-19 16:13               ` David Daney
  1 sibling, 1 reply; 32+ messages in thread
From: Ulf Hansson @ 2016-04-19  9:15 UTC (permalink / raw)
  To: Matt Redfearn, David Daney
  Cc: linux-mmc, Aleksey Makarov, Chandrakala Chavva, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams, Rob Herring,
	Ralf Baechle

[...]

>>> Unfortunately not, for example the Rhinolabs UTM8 board which we have several of here (http://www.rhinolabsinc.com/rhino-sdna7130-networking-appliance/) contains an eMMC device attached to "slot 1" and a microSD card slot on "slot 2". The driver as it stands supports this board by registering an mmc host for each slot, and it is possible to use both devices while shaing the controller resources.
>> I see! Are you sure you don't need any additional out of tree patch
>> for the mmc core/block as well?
>
> No, the driver applies to and works with unpatched upstream mmc core. It
> create one mmc_host struct per slot and registers it though
> mmc_add_host. Sharing the controller resources between slots is handled
> within the driver.

Okay, thanks for confirming that.

>
>>
>> Moreover, of course these configurations suffers from poor
>> performance, but I guess that's irrelevant for these systems!?
>
> Indeed, the eMMC and SD are not typically used together, and if they
> are, there's an explainable hardware performance limitation.

I guess there's even a performance penalty involved when they are not
used together, because of the locking overhead, right?

Perhaps that could be fixed by a clever locking implementation, unless
the overhead is negligible!?

>
>>
>> Perhaps we can add specific DT property which tells about whether the
>> childnode is a slot? Then you will have to patch your DTB during boot
>> and add that information.
>
> The DTB in currently shipping devices has a compatible property
> "cavium,octeon-6130-mmc-slot" for each slot. Perhaps that would do? Or,
> we could add a property such as "slot-num" rather than encoding this
> information in the reg property?

>From this discussion I have understood that we clearly need a way to
describe mmc slots in DT!

Unfortunate we should have discussed this before you decided to ship
devices with DTBs containing non accepted DT bindings, but hey better
late than never! :-)

Anyway, I am happy with describing mmc-slots in child nodes as you
proposed, but with one exception. I want a common mmc compatible
string telling that the child node is an mmc slot. Perhaps something
like "mmc-slot" could work or if you have another suggestion.

If such compatible isn't found in a child node from the host device
node, the mmc core shall continue to treat the child node as the
card/SDIO-func. Then we need to extend the mmc core to know about slot
nodes and parse for their child nodes, as those will then represent
the card/SDIO-func.

Do you think this could work?

Also, I did a quick review of the mmc octeon driver in patch 2/2 [1]
and found that there are several legacy DT bindings being parsed, but
which isn't documented in $subject patch. They should, and of course
they must be marked as legacy/deprecated. Let's go through an example
of a slot node from your already deployed DTBs. I have pasted an
example from [1] here.


        mmc-slot@2 {
                compatible = "cavium,octeon-6130-mmc-slot";

Convert to the common mmc slot compatible, whatever we decide to name it.

                reg = <2>;

This is fine, as the way to describe the slot number.

                voltage-ranges = <3300 3300>;

Regarding voltage ranges and power-gpios below, this shall be
converted to a GPIO regulator. In that way, the "voltage range" (in
mmc terminology OCR mask) will be fetched by the mmc core through the
mmc_regulator_get_supply() API as it becomes information about the
regulator.

                spi-max-frequency = <26000000>;

As you already done in [1], convert to "max-frequency"

                /* Power on GPIO 8, active high */
                /* power-gpios = <&gpio 8 0>; */
                power-gpios = <&gpio 8 1>;

See comment about voltage-ranges above.

        /*      spi-max-frequency = <52000000>; */
                /* bus width can be 1, 4 or 8 */
                cavium,bus-max-width = <8>;

As you already done in [1], convert to "bus-width"

        };

Mainly because of the changes for voltage range and power gpios, I
think you shall patch the DTB from arch/SoC specific code instead of
from the mmc octeon driver.

You need to describe a GPIO regulator in DT and then update the slot
node to contain a vmmc-supply handle to it.
I understand this may be a bit tricky, but to be clear - I am not
going to accept the voltage-range/power-gpios binding.

Kind regards
Uffe

[1]
http://www.spinics.net/lists/linux-mmc/msg36018.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-19  9:15             ` Ulf Hansson
@ 2016-04-19 16:13               ` David Daney
  2016-04-19 19:33                 ` Ulf Hansson
  0 siblings, 1 reply; 32+ messages in thread
From: David Daney @ 2016-04-19 16:13 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Matt Redfearn, David Daney, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, Aleksey Makarov, Leonid Rosenboim,
	Peter Swain, Aaron Williams, Rob Herring, Ralf Baechle

On 04/19/2016 02:15 AM, Ulf Hansson wrote:
[...]
>
>  From this discussion I have understood that we clearly need a way to
> describe mmc slots in DT!
>
> Unfortunate we should have discussed this before you decided to ship
> devices with DTBs containing non accepted DT bindings, but hey better
> late than never! :-)

This is a point where the DT and mmc maintainers have a continual 
misunderstanding of the facts.

There were requests to discuss the binding as far back as 2012:

https://www.linux-mips.org/archives/linux-mips/2012-05/msg00119.html

They were met with silence.

The firmware containing these bindings is out in the wild.  If we 
deprecate some of the bindings, the driver will still have to support 
them in the future.

In the case of OCTEON based devices, the device tree bindings are a 
firmware <--> kernel ABI as the device trees always come from the 
firmware and not some other place.

David Daney

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-19 16:13               ` David Daney
@ 2016-04-19 19:33                 ` Ulf Hansson
  2016-04-19 20:25                   ` David Daney
  0 siblings, 1 reply; 32+ messages in thread
From: Ulf Hansson @ 2016-04-19 19:33 UTC (permalink / raw)
  To: David Daney
  Cc: Matt Redfearn, David Daney, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, Aleksey Makarov, Leonid Rosenboim,
	Peter Swain, Aaron Williams, Rob Herring, Ralf Baechle

On 19 April 2016 at 18:13, David Daney <ddaney@caviumnetworks.com> wrote:
> On 04/19/2016 02:15 AM, Ulf Hansson wrote:
> [...]
>>
>>
>>  From this discussion I have understood that we clearly need a way to
>> describe mmc slots in DT!
>>
>> Unfortunate we should have discussed this before you decided to ship
>> devices with DTBs containing non accepted DT bindings, but hey better
>> late than never! :-)
>
>
> This is a point where the DT and mmc maintainers have a continual
> misunderstanding of the facts.
>
> There were requests to discuss the binding as far back as 2012:
>
> https://www.linux-mips.org/archives/linux-mips/2012-05/msg00119.html
>
> They were met with silence.

I am aware of that. I was not the maintainer of MMC back then, if I
where I would probably said the same thing as of today.

Moreover, for sure you could have been more persistent trying to get
peoples attention before you decided to deploy the DTB.

>
> The firmware containing these bindings is out in the wild.  If we deprecate
> some of the bindings, the driver will still have to support them in the
> future.
>
> In the case of OCTEON based devices, the device tree bindings are a firmware
> <--> kernel ABI as the device trees always come from the firmware and not
> some other place.

Now, I am not going spend more time arguing, instead I prefer if we
can be constructive.

As the maintainer of MMC, I have tried to be helpful by providing you
with my view on how we can move forward.

I don't think it's a big deal for you to implement something along
those lines for what I have requested, or is it?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-19 19:33                 ` Ulf Hansson
@ 2016-04-19 20:25                   ` David Daney
  2016-04-19 20:56                     ` Arnd Bergmann
  2016-04-20  9:32                     ` Ulf Hansson
  0 siblings, 2 replies; 32+ messages in thread
From: David Daney @ 2016-04-19 20:25 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Matt Redfearn, David Daney, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, Aleksey Makarov, Leonid Rosenboim,
	Peter Swain, Aaron Williams, Rob Herring, Ralf Baechle

On 04/19/2016 12:33 PM, Ulf Hansson wrote:
> On 19 April 2016 at 18:13, David Daney <ddaney@caviumnetworks.com> wrote:
>> On 04/19/2016 02:15 AM, Ulf Hansson wrote:
>> [...]
>>>
>>>
>>>   From this discussion I have understood that we clearly need a way to
>>> describe mmc slots in DT!
>>>
>>> Unfortunate we should have discussed this before you decided to ship
>>> devices with DTBs containing non accepted DT bindings, but hey better
>>> late than never! :-)
>>
>>
>> This is a point where the DT and mmc maintainers have a continual
>> misunderstanding of the facts.
>>
>> There were requests to discuss the binding as far back as 2012:
>>
>> https://www.linux-mips.org/archives/linux-mips/2012-05/msg00119.html
>>
>> They were met with silence.
>
> I am aware of that. I was not the maintainer of MMC back then, if I
> where I would probably said the same thing as of today.
>
> Moreover, for sure you could have been more persistent trying to get
> peoples attention before you decided to deploy the DTB.

We cannot change the past.  Our only concern should be to develop the 
simplest and cleanest overall implementation possible given the facts as 
they exist today.

>
>>
>> The firmware containing these bindings is out in the wild.  If we deprecate
>> some of the bindings, the driver will still have to support them in the
>> future.
>>
>> In the case of OCTEON based devices, the device tree bindings are a firmware
>> <--> kernel ABI as the device trees always come from the firmware and not
>> some other place.
>
> Now, I am not going spend more time arguing, instead I prefer if we
> can be constructive.
>
> As the maintainer of MMC, I have tried to be helpful by providing you
> with my view on how we can move forward.
>
> I don't think it's a big deal for you to implement something along
> those lines for what I have requested, or is it?

It is a matter of how much manipulation of the device tree we want to do 
before it is handed off to the driver core for device creation.  I would 
like to not do any.

There is a global cost to changing the device tree in early boot.  It 
may not be borne by the MMC sub-system, but it exists none the less.

Given that:

  A) The MMC core doesn't contain the concept of one bus controller with 
multiple "slots".

  B) The existing OCTEON device tree bindings should continue to be 
supported.

There are several options.

   1) Invent new MMC device tree bindings that are different from what 
we currently have, but that convey the same information.

   1a) Change the OCTEON MMC driver to use only these new bindings, and 
write some sort of device tree fix up code that runs in early boot to 
rewrite the device tree MMC properties.   This results in the code 
supporting the OCTEON MMC devices being split between the driver and 
system early boot code.  The cost is an increase in system complexity to 
generate the device tree nodes with new bindings.

   1b) Change the OCTEON MMC driver to use either these new bindings or 
legacy bindings.  In this case all OCTEON MMC code is localized to a 
single driver file.  There is a small increase in complexity to carry 
code to decode both new and legacy device tree bindings.

   2) Use existing OCTEON MMC device tree bindings, as they are 
sufficient to achieve a working driver.  Since the code is all specific 
to the OCTEON MMC driver, any ugliness is contained lexicographically 
near to the features being implemented.  Any feedback related to the 
architecture and style of the driver code would be addressed.

The current state is #2.  My interpretation of your desires is #1a.

I am fine with introducing a new device tree binding.  But, I don't 
think the kernel as a whole nor this specific OCTEON MMC driver will be 
improved by adding more device tree synthesis code in early boot.  Any 
thing that is there should be limited to supporting very old (pre OCTEON 
MMC) firmware that doesn't supply a device tree.  So I would strongly 
support either approach #1b or #2.

David Daney


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-03-31 15:26 ` [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver " Matt Redfearn
@ 2016-04-19 20:46   ` Arnd Bergmann
  2016-04-19 21:45     ` David Daney
  0 siblings, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2016-04-19 20:46 UTC (permalink / raw)
  To: Matt Redfearn
  Cc: ulf.hansson, linux-mmc, Aleksey Makarov, Chandrakala Chavva,
	David Daney, Aleksey Makarov, Leonid Rosenboim, Peter Swain,
	Aaron Williams

On Thursday 31 March 2016 16:26:53 Matt Redfearn wrote:
> +struct octeon_mmc_host {
> +	u64	base;
> +	u64	ndf_base;
> +	u64	emm_cfg;
> +	u64	n_minus_one;  /* OCTEON II workaround location */
> +	int	last_slot;
> +
> +	struct semaphore mmc_serializer;

Please don't add any new semaphores to the kernel, use a mutex or
a completion instead.

> +#if 0
> +#define octeon_mmc_dbg trace_printk
> +#else
> +static inline void octeon_mmc_dbg(const char *s, ...) { }
> +#endif

Remove this and use dev_dbg() or pr_debug(), it does the same thing.

> +static irqreturn_t octeon_mmc_interrupt(int irq, void *dev_id)

This function is rather long, can you split it up a bit for
readability?

> +{
> +	struct octeon_mmc_host *host = dev_id;
> +	union cvmx_mio_emm_int emm_int;
> +	struct mmc_request	*req;
> +	bool host_done;
> +	union cvmx_mio_emm_rsp_sts rsp_sts;
> +	unsigned long flags = 0;
> +
> +	if (host->need_irq_handler_lock)
> +		spin_lock_irqsave(&host->irq_handler_lock, flags);
> +	else
> +		__acquire(&host->irq_handler_lock);

The locking seems odd, why do you only sometimes have to take the lock,
and why do you disable interrupts from within the irq handler?

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-19 20:25                   ` David Daney
@ 2016-04-19 20:56                     ` Arnd Bergmann
  2016-04-19 21:50                       ` David Daney
  2016-04-20  9:32                     ` Ulf Hansson
  1 sibling, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2016-04-19 20:56 UTC (permalink / raw)
  To: David Daney
  Cc: Ulf Hansson, Matt Redfearn, David Daney, linux-mmc,
	Aleksey Makarov, Chandrakala Chavva, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams, Rob Herring,
	Ralf Baechle

On Tuesday 19 April 2016 13:25:13 David Daney wrote:
> 
> There are several options.
> 
>    1) Invent new MMC device tree bindings that are different from what 
> we currently have, but that convey the same information.
> 
>    1a) Change the OCTEON MMC driver to use only these new bindings, and 
> write some sort of device tree fix up code that runs in early boot to 
> rewrite the device tree MMC properties.   This results in the code 
> supporting the OCTEON MMC devices being split between the driver and 
> system early boot code.  The cost is an increase in system complexity to 
> generate the device tree nodes with new bindings.
> 
>    1b) Change the OCTEON MMC driver to use either these new bindings or 
> legacy bindings.  In this case all OCTEON MMC code is localized to a 
> single driver file.  There is a small increase in complexity to carry 
> code to decode both new and legacy device tree bindings.
> 
>    2) Use existing OCTEON MMC device tree bindings, as they are 
> sufficient to achieve a working driver.  Since the code is all specific 
> to the OCTEON MMC driver, any ugliness is contained lexicographically 
> near to the features being implemented.  Any feedback related to the 
> architecture and style of the driver code would be addressed.
> 
> The current state is #2.  My interpretation of your desires is #1a.
> 
> I am fine with introducing a new device tree binding.  But, I don't 
> think the kernel as a whole nor this specific OCTEON MMC driver will be 
> improved by adding more device tree synthesis code in early boot.  Any 
> thing that is there should be limited to supporting very old (pre OCTEON 
> MMC) firmware that doesn't supply a device tree.  So I would strongly 
> support either approach #1b or #2.

My interpretation of the v7 patch is that it is much closer to 1b than
to 2. I think that is a reasonable approach. The fact that it does not
change the compatible strings of the child nodes seems ok to me too:
We can document the binding as requiring "mmc-slot" for the generic
mmc binding. This driver just doesn't look at the compatible string
for the child nodes, so it also doesn't have to change them.

I would also recommend documenting the properties that the firmware
implements in the binding document, simply listing them as "deprecated"
(or whichever word you want to use for describing them). However, I would
ask that you only allow the old-style properties for the compatible
strings that are used in the shipped DT blobs but require the use
of an updated compatible string for the new binding because the two
are not really compatible.

Given how rare the multi-slot controllers are (this being the first
known example), I think it would be good to keep that concept out of
the mmc subsystem and only do the minimum necessary documentation
for the generic binding -- conceptually we can simply treat a
cavium,octeon-6130-mmc device node as special kind of bus, and each
of its subnodes as the actual controller (this is what the driver does
anyway).

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-19 20:46   ` Arnd Bergmann
@ 2016-04-19 21:45     ` David Daney
  2016-04-19 22:09       ` Arnd Bergmann
  0 siblings, 1 reply; 32+ messages in thread
From: David Daney @ 2016-04-19 21:45 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Matt Redfearn, ulf.hansson, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On 04/19/2016 01:46 PM, Arnd Bergmann wrote:
> On Thursday 31 March 2016 16:26:53 Matt Redfearn wrote:
>> +struct octeon_mmc_host {
>> +	u64	base;
>> +	u64	ndf_base;
>> +	u64	emm_cfg;
>> +	u64	n_minus_one;  /* OCTEON II workaround location */
>> +	int	last_slot;
>> +
>> +	struct semaphore mmc_serializer;
>
> Please don't add any new semaphores to the kernel, use a mutex or
> a completion instead.

The last time I checked, a mutex could not be used from interrupt context.

Since we are in interrupt context and we really want mutex-like behavior 
here, it seems like a semaphore is just the thing we need.

I am not sure how completions would be of use, perhaps you could elaborate.

>
>> +#if 0
>> +#define octeon_mmc_dbg trace_printk
>> +#else
>> +static inline void octeon_mmc_dbg(const char *s, ...) { }
>> +#endif
>
> Remove this and use dev_dbg() or pr_debug(), it does the same thing.

It is not the same thing.  pr_debug has *way* more overhead than 
trace_printk has it also acquires locks that can cause system lockups to 
happen.  The driver doesn't work with pr_debug().

We could just remove this *and* all calls to octeon_mmc_dbg, but 
switching to pr_debug() is not an option.

>
>> +static irqreturn_t octeon_mmc_interrupt(int irq, void *dev_id)
>
> This function is rather long, can you split it up a bit for
> readability?
>
>> +{
>> +	struct octeon_mmc_host *host = dev_id;
>> +	union cvmx_mio_emm_int emm_int;
>> +	struct mmc_request	*req;
>> +	bool host_done;
>> +	union cvmx_mio_emm_rsp_sts rsp_sts;
>> +	unsigned long flags = 0;
>> +
>> +	if (host->need_irq_handler_lock)
>> +		spin_lock_irqsave(&host->irq_handler_lock, flags);
>> +	else
>> +		__acquire(&host->irq_handler_lock);
>
> The locking seems odd, why do you only sometimes have to take the lock,

In the cn78xx_style case there are multiple irqs with this handler.  in 
the !cn78xx_style case there is a single irq.

The multiple irq case is what we are protecting.  Without the spinlock, 
there are races between the handler threads of the several irqs that can 
fire.

> and why do you disable interrupts from within the irq handler?
>

That may be gratuitous, although in the threaded interrupt handler case 
it may be needed.  I guess that has to be investigated.

David Daney

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-19 20:56                     ` Arnd Bergmann
@ 2016-04-19 21:50                       ` David Daney
  0 siblings, 0 replies; 32+ messages in thread
From: David Daney @ 2016-04-19 21:50 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ulf Hansson, Matt Redfearn, David Daney, linux-mmc,
	Aleksey Makarov, Chandrakala Chavva, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams, Rob Herring,
	Ralf Baechle

On 04/19/2016 01:56 PM, Arnd Bergmann wrote:
> On Tuesday 19 April 2016 13:25:13 David Daney wrote:
>>
>> There are several options.
>>
>>     1) Invent new MMC device tree bindings that are different from what
>> we currently have, but that convey the same information.
>>
>>     1a) Change the OCTEON MMC driver to use only these new bindings, and
>> write some sort of device tree fix up code that runs in early boot to
>> rewrite the device tree MMC properties.   This results in the code
>> supporting the OCTEON MMC devices being split between the driver and
>> system early boot code.  The cost is an increase in system complexity to
>> generate the device tree nodes with new bindings.
>>
>>     1b) Change the OCTEON MMC driver to use either these new bindings or
>> legacy bindings.  In this case all OCTEON MMC code is localized to a
>> single driver file.  There is a small increase in complexity to carry
>> code to decode both new and legacy device tree bindings.
>>
>>     2) Use existing OCTEON MMC device tree bindings, as they are
>> sufficient to achieve a working driver.  Since the code is all specific
>> to the OCTEON MMC driver, any ugliness is contained lexicographically
>> near to the features being implemented.  Any feedback related to the
>> architecture and style of the driver code would be addressed.
>>
>> The current state is #2.  My interpretation of your desires is #1a.
>>
>> I am fine with introducing a new device tree binding.  But, I don't
>> think the kernel as a whole nor this specific OCTEON MMC driver will be
>> improved by adding more device tree synthesis code in early boot.  Any
>> thing that is there should be limited to supporting very old (pre OCTEON
>> MMC) firmware that doesn't supply a device tree.  So I would strongly
>> support either approach #1b or #2.
>
> My interpretation of the v7 patch is that it is much closer to 1b than
> to 2. I think that is a reasonable approach. The fact that it does not
> change the compatible strings of the child nodes seems ok to me too:
> We can document the binding as requiring "mmc-slot" for the generic
> mmc binding. This driver just doesn't look at the compatible string
> for the child nodes, so it also doesn't have to change them.
>
> I would also recommend documenting the properties that the firmware
> implements in the binding document, simply listing them as "deprecated"
> (or whichever word you want to use for describing them). However, I would
> ask that you only allow the old-style properties for the compatible
> strings that are used in the shipped DT blobs but require the use
> of an updated compatible string for the new binding because the two
> are not really compatible.
>
> Given how rare the multi-slot controllers are (this being the first
> known example), I think it would be good to keep that concept out of
> the mmc subsystem and only do the minimum necessary documentation
> for the generic binding -- conceptually we can simply treat a
> cavium,octeon-6130-mmc device node as special kind of bus, and each
> of its subnodes as the actual controller (this is what the driver does
> anyway).
>

Thanks Arnd,

This seems like a good path forward, and I agree that the device tree 
bindings documentation should be changed to note the deprecation of some 
of the properties.

David Daney


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-19 21:45     ` David Daney
@ 2016-04-19 22:09       ` Arnd Bergmann
  2016-04-19 23:27         ` David Daney
  0 siblings, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2016-04-19 22:09 UTC (permalink / raw)
  To: David Daney
  Cc: Matt Redfearn, ulf.hansson, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On Tuesday 19 April 2016 14:45:35 David Daney wrote:
> On 04/19/2016 01:46 PM, Arnd Bergmann wrote:
> > On Thursday 31 March 2016 16:26:53 Matt Redfearn wrote:
> >> +struct octeon_mmc_host {
> >> +	u64	base;
> >> +	u64	ndf_base;
> >> +	u64	emm_cfg;
> >> +	u64	n_minus_one;  /* OCTEON II workaround location */
> >> +	int	last_slot;
> >> +
> >> +	struct semaphore mmc_serializer;
> >
> > Please don't add any new semaphores to the kernel, use a mutex or
> > a completion instead.
> 
> The last time I checked, a mutex could not be used from interrupt context.
> 
> Since we are in interrupt context and we really want mutex-like behavior 
> here, it seems like a semaphore is just the thing we need.
> 
> I am not sure how completions would be of use, perhaps you could elaborate.

Completions are used when you have one thread waiting for an event,
which is often an interrupt: the process calls
wait_for_completion(&completion); and the interrupt handler calls
complete(&completion);

It seems that you are using the semaphore for two reasons here (I
only read it briefly so I may be wrong):
waiting for the interrupt handler and serializing against another
thread. In this case you need both a mutex (to guarantee mutual
exclusion) and a completion (to wait for the interrupt handler
to finish).

> >> +#if 0
> >> +#define octeon_mmc_dbg trace_printk
> >> +#else
> >> +static inline void octeon_mmc_dbg(const char *s, ...) { }
> >> +#endif
> >
> > Remove this and use dev_dbg() or pr_debug(), it does the same thing.
> 
> It is not the same thing.  pr_debug has *way* more overhead than 
> trace_printk has it also acquires locks that can cause system lockups to 
> happen.  The driver doesn't work with pr_debug().
> 
> We could just remove this *and* all calls to octeon_mmc_dbg, but 
> switching to pr_debug() is not an option.

Ok, I failed to realize that trace_printk() is a generic feature,
not something you defined here.

Anyway, pr_debug() does nothing when the 'DEBUG' macro is not defined
and CONFIG_DYNAMIC_DEBUG is not enabled, which is the normal case
(as your #if 0/#else is), so there is still zero overhead.

If the dynamic debug gets enabled, the overhead is may be higher
than trace_printk, but I don't think you could get into a case where
it hangs the system, printk() is intentionally callable from any
context except from the serial driver output and from non-maskable
interrupts.

> >> +static irqreturn_t octeon_mmc_interrupt(int irq, void *dev_id)
> >
> > This function is rather long, can you split it up a bit for
> > readability?
> >
> >> +{
> >> +	struct octeon_mmc_host *host = dev_id;
> >> +	union cvmx_mio_emm_int emm_int;
> >> +	struct mmc_request	*req;
> >> +	bool host_done;
> >> +	union cvmx_mio_emm_rsp_sts rsp_sts;
> >> +	unsigned long flags = 0;
> >> +
> >> +	if (host->need_irq_handler_lock)
> >> +		spin_lock_irqsave(&host->irq_handler_lock, flags);
> >> +	else
> >> +		__acquire(&host->irq_handler_lock);
> >
> > The locking seems odd, why do you only sometimes have to take the lock,
> 
> In the cn78xx_style case there are multiple irqs with this handler.  in 
> the !cn78xx_style case there is a single irq.
> 
> The multiple irq case is what we are protecting.  Without the spinlock, 
> there are races between the handler threads of the several irqs that can 
> fire.

Ok, I see. How about rearranging the code in a way that doesn't
need the check or the __acquire? You could do this as

static irqreturn_t octeon_mmc_multi_interrupt(int irq, void *dev_id)
{
	struct octeon_mmc_host *host = dev_id;
	irqreturn_t ret;

	spin_lock(&host->irq_handler_lock);
	ret = octeon_mmc_interrupt(irq, dev_id);
	spin_unlock(&host->irq_handler_lock);

	return ret;
}

> > and why do you disable interrupts from within the irq handler?
> >
> 
> That may be gratuitous, although in the threaded interrupt handler case 
> it may be needed.  I guess that has to be investigated.

Ok. I checked first that this is not a threaded handler. In case
of fully preemptable kernels, all interrupt handlers are threaded,
but then spin_lock_irqsave() does not turn off interrupts but only
prevent handlers from running.

	Arnd


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-19 22:09       ` Arnd Bergmann
@ 2016-04-19 23:27         ` David Daney
  2016-04-19 23:57           ` Arnd Bergmann
  2016-04-21  8:02           ` Ulf Hansson
  0 siblings, 2 replies; 32+ messages in thread
From: David Daney @ 2016-04-19 23:27 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Matt Redfearn, ulf.hansson, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On 04/19/2016 03:09 PM, Arnd Bergmann wrote:
> On Tuesday 19 April 2016 14:45:35 David Daney wrote:
>> On 04/19/2016 01:46 PM, Arnd Bergmann wrote:
>>> On Thursday 31 March 2016 16:26:53 Matt Redfearn wrote:
>>>> +struct octeon_mmc_host {
>>>> +	u64	base;
>>>> +	u64	ndf_base;
>>>> +	u64	emm_cfg;
>>>> +	u64	n_minus_one;  /* OCTEON II workaround location */
>>>> +	int	last_slot;
>>>> +
>>>> +	struct semaphore mmc_serializer;
>>>
>>> Please don't add any new semaphores to the kernel, use a mutex or
>>> a completion instead.
>>
>> The last time I checked, a mutex could not be used from interrupt context.
>>
>> Since we are in interrupt context and we really want mutex-like behavior
>> here, it seems like a semaphore is just the thing we need.
>>
>> I am not sure how completions would be of use, perhaps you could elaborate.
>
> Completions are used when you have one thread waiting for an event,
> which is often an interrupt: the process calls
> wait_for_completion(&completion); and the interrupt handler calls
> complete(&completion);
>
> It seems that you are using the semaphore for two reasons here (I
> only read it briefly so I may be wrong):
> waiting for the interrupt handler and serializing against another
> thread. In this case you need both a mutex (to guarantee mutual
> exclusion) and a completion (to wait for the interrupt handler
> to finish).
>

The way the MMC driver works is that the driver's .request() method is 
called to initiate a request.   After .request() is finished, it returns 
back to the kernel so other work can be done.

 From the interrupt handler, when the request is complete, the interrupt 
handler calls req->done(req); to terminate the whole thing.


   So we have:

   CPU-A                      CPU-B                  CPU-C

   octeon_mmc_request(0)        .                     .
      down()                    .                     .
      queue_request(0);         .                     .
      return;                   .                     .
   other_useful_work            .                     .
    .                           .                     .
    .                           .                     .
    .                           .                     .
   octeon_mmc_request(1)        .                     .
      down() -> blocks          .                     .
                             octeon_mmc_interrupt()   .
                                 up() -> unblocks     .
      down() <-unblocks          req->done(0)         .
      queue_request(1);          return;              .
      return;                   .                     .
   other_useful_work            .                     .
    .                           .                octeon_mmc_interrupt()
    .                           .                     up()
    .                           .                     req->done(1)
    .                           .                     return;
    .                           .                     .


We don't want to have the thread on CPU-A wait around in an extra mutex 
or completion for the command to finish.  The MMC core already has its 
own request waiting code, but it doesn't handle the concept of a slot. 
These commands can take hundreds or thousands of mS to terminate.  The 
whole idea of the MMC framework is to queue the request and get back to 
doing other work ASAP.

In the case of this octeon_mmc driver we need to serialize the commands 
issued to multiple slots, for this we use the semaphore.  If you don't 
like struct semaphore, we would have to invent a proprietary wait queue 
mechanism that has semantics nearly identical to struct semaphore, and 
people would complain that we are reinventing the semaphore.

It doesn't seem clean to cobble up multiple waiting structures 
(completion + mutex + logic that surely would contain errors) where a 
single (well debugged) struct semaphore does what we want.


David Daney



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-19 23:27         ` David Daney
@ 2016-04-19 23:57           ` Arnd Bergmann
  2016-04-20  0:02             ` Arnd Bergmann
  2016-04-21  8:02           ` Ulf Hansson
  1 sibling, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2016-04-19 23:57 UTC (permalink / raw)
  To: David Daney
  Cc: Matt Redfearn, ulf.hansson, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On Tuesday 19 April 2016 16:27:03 David Daney wrote:
> The way the MMC driver works is that the driver's .request() method is 
> called to initiate a request.   After .request() is finished, it returns 
> back to the kernel so other work can be done.
> 
>  From the interrupt handler, when the request is complete, the interrupt 
> handler calls req->done(req); to terminate the whole thing.
> 
> 
>    So we have:
> 
>    CPU-A                      CPU-B                  CPU-C
> 
>    octeon_mmc_request(0)        .                     .
>       down()                    .                     .
>       queue_request(0);         .                     .
>       return;                   .                     .
>    other_useful_work            .                     .
>     .                           .                     .
>     .                           .                     .
>     .                           .                     .
>    octeon_mmc_request(1)        .                     .
>       down() -> blocks          .                     .
>                              octeon_mmc_interrupt()   .
>                                  up() -> unblocks     .
>       down() <-unblocks          req->done(0)         .
>       queue_request(1);          return;              .
>       return;                   .                     .
>    other_useful_work            .                     .
>     .                           .                octeon_mmc_interrupt()
>     .                           .                     up()
>     .                           .                     req->done(1)
>     .                           .                     return;
>     .                           .                     .
> 
> 
> We don't want to have the thread on CPU-A wait around in an extra mutex 
> or completion for the command to finish.  The MMC core already has its 
> own request waiting code, but it doesn't handle the concept of a slot. 
> These commands can take hundreds or thousands of mS to terminate.  The 
> whole idea of the MMC framework is to queue the request and get back to 
> doing other work ASAP.

Right. This usecase seems to be one of the few that legitimately use
semaphores. However, there has been a long effort to eliminate the
remaining semaphores from the kernel (mostly much stalled since 2.6.32,
but we still try to prevent new ones from creeping in).

I had at one point a patch series that removed all the remaining
semaphores, but didn't get it merged at the time, and now there
are a few dozen new users.

> In the case of this octeon_mmc driver we need to serialize the commands 
> issued to multiple slots, for this we use the semaphore.  If you don't 
> like struct semaphore, we would have to invent a proprietary wait queue 
> mechanism that has semantics nearly identical to struct semaphore, and 
> people would complain that we are reinventing the semaphore.
> 
> It doesn't seem clean to cobble up multiple waiting structures 
> (completion + mutex + logic that surely would contain errors) where a 
> single (well debugged) struct semaphore does what we want.

I think using a wait_event() call could make do this with basically
the same amount of code and by naming it right, this could be just
as expressive. For the usecase you describe, a single completion
structure would actually serve the purpose as well (no need for the
mutex), but it would be unusual to wait for the completion before
starting the work.

If you do this

	wait_event(&host->wq, !host->current_req);

in the request function as well as 

	wake_up(&host->wq);

after setting host->current_req to NULL, I think you end up with
the same level of readability as the semaphore, and slightly
better object code, and you can drop the 'WARN_ON(host->current_req);'
which would be known to be impossible.

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-19 23:57           ` Arnd Bergmann
@ 2016-04-20  0:02             ` Arnd Bergmann
  0 siblings, 0 replies; 32+ messages in thread
From: Arnd Bergmann @ 2016-04-20  0:02 UTC (permalink / raw)
  To: David Daney
  Cc: Matt Redfearn, ulf.hansson, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On Wednesday 20 April 2016 01:57:08 Arnd Bergmann wrote:
> If you do this
> 
>         wait_event(&host->wq, !host->current_req);
> 
> in the request function as well as 
> 
>         wake_up(&host->wq);
> 
> after setting host->current_req to NULL, I think you end up with
> the same level of readability as the semaphore, and slightly
> better object code, and you can drop the 'WARN_ON(host->current_req);'
> which would be known to be impossible.
> 

Correction: you would still need some form of serialization to
ensure that only one thread can set the host->current_req
at any time, so it's slightly more complex, either using
a mutex around the wait_event/assignment, or doing

	wait_event(&host->wq, !cmpxchg(host->current_req, NULL, mrq));

which probably wants to be encapsulated in an inline function
or annotated with a comment.

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-19 20:25                   ` David Daney
  2016-04-19 20:56                     ` Arnd Bergmann
@ 2016-04-20  9:32                     ` Ulf Hansson
  2016-04-20 22:32                       ` David Daney
  1 sibling, 1 reply; 32+ messages in thread
From: Ulf Hansson @ 2016-04-20  9:32 UTC (permalink / raw)
  To: David Daney
  Cc: Matt Redfearn, David Daney, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, Aleksey Makarov, Leonid Rosenboim,
	Peter Swain, Aaron Williams, Rob Herring, Ralf Baechle

[...]

>
> We cannot change the past.  Our only concern should be to develop the
> simplest and cleanest overall implementation possible given the facts as
> they exist today.
>

I understand, but it seems like we don't agree on what is the "cleanest".

>>
>>>
>>> The firmware containing these bindings is out in the wild.  If we
>>> deprecate
>>> some of the bindings, the driver will still have to support them in the
>>> future.
>>>
>>> In the case of OCTEON based devices, the device tree bindings are a
>>> firmware
>>> <--> kernel ABI as the device trees always come from the firmware and not
>>> some other place.
>>
>>
>> Now, I am not going spend more time arguing, instead I prefer if we
>> can be constructive.
>>
>> As the maintainer of MMC, I have tried to be helpful by providing you
>> with my view on how we can move forward.
>>
>> I don't think it's a big deal for you to implement something along
>> those lines for what I have requested, or is it?
>
>
> It is a matter of how much manipulation of the device tree we want to do
> before it is handed off to the driver core for device creation.  I would
> like to not do any.
>
> There is a global cost to changing the device tree in early boot.  It may
> not be borne by the MMC sub-system, but it exists none the less.

I don't think your concern is right.

What I request are *small* updates to the DTB from arch/SoC specific
code, those should be really simple to fix.

The benefit is that the driver becomes more portable and it don't have
to carry around supporting legacy bindings.

Moreover, for new SoCs revisions, which still may re-use the same MMC
controller, the DTB patching isn't needed.

>
> Given that:
>
>  A) The MMC core doesn't contain the concept of one bus controller with
> multiple "slots".
>
>  B) The existing OCTEON device tree bindings should continue to be
> supported.
>
> There are several options.
>
>   1) Invent new MMC device tree bindings that are different from what we
> currently have, but that convey the same information.
>
>   1a) Change the OCTEON MMC driver to use only these new bindings, and write
> some sort of device tree fix up code that runs in early boot to rewrite the
> device tree MMC properties.   This results in the code supporting the OCTEON
> MMC devices being split between the driver and system early boot code.  The
> cost is an increase in system complexity to generate the device tree nodes
> with new bindings.
>
>   1b) Change the OCTEON MMC driver to use either these new bindings or
> legacy bindings.  In this case all OCTEON MMC code is localized to a single
> driver file.  There is a small increase in complexity to carry code to
> decode both new and legacy device tree bindings.
>
>   2) Use existing OCTEON MMC device tree bindings, as they are sufficient to
> achieve a working driver.  Since the code is all specific to the OCTEON MMC
> driver, any ugliness is contained lexicographically near to the features
> being implemented.  Any feedback related to the architecture and style of
> the driver code would be addressed.
>
> The current state is #2.  My interpretation of your desires is #1a.
>
> I am fine with introducing a new device tree binding.  But, I don't think
> the kernel as a whole nor this specific OCTEON MMC driver will be improved
> by adding more device tree synthesis code in early boot.  Any thing that is
> there should be limited to supporting very old (pre OCTEON MMC) firmware
> that doesn't supply a device tree.  So I would strongly support either
> approach #1b or #2.

Let me elaborate once more on how I see the way forward.

For A):
I have suggested a solution that I think can be generic, see my earlier email.

>From the DTB point of view, I request you to update the slot
compatible string to a generic one. Is that a difficult task to patch
the DTB with?
If so, let's keep yours as well, but make sure it's documented as deprecated.

Regarding the changes needed to the mmc core, as to enable it to know
about mmc-slots, this should be quite easy to implement. I even
volunteer to can help, if you think it's needed.

So to summarize regarding A). I want a generic solution for slot nodes!

For B), there are two cases:
1. Legacy bindings that already has a corresponding generic MMC
binding. Renaming these properties by patching the DTB is an easy
operation.

2. Regarding the DT bindings for "power-gpios" and "voltage-ranges".
Under *no* circumstances I won't accept any similar bindings. Instead
a GPIO regulator shall be used and I have explained why in an earlier
response.

I do realize that patching the DTB to create the GPIO regulator is not
an easy task, that was my initial thought but let's just avoid that!

Instead, you can parse the DTB from arch/SoC specific code to find the
slot compatible string. Then continue to search for the
power-gpios/voltage-ranges DT properties and register a GPIO regulator
via the regulator API for the slot-node device. I believe this should
be an easy task for you to implement, right!?

Then the mmc octeon driver can ignore the power-gpios and
voltage-ranges DT bindings and instead just use
mmc_regulator_get_supply() (and other mmc regulator APIs from the mmc
core) to deal with power to the card.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-20  9:32                     ` Ulf Hansson
@ 2016-04-20 22:32                       ` David Daney
  2016-04-20 22:42                         ` Arnd Bergmann
  0 siblings, 1 reply; 32+ messages in thread
From: David Daney @ 2016-04-20 22:32 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Matt Redfearn, David Daney, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, Aleksey Makarov, Leonid Rosenboim,
	Peter Swain, Aaron Williams, Rob Herring, Ralf Baechle

On 04/20/2016 02:32 AM, Ulf Hansson wrote:
[...]
>>
>> It is a matter of how much manipulation of the device tree we want to do
>> before it is handed off to the driver core for device creation.  I would
>> like to not do any.
>>
>> There is a global cost to changing the device tree in early boot.  It may
>> not be borne by the MMC sub-system, but it exists none the less.
>
> I don't think your concern is right.
>
> What I request are *small* updates to the DTB from arch/SoC specific
> code, those should be really simple to fix.
>
> The benefit is that the driver becomes more portable and it don't have
> to carry around supporting legacy bindings.
>
> Moreover, for new SoCs revisions, which still may re-use the same MMC
> controller, the DTB patching isn't needed.
>
>>
>> Given that:
>>
>>   A) The MMC core doesn't contain the concept of one bus controller with
>> multiple "slots".
>>
>>   B) The existing OCTEON device tree bindings should continue to be
>> supported.
>>
>> There are several options.
>>
>>    1) Invent new MMC device tree bindings that are different from what we
>> currently have, but that convey the same information.
>>
>>    1a) Change the OCTEON MMC driver to use only these new bindings, and write
>> some sort of device tree fix up code that runs in early boot to rewrite the
>> device tree MMC properties.   This results in the code supporting the OCTEON
>> MMC devices being split between the driver and system early boot code.  The
>> cost is an increase in system complexity to generate the device tree nodes
>> with new bindings.
>>
>>    1b) Change the OCTEON MMC driver to use either these new bindings or
>> legacy bindings.  In this case all OCTEON MMC code is localized to a single
>> driver file.  There is a small increase in complexity to carry code to
>> decode both new and legacy device tree bindings.
>>
>>    2) Use existing OCTEON MMC device tree bindings, as they are sufficient to
>> achieve a working driver.  Since the code is all specific to the OCTEON MMC
>> driver, any ugliness is contained lexicographically near to the features
>> being implemented.  Any feedback related to the architecture and style of
>> the driver code would be addressed.
>>
>> The current state is #2.  My interpretation of your desires is #1a.
>>
>> I am fine with introducing a new device tree binding.  But, I don't think
>> the kernel as a whole nor this specific OCTEON MMC driver will be improved
>> by adding more device tree synthesis code in early boot.  Any thing that is
>> there should be limited to supporting very old (pre OCTEON MMC) firmware
>> that doesn't supply a device tree.  So I would strongly support either
>> approach #1b or #2.
>
> Let me elaborate once more on how I see the way forward.
>
> For A):
> I have suggested a solution that I think can be generic, see my earlier email.
>
>  From the DTB point of view, I request you to update the slot
> compatible string to a generic one. Is that a difficult task to patch
> the DTB with?

It depends on the length of the new compatible property.  If it is 
longer than the old property, then it is much more difficult.


> If so, let's keep yours as well, but make sure it's documented as deprecated.
>
> Regarding the changes needed to the mmc core, as to enable it to know
> about mmc-slots, this should be quite easy to implement. I even
> volunteer to can help, if you think it's needed.
>
> So to summarize regarding A). I want a generic solution for slot nodes!
>
> For B), there are two cases:
> 1. Legacy bindings that already has a corresponding generic MMC
> binding. Renaming these properties by patching the DTB is an easy
> operation.

It is not so easy to rename things in the DTB.  Any renaming causes the 
string table to grow, so you have to have to allocate extra space for 
it.  Currently everything we do with the DTB is done in-place, so you 
would have to rewrite the early DTB handling code to allocate memory and 
make a copy of the DTB.

>
> 2. Regarding the DT bindings for "power-gpios" and "voltage-ranges".
> Under *no* circumstances I won't accept any similar bindings. Instead
> a GPIO regulator shall be used and I have explained why in an earlier
> response.
>

power-gpios is somewhat of a misnomer.  It may not even control the 
device power supply, on some boards it just isolates the data lines so 
the MMC/SD cannot influence the bus.

voltage-ranges I almost think we can drop, and not deal with.


> I do realize that patching the DTB to create the GPIO regulator is not
> an easy task, that was my initial thought but let's just avoid that!
>
> Instead, you can parse the DTB from arch/SoC specific code to find the
> slot compatible string. Then continue to search for the
> power-gpios/voltage-ranges DT properties and register a GPIO regulator
> via the regulator API for the slot-node device. I believe this should
> be an easy task for you to implement, right!?
>
> Then the mmc octeon driver can ignore the power-gpios and
> voltage-ranges DT bindings and instead just use
> mmc_regulator_get_supply() (and other mmc regulator APIs from the mmc
> core) to deal with power to the card.
>

We will consider that.

David Daney



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller
  2016-04-20 22:32                       ` David Daney
@ 2016-04-20 22:42                         ` Arnd Bergmann
  0 siblings, 0 replies; 32+ messages in thread
From: Arnd Bergmann @ 2016-04-20 22:42 UTC (permalink / raw)
  To: David Daney
  Cc: Ulf Hansson, Matt Redfearn, David Daney, linux-mmc,
	Aleksey Makarov, Chandrakala Chavva, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams, Rob Herring,
	Ralf Baechle

On Wednesday 20 April 2016 15:32:56 David Daney wrote:
> >
> > For A):
> > I have suggested a solution that I think can be generic, see my earlier email.
> >
> >  From the DTB point of view, I request you to update the slot
> > compatible string to a generic one. Is that a difficult task to patch
> > the DTB with?
> 
> It depends on the length of the new compatible property.  If it is 
> longer than the old property, then it is much more difficult.
> 
> 
> > If so, let's keep yours as well, but make sure it's documented as deprecated.
> >
> > Regarding the changes needed to the mmc core, as to enable it to know
> > about mmc-slots, this should be quite easy to implement. I even
> > volunteer to can help, if you think it's needed.
> >
> > So to summarize regarding A). I want a generic solution for slot nodes!
> >
> > For B), there are two cases:
> > 1. Legacy bindings that already has a corresponding generic MMC
> > binding. Renaming these properties by patching the DTB is an easy
> > operation.
> 
> It is not so easy to rename things in the DTB.  Any renaming causes the 
> string table to grow, so you have to have to allocate extra space for 
> it.  Currently everything we do with the DTB is done in-place, so you 
> would have to rewrite the early DTB handling code to allocate memory and 
> make a copy of the DTB.

Doesn't libfdt do both of these things for you?

I was expecting that you could just call fdt_setprop() and fdt_set_name(),
but I have not tried this myself.

On ARM, we actually modify the in-kernel devicetree representation
after unflattening, see e.g. arch/arm/mach-mvebu/kirkwood.c
for a file calling of_update_property(), but it seems that octeon
does its fixups at an earlier stagge using libfdt.


	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-19 23:27         ` David Daney
  2016-04-19 23:57           ` Arnd Bergmann
@ 2016-04-21  8:02           ` Ulf Hansson
  2016-04-21 10:15             ` Arnd Bergmann
  1 sibling, 1 reply; 32+ messages in thread
From: Ulf Hansson @ 2016-04-21  8:02 UTC (permalink / raw)
  To: David Daney
  Cc: Arnd Bergmann, Matt Redfearn, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On 20 April 2016 at 01:27, David Daney <ddaney@caviumnetworks.com> wrote:
> On 04/19/2016 03:09 PM, Arnd Bergmann wrote:
>>
>> On Tuesday 19 April 2016 14:45:35 David Daney wrote:
>>>
>>> On 04/19/2016 01:46 PM, Arnd Bergmann wrote:
>>>>
>>>> On Thursday 31 March 2016 16:26:53 Matt Redfearn wrote:
>>>>>
>>>>> +struct octeon_mmc_host {
>>>>> +       u64     base;
>>>>> +       u64     ndf_base;
>>>>> +       u64     emm_cfg;
>>>>> +       u64     n_minus_one;  /* OCTEON II workaround location */
>>>>> +       int     last_slot;
>>>>> +
>>>>> +       struct semaphore mmc_serializer;
>>>>
>>>>
>>>> Please don't add any new semaphores to the kernel, use a mutex or
>>>> a completion instead.
>>>
>>>
>>> The last time I checked, a mutex could not be used from interrupt
>>> context.
>>>
>>> Since we are in interrupt context and we really want mutex-like behavior
>>> here, it seems like a semaphore is just the thing we need.

So the question I have is *why* do you have to be in IRQ context when
using the semaphore...

I would rather see that you use a threaded IRQ handler, perhaps in
conjunction with a hard IRQ handler if that is needed.

>>>
>>> I am not sure how completions would be of use, perhaps you could
>>> elaborate.
>>
>>
>> Completions are used when you have one thread waiting for an event,
>> which is often an interrupt: the process calls
>> wait_for_completion(&completion); and the interrupt handler calls
>> complete(&completion);
>>
>> It seems that you are using the semaphore for two reasons here (I
>> only read it briefly so I may be wrong):
>> waiting for the interrupt handler and serializing against another
>> thread. In this case you need both a mutex (to guarantee mutual
>> exclusion) and a completion (to wait for the interrupt handler
>> to finish).
>>
>
> The way the MMC driver works is that the driver's .request() method is
> called to initiate a request.   After .request() is finished, it returns
> back to the kernel so other work can be done.

Correct.

Although to clarify a bit more, the mmc core invokes *all* mmc host
ops callbacks from non-atomic context.

>
> From the interrupt handler, when the request is complete, the interrupt
> handler calls req->done(req); to terminate the whole thing.

It may do that, but it's not the recommended method.

Instead it's better if you can deal with the request processing from a
threaded IRQ handler. When completed, you notify the mmc core via
calling mmc_request_done() which kicks the completion variable (as you
describe).

The are several benefits doing request processing from the a threaded
IRQ handler:
1. The obvious one, IRQs don't have to be disabled longer than actually needed.
2. The threaded IRQ handler is able to use mutexes.

>
>
>   So we have:
>
>   CPU-A                      CPU-B                  CPU-C
>
>   octeon_mmc_request(0)        .                     .
>      down()                    .                     .
>      queue_request(0);         .                     .
>      return;                   .                     .
>   other_useful_work            .                     .
>    .                           .                     .
>    .                           .                     .
>    .                           .                     .
>   octeon_mmc_request(1)        .                     .
>      down() -> blocks          .                     .
>                             octeon_mmc_interrupt()   .
>                                 up() -> unblocks     .
>      down() <-unblocks          req->done(0)         .
>      queue_request(1);          return;              .
>      return;                   .                     .
>   other_useful_work            .                     .
>    .                           .                octeon_mmc_interrupt()
>    .                           .                     up()
>    .                           .                     req->done(1)
>    .                           .                     return;
>    .                           .                     .
>
>
> We don't want to have the thread on CPU-A wait around in an extra mutex or
> completion for the command to finish.  The MMC core already has its own
> request waiting code, but it doesn't handle the concept of a slot. These
> commands can take hundreds or thousands of mS to terminate.  The whole idea
> of the MMC framework is to queue the request and get back to doing other
> work ASAP.
>
> In the case of this octeon_mmc driver we need to serialize the commands
> issued to multiple slots, for this we use the semaphore.  If you don't like
> struct semaphore, we would have to invent a proprietary wait queue mechanism
> that has semantics nearly identical to struct semaphore, and people would
> complain that we are reinventing the semaphore.
>
> It doesn't seem clean to cobble up multiple waiting structures (completion +
> mutex + logic that surely would contain errors) where a single (well
> debugged) struct semaphore does what we want.
>

One more thing to be added; In case you need a hard IRQ handler, you
may have to protect it from getting "spurious" IRQs etc. If not, you
can probably use IRQF_ONESHOT when registering the IRQ handler which
should allow you to use only one mutex.

Below I have tried to give you an idea of how I think it can be done,
when you do need a hard IRQ handler. I am using "host->mrq", as what
is being protected by the spinlock.


In the ->request() callback:
....
mutex_lock()
spin_lock_irqsave()

host->mrq = mrq;

spin_unlock_irqrestore()
...
---------------------

In the hard IRQ handler:
...
spin_lock()

if (!host->mrq)
  return IRQ_HANDLED;
else
  return IRQ_WAKE_THREAD;

spin_unlock()
...
---------------------

In the threaded IRQ handler:
...
spin_lock_irqsave()

mrq = host->mrq;

spin_unlock_irqrestore()
...
process request...
...
when request completed:
...
spin_lock_irqsave()

host->mrq = NULL;

spin_unlock_irqrestore()
mutex_unlock()
...
mmc_request_done()
---------------------

Do you think something along these lines should work for your case?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-21  8:02           ` Ulf Hansson
@ 2016-04-21 10:15             ` Arnd Bergmann
  2016-04-21 12:44               ` Ulf Hansson
  0 siblings, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2016-04-21 10:15 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: David Daney, Matt Redfearn, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On Thursday 21 April 2016 10:02:50 Ulf Hansson wrote:
> On 20 April 2016 at 01:27, David Daney <ddaney@caviumnetworks.com> wrote:
> > On 04/19/2016 03:09 PM, Arnd Bergmann wrote:
> >>
> >> On Tuesday 19 April 2016 14:45:35 David Daney wrote:
> >>>
> >>> On 04/19/2016 01:46 PM, Arnd Bergmann wrote:
> >>>>
> >>>> On Thursday 31 March 2016 16:26:53 Matt Redfearn wrote:
> >>>>>
> >>>>> +struct octeon_mmc_host {
> >>>>> +       u64     base;
> >>>>> +       u64     ndf_base;
> >>>>> +       u64     emm_cfg;
> >>>>> +       u64     n_minus_one;  /* OCTEON II workaround location */
> >>>>> +       int     last_slot;
> >>>>> +
> >>>>> +       struct semaphore mmc_serializer;
> >>>>
> >>>>
> >>>> Please don't add any new semaphores to the kernel, use a mutex or
> >>>> a completion instead.
> >>>
> >>>
> >>> The last time I checked, a mutex could not be used from interrupt
> >>> context.
> >>>
> >>> Since we are in interrupt context and we really want mutex-like behavior
> >>> here, it seems like a semaphore is just the thing we need.
> 
> So the question I have is *why* do you have to be in IRQ context when
> using the semaphore...
> 
> I would rather see that you use a threaded IRQ handler, perhaps in
> conjunction with a hard IRQ handler if that is needed.

That does not solve the problem though: it is not allowed for a mutex
to be taken in the request function but released in the interrupt,
both have to be in the same thread.

Using a threaded IRQ handler would help by avoiding the spinlock
inside of it (it could be replaced with a mutex there), but it
doesn't solve the problem of serializing between the slots.

> >>> I am not sure how completions would be of use, perhaps you could
> >>> elaborate.
> >>
> >>
> >> Completions are used when you have one thread waiting for an event,
> >> which is often an interrupt: the process calls
> >> wait_for_completion(&completion); and the interrupt handler calls
> >> complete(&completion);
> >>
> >> It seems that you are using the semaphore for two reasons here (I
> >> only read it briefly so I may be wrong):
> >> waiting for the interrupt handler and serializing against another
> >> thread. In this case you need both a mutex (to guarantee mutual
> >> exclusion) and a completion (to wait for the interrupt handler
> >> to finish).
> >>
> >
> > The way the MMC driver works is that the driver's .request() method is
> > called to initiate a request.   After .request() is finished, it returns
> > back to the kernel so other work can be done.
> 
> Correct.
> 
> Although to clarify a bit more, the mmc core invokes *all* mmc host
> ops callbacks from non-atomic context.

Oh, so you mean the .request() function must not sleep and cannot
call mutex_lock() or down() or wait_event()?

That means we have to come up with a different design anyway. The
easiest is probably to always take a per-host spinlock in both the
.request() function and in the interrupt handler(), but that seems
a bit wasteful because it may take a very long time (hundreds of
miliseconds) for an mmc operation to complete, and we don't want
to hold a spinlock that long.

Another option for that would be to go through a kthread:

- change the .request function to never block but simply pass
  off a request to the kthread
- change the irq handler to just call complete() on the host
  device structure
- in the kthread, go round-robin through all slots, pick up the
  first request you find, fire it off to the hardware and then
  call wait_for_completion() to wait for the irq for that request,
  then start over.

> > From the interrupt handler, when the request is complete, the interrupt
> > handler calls req->done(req); to terminate the whole thing.
> 
> It may do that, but it's not the recommended method.
> 
> Instead it's better if you can deal with the request processing from a
> threaded IRQ handler. When completed, you notify the mmc core via
> calling mmc_request_done() which kicks the completion variable (as you
> describe).
> 
> The are several benefits doing request processing from the a threaded
> IRQ handler:
> 1. The obvious one, IRQs don't have to be disabled longer than actually needed.
> 2. The threaded IRQ handler is able to use mutexes.

I think the mutex only helps if we move the request handling into
a kthread as I described above. After doing that, using a theraded
handler with a mutex is functionally equivalent to having the
existing kthread do the actual irq processing, but it seems a bit
nicer to keep it in a single loop.

It looks to me like calling mmc_request_done() instead of mrq->done()
is a separate issue and should be done anyway.

> > We don't want to have the thread on CPU-A wait around in an extra mutex or
> > completion for the command to finish.  The MMC core already has its own
> > request waiting code, but it doesn't handle the concept of a slot. These
> > commands can take hundreds or thousands of mS to terminate.  The whole idea
> > of the MMC framework is to queue the request and get back to doing other
> > work ASAP.
> >
> > In the case of this octeon_mmc driver we need to serialize the commands
> > issued to multiple slots, for this we use the semaphore.  If you don't like
> > struct semaphore, we would have to invent a proprietary wait queue mechanism
> > that has semantics nearly identical to struct semaphore, and people would
> > complain that we are reinventing the semaphore.
> >
> > It doesn't seem clean to cobble up multiple waiting structures (completion +
> > mutex + logic that surely would contain errors) where a single (well
> > debugged) struct semaphore does what we want.
> >
> 
> One more thing to be added; In case you need a hard IRQ handler, you
> may have to protect it from getting "spurious" IRQs etc. If not, you
> can probably use IRQF_ONESHOT when registering the IRQ handler which
> should allow you to use only one mutex.
> 
> Below I have tried to give you an idea of how I think it can be done,
> when you do need a hard IRQ handler. I am using "host->mrq", as what
> is being protected by the spinlock.
> 
> 
> In the ->request() callback:
> ....
> mutex_lock()
> spin_lock_irqsave()
> 
> host->mrq = mrq;
> 
> spin_unlock_irqrestore()
> ...
> ---------------------
> 
> In the hard IRQ handler:
> ...
> spin_lock()
> 
> if (!host->mrq)
>   return IRQ_HANDLED;
> else
>   return IRQ_WAKE_THREAD;
> 
> spin_unlock()
> ...
> ---------------------
> 
> In the threaded IRQ handler:
> ...
> spin_lock_irqsave()
> 
> mrq = host->mrq;
> 
> spin_unlock_irqrestore()
> ...
> process request...
> ...
> when request completed:
> ...
> spin_lock_irqsave()
> 
> host->mrq = NULL;
> 
> spin_unlock_irqrestore()
> mutex_unlock()
> ...
> mmc_request_done()
> ---------------------
> 
> Do you think something along these lines should work for your case?

This is the case I described above, it is against the rules for mutexes()
and you will get a lockdep warning if you attempt this.

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-21 10:15             ` Arnd Bergmann
@ 2016-04-21 12:44               ` Ulf Hansson
  2016-04-21 13:19                 ` Arnd Bergmann
  0 siblings, 1 reply; 32+ messages in thread
From: Ulf Hansson @ 2016-04-21 12:44 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Daney, Matt Redfearn, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

[...]

>> So the question I have is *why* do you have to be in IRQ context when
>> using the semaphore...
>>
>> I would rather see that you use a threaded IRQ handler, perhaps in
>> conjunction with a hard IRQ handler if that is needed.
>
> That does not solve the problem though: it is not allowed for a mutex
> to be taken in the request function but released in the interrupt,
> both have to be in the same thread.
>
> Using a threaded IRQ handler would help by avoiding the spinlock
> inside of it (it could be replaced with a mutex there), but it
> doesn't solve the problem of serializing between the slots.

My point is, the may not need to be released in the IRQ handler, but
instead in the threaded IRQ handler.

>
>> >>> I am not sure how completions would be of use, perhaps you could
>> >>> elaborate.
>> >>
>> >>
>> >> Completions are used when you have one thread waiting for an event,
>> >> which is often an interrupt: the process calls
>> >> wait_for_completion(&completion); and the interrupt handler calls
>> >> complete(&completion);
>> >>
>> >> It seems that you are using the semaphore for two reasons here (I
>> >> only read it briefly so I may be wrong):
>> >> waiting for the interrupt handler and serializing against another
>> >> thread. In this case you need both a mutex (to guarantee mutual
>> >> exclusion) and a completion (to wait for the interrupt handler
>> >> to finish).
>> >>
>> >
>> > The way the MMC driver works is that the driver's .request() method is
>> > called to initiate a request.   After .request() is finished, it returns
>> > back to the kernel so other work can be done.
>>
>> Correct.
>>
>> Although to clarify a bit more, the mmc core invokes *all* mmc host
>> ops callbacks from non-atomic context.
>
> Oh, so you mean the .request() function must not sleep and cannot
> call mutex_lock() or down() or wait_event()?

No.

*non-atomic* context. :-)

I should probably said thread/process context instead.

[...]

>>
>>
>> In the ->request() callback:
>> ....
>> mutex_lock()
>> spin_lock_irqsave()
>>
>> host->mrq = mrq;
>>
>> spin_unlock_irqrestore()
>> ...
>> ---------------------
>>
>> In the hard IRQ handler:
>> ...
>> spin_lock()
>>
>> if (!host->mrq)
>>   return IRQ_HANDLED;
>> else
>>   return IRQ_WAKE_THREAD;
>>
>> spin_unlock()
>> ...
>> ---------------------
>>
>> In the threaded IRQ handler:
>> ...
>> spin_lock_irqsave()
>>
>> mrq = host->mrq;
>>
>> spin_unlock_irqrestore()
>> ...
>> process request...
>> ...
>> when request completed:
>> ...
>> spin_lock_irqsave()
>>
>> host->mrq = NULL;
>>
>> spin_unlock_irqrestore()
>> mutex_unlock()
>> ...
>> mmc_request_done()
>> ---------------------
>>
>> Do you think something along these lines should work for your case?
>
> This is the case I described above, it is against the rules for mutexes()
> and you will get a lockdep warning if you attempt this.

No.

Considering my comments in this reply, can you please re-think and see
if my solution could make sense?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-21 12:44               ` Ulf Hansson
@ 2016-04-21 13:19                 ` Arnd Bergmann
  2016-04-22 13:54                   ` Ulf Hansson
  0 siblings, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2016-04-21 13:19 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: David Daney, Matt Redfearn, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On Thursday 21 April 2016 14:44:08 Ulf Hansson wrote:
> [...]
> 
> >> So the question I have is *why* do you have to be in IRQ context when
> >> using the semaphore...
> >>
> >> I would rather see that you use a threaded IRQ handler, perhaps in
> >> conjunction with a hard IRQ handler if that is needed.
> >
> > That does not solve the problem though: it is not allowed for a mutex
> > to be taken in the request function but released in the interrupt,
> > both have to be in the same thread.
> >
> > Using a threaded IRQ handler would help by avoiding the spinlock
> > inside of it (it could be replaced with a mutex there), but it
> > doesn't solve the problem of serializing between the slots.
> 
> My point is, the may not need to be released in the IRQ handler, but
> instead in the threaded IRQ handler.

That doesn't change anything.

> >> Although to clarify a bit more, the mmc core invokes *all* mmc host
> >> ops callbacks from non-atomic context.
> >
> > Oh, so you mean the .request() function must not sleep and cannot
> > call mutex_lock() or down() or wait_event()?
> 
> No.
> 
> *non-atomic* context. :-)
> 
> I should probably said thread/process context instead.
> 

My mistake, you were being clear enough here, I just misread.

> >> In the ->request() callback:
> >> ....
> >> mutex_lock()

> >> In the threaded IRQ handler:
> >> ...
> >> mutex_unlock()
> >>
> >> Do you think something along these lines should work for your case?
> >
> > This is the case I described above, it is against the rules for mutexes()
> > and you will get a lockdep warning if you attempt this.
> 
> No.
> 
> Considering my comments in this reply, can you please re-think and see
> if my solution could make sense?

The problem is not that mutex_unlock() has to be called from non-atomic
context (it doesn't, you are allowed  to call mutex_unlock while holding
a spinlock), but instead the problem is that it has to be done from the
same thread that called mutex_lock(), and the threaded IRQ handler
is never the same thread that is currently holding the mutex.

My suggestion with using

	wait_event(host->wq, !cmpxchg(host->current_req, NULL, mrq));

should sufficiently solve the problem, but the suggestion of using
a kthread (even though not needed for taking a mutex) would still
have some advantages and one disadvantage:

+ We never need to spin in the irq context (also achievable using
  a threaded handler)
+ The request callback always returns immediately after queuing up
  the request to the kthread, rather than blocking for a potentially
  long time while waiting for an operation in another slot to complete
+ it very easily avoids the problem of serialization between
  the slots, and ensures that each slot gets an equal chance to
  send the next request.
- you get a slightly higher latency for waking up the kthread in
  oder to do a simple request (comparable amount of latency that
  is introduced by an irq thread).

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-21 13:19                 ` Arnd Bergmann
@ 2016-04-22 13:54                   ` Ulf Hansson
  2016-04-22 16:42                     ` Arnd Bergmann
  0 siblings, 1 reply; 32+ messages in thread
From: Ulf Hansson @ 2016-04-22 13:54 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Daney, Matt Redfearn, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On 21 April 2016 at 15:19, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 21 April 2016 14:44:08 Ulf Hansson wrote:
>> [...]
>>
>> >> So the question I have is *why* do you have to be in IRQ context when
>> >> using the semaphore...
>> >>
>> >> I would rather see that you use a threaded IRQ handler, perhaps in
>> >> conjunction with a hard IRQ handler if that is needed.
>> >
>> > That does not solve the problem though: it is not allowed for a mutex
>> > to be taken in the request function but released in the interrupt,
>> > both have to be in the same thread.
>> >
>> > Using a threaded IRQ handler would help by avoiding the spinlock
>> > inside of it (it could be replaced with a mutex there), but it
>> > doesn't solve the problem of serializing between the slots.
>>
>> My point is, the may not need to be released in the IRQ handler, but
>> instead in the threaded IRQ handler.
>
> That doesn't change anything.
>
>> >> Although to clarify a bit more, the mmc core invokes *all* mmc host
>> >> ops callbacks from non-atomic context.
>> >
>> > Oh, so you mean the .request() function must not sleep and cannot
>> > call mutex_lock() or down() or wait_event()?
>>
>> No.
>>
>> *non-atomic* context. :-)
>>
>> I should probably said thread/process context instead.
>>
>
> My mistake, you were being clear enough here, I just misread.
>
>> >> In the ->request() callback:
>> >> ....
>> >> mutex_lock()
>
>> >> In the threaded IRQ handler:
>> >> ...
>> >> mutex_unlock()
>> >>
>> >> Do you think something along these lines should work for your case?
>> >
>> > This is the case I described above, it is against the rules for mutexes()
>> > and you will get a lockdep warning if you attempt this.
>>
>> No.
>>
>> Considering my comments in this reply, can you please re-think and see
>> if my solution could make sense?
>
> The problem is not that mutex_unlock() has to be called from non-atomic
> context (it doesn't, you are allowed  to call mutex_unlock while holding
> a spinlock), but instead the problem is that it has to be done from the
> same thread that called mutex_lock(), and the threaded IRQ handler
> is never the same thread that is currently holding the mutex.

Of course, you are right!

>
> My suggestion with using
>
>         wait_event(host->wq, !cmpxchg(host->current_req, NULL, mrq));
>
> should sufficiently solve the problem, but the suggestion of using
> a kthread (even though not needed for taking a mutex) would still
> have some advantages and one disadvantage:
>
> + We never need to spin in the irq context (also achievable using
>   a threaded handler)
> + The request callback always returns immediately after queuing up
>   the request to the kthread, rather than blocking for a potentially
>   long time while waiting for an operation in another slot to complete
> + it very easily avoids the problem of serialization between
>   the slots, and ensures that each slot gets an equal chance to
>   send the next request.
> - you get a slightly higher latency for waking up the kthread in
>   oder to do a simple request (comparable amount of latency that
>   is introduced by an irq thread).
>

Currently I can't think of anything better, so I guess something along
these lines is worth a try.

No matter what, I guess we want to avoid to use a semaphore as long as
possible, right!?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-22 13:54                   ` Ulf Hansson
@ 2016-04-22 16:42                     ` Arnd Bergmann
  2016-04-22 17:49                       ` David Daney
  0 siblings, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2016-04-22 16:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: David Daney, Matt Redfearn, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On Friday 22 April 2016 15:54:56 Ulf Hansson wrote:
> 
> >
> > My suggestion with using
> >
> >         wait_event(host->wq, !cmpxchg(host->current_req, NULL, mrq));
> >
> > should sufficiently solve the problem, but the suggestion of using
> > a kthread (even though not needed for taking a mutex) would still
> > have some advantages and one disadvantage:
> >
> > + We never need to spin in the irq context (also achievable using
> >   a threaded handler)
> > + The request callback always returns immediately after queuing up
> >   the request to the kthread, rather than blocking for a potentially
> >   long time while waiting for an operation in another slot to complete
> > + it very easily avoids the problem of serialization between
> >   the slots, and ensures that each slot gets an equal chance to
> >   send the next request.
> > - you get a slightly higher latency for waking up the kthread in
> >   oder to do a simple request (comparable amount of latency that
> >   is introduced by an irq thread).
> >
> 
> Currently I can't think of anything better, so I guess something along
> these lines is worth a try.
> 
> No matter what, I guess we want to avoid to use a semaphore as long as
> possible, right!?

Yes, I think that would be good, to avoid curses from whoever tries
to eliminate them the next time.

I think there is some renewed interest in realtime kernels these
days, and semaphores are known to cause problems with priority
inversion (precisely because you don't know which thread will
release it).

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-22 16:42                     ` Arnd Bergmann
@ 2016-04-22 17:49                       ` David Daney
  2016-04-22 20:23                         ` Arnd Bergmann
  0 siblings, 1 reply; 32+ messages in thread
From: David Daney @ 2016-04-22 17:49 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ulf Hansson, Matt Redfearn, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On 04/22/2016 09:42 AM, Arnd Bergmann wrote:
> On Friday 22 April 2016 15:54:56 Ulf Hansson wrote:
>>
>>>
>>> My suggestion with using
>>>
>>>          wait_event(host->wq, !cmpxchg(host->current_req, NULL, mrq));
>>>
>>> should sufficiently solve the problem, but the suggestion of using
>>> a kthread (even though not needed for taking a mutex) would still
>>> have some advantages and one disadvantage:
>>>
>>> + We never need to spin in the irq context (also achievable using
>>>    a threaded handler)
>>> + The request callback always returns immediately after queuing up
>>>    the request to the kthread, rather than blocking for a potentially
>>>    long time while waiting for an operation in another slot to complete
>>> + it very easily avoids the problem of serialization between
>>>    the slots, and ensures that each slot gets an equal chance to
>>>    send the next request.
>>> - you get a slightly higher latency for waking up the kthread in
>>>    oder to do a simple request (comparable amount of latency that
>>>    is introduced by an irq thread).
>>>
>>
>> Currently I can't think of anything better, so I guess something along
>> these lines is worth a try.
>>
>> No matter what, I guess we want to avoid to use a semaphore as long as
>> possible, right!?
>
> Yes, I think that would be good, to avoid curses from whoever tries
> to eliminate them the next time.
>
> I think there is some renewed interest in realtime kernels these
> days, and semaphores are known to cause problems with priority
> inversion (precisely because you don't know which thread will
> release it).

In this particular case, there can be no priority inversion, as the 
thing we are waiting for is a hardware event.  The timing of that is not 
influenced by task scheduling.  The use of wait_event instead of struct 
semaphore would be purely cosmetic.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver for OCTEON MMC controller
  2016-04-22 17:49                       ` David Daney
@ 2016-04-22 20:23                         ` Arnd Bergmann
  0 siblings, 0 replies; 32+ messages in thread
From: Arnd Bergmann @ 2016-04-22 20:23 UTC (permalink / raw)
  To: David Daney
  Cc: Ulf Hansson, Matt Redfearn, linux-mmc, Aleksey Makarov,
	Chandrakala Chavva, David Daney, Aleksey Makarov,
	Leonid Rosenboim, Peter Swain, Aaron Williams

On Friday 22 April 2016 10:49:06 David Daney wrote:
> On 04/22/2016 09:42 AM, Arnd Bergmann wrote:
> > On Friday 22 April 2016 15:54:56 Ulf Hansson wrote:
> >>
> >>>
> >>> My suggestion with using
> >>>
> >>>          wait_event(host->wq, !cmpxchg(host->current_req, NULL, mrq));
> >>>
> >>> should sufficiently solve the problem, but the suggestion of using
> >>> a kthread (even though not needed for taking a mutex) would still
> >>> have some advantages and one disadvantage:
> >>>
> >>> + We never need to spin in the irq context (also achievable using
> >>>    a threaded handler)
> >>> + The request callback always returns immediately after queuing up
> >>>    the request to the kthread, rather than blocking for a potentially
> >>>    long time while waiting for an operation in another slot to complete
> >>> + it very easily avoids the problem of serialization between
> >>>    the slots, and ensures that each slot gets an equal chance to
> >>>    send the next request.
> >>> - you get a slightly higher latency for waking up the kthread in
> >>>    oder to do a simple request (comparable amount of latency that
> >>>    is introduced by an irq thread).
> >>>
> >>
> >> Currently I can't think of anything better, so I guess something along
> >> these lines is worth a try.
> >>
> >> No matter what, I guess we want to avoid to use a semaphore as long as
> >> possible, right!?
> >
> > Yes, I think that would be good, to avoid curses from whoever tries
> > to eliminate them the next time.
> >
> > I think there is some renewed interest in realtime kernels these
> > days, and semaphores are known to cause problems with priority
> > inversion (precisely because you don't know which thread will
> > release it).
> 
> In this particular case, there can be no priority inversion, as the 
> thing we are waiting for is a hardware event.  The timing of that is not 
> influenced by task scheduling.  The use of wait_event instead of struct 
> semaphore would be purely cosmetic.

My point was just that it's possible someone will try to remove all
the semaphores again soon. You are right that any possible priority
inversion in the existing code would not get changed by using a
different method to wait for the completion of the request.

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2016-04-22 20:24 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-31 15:26 [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings for OCTEON MMC controller Matt Redfearn
2016-03-31 15:26 ` [RESEND PATCH v7 2/2] mmc: OCTEON: Add host driver " Matt Redfearn
2016-04-19 20:46   ` Arnd Bergmann
2016-04-19 21:45     ` David Daney
2016-04-19 22:09       ` Arnd Bergmann
2016-04-19 23:27         ` David Daney
2016-04-19 23:57           ` Arnd Bergmann
2016-04-20  0:02             ` Arnd Bergmann
2016-04-21  8:02           ` Ulf Hansson
2016-04-21 10:15             ` Arnd Bergmann
2016-04-21 12:44               ` Ulf Hansson
2016-04-21 13:19                 ` Arnd Bergmann
2016-04-22 13:54                   ` Ulf Hansson
2016-04-22 16:42                     ` Arnd Bergmann
2016-04-22 17:49                       ` David Daney
2016-04-22 20:23                         ` Arnd Bergmann
2016-04-14 12:45 ` [RESEND PATCH v7 1/2] mmc: OCTEON: Add DT bindings " Ulf Hansson
2016-04-18  8:53   ` Matt Redfearn
2016-04-18 11:13     ` Ulf Hansson
2016-04-18 11:37       ` Matt Redfearn
2016-04-18 12:08         ` Ulf Hansson
2016-04-18 12:57           ` Matt Redfearn
2016-04-18 22:59             ` David Daney
2016-04-19  9:15             ` Ulf Hansson
2016-04-19 16:13               ` David Daney
2016-04-19 19:33                 ` Ulf Hansson
2016-04-19 20:25                   ` David Daney
2016-04-19 20:56                     ` Arnd Bergmann
2016-04-19 21:50                       ` David Daney
2016-04-20  9:32                     ` Ulf Hansson
2016-04-20 22:32                       ` David Daney
2016-04-20 22:42                         ` Arnd Bergmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.