From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>,
	Peter Ujfalusi <peter.ujfalusi@ti.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 4.9 09/21] drm/omap: fix memory barrier bug in DMM driver
Date: Sun,  4 Nov 2018 08:53:43 -0500
Message-ID: <20181104135355.88602-9-sashal@kernel.org>
In-Reply-To: <20181104135355.88602-1-sashal@kernel.org>

From: Tomi Valkeinen <tomi.valkeinen@ti.com>

[ Upstream commit 538f66ba204944470a653a4cccc5f8befdf97c22 ]

A DMM timeout "timed out waiting for done" has been observed on DRA7
devices. The timeout happens rarely, and only when the system is under
heavy load.

Debugging showed that the timeout can be made to happen much more
frequently by optimizing the DMM driver, so that there's almost no code
between writing the last DMM descriptors to RAM and writing to the DMM
register that starts the DMM transaction.

The current theory is that a wmb() does not properly ensure that the
data written to RAM is observable by all the components in the system.

This DMM timeout has caused interesting (and rare) bugs as the error
handling was not functioning properly (the error handling has been fixed
in previous commits):

 * If a DMM timeout happened when a GEM buffer was being pinned for
   display on the screen, a timeout error would be shown, but the driver
   would continue programming DSS HW with a broken buffer, leading to
   SYNCLOST floods and possible crashes.

 * If a DMM timeout happened when another user (say, a video decoder)
   was pinning a GEM buffer, a timeout would be shown, but if the user
   handled the error properly, no other issues followed.

 * If a DMM timeout happened when a GEM buffer was being released, the
   driver would not even notice the error, leading to crashes or hangs
   later.

This patch adds wmb() and readl() calls after the last bit is written to
RAM, which should ensure that the execution proceeds only after the data
is actually in RAM, and thus observable by DMM.

The read-back should not be needed. Further study is required to
understand whether DMM is somehow a special case where the read-back is
acceptable, or whether DRA7's memory barriers do not work correctly.
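
The ordering described above (terminate the descriptor chain, issue a
write barrier, read the last write back, and only then touch the DMM
register) can be sketched in user space as follows. This is only an
illustration under stated assumptions, not the driver code: struct pat
is reduced to two fields, fake_pat_descr and commit_chain() are
hypothetical names, and wmb() and readl() are local stand-ins built on
C11 primitives rather than the kernel's definitions. It cannot reproduce
the DRA7 behaviour; it only shows the sequence of steps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel primitives (NOT the kernel's own). */
#define wmb() atomic_thread_fence(memory_order_seq_cst)

static inline uint32_t readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Simplified descriptor: only the link field used by the patch is modelled. */
struct pat {
	uint32_t next_pa;	/* address of the next descriptor, 0 ends the chain */
	uint32_t data;		/* placeholder payload, an assumption */
};

/* Hypothetical register whose write would start the transaction. */
static volatile uint32_t fake_pat_descr;

/*
 * Link n descriptors starting at "physical" address first_pa, then follow
 * the ordering from the patch. Returns the value read back from the last
 * descriptor's next_pa field.
 */
static uint32_t commit_chain(struct pat *descs, size_t n, uint32_t first_pa)
{
	uint32_t seen;
	size_t i;

	assert(n > 0);

	for (i = 0; i + 1 < n; i++)
		descs[i].next_pa =
			first_pa + (uint32_t)((i + 1) * sizeof(struct pat));

	/* 1. last descriptor write: terminate the chain */
	descs[n - 1].next_pa = 0;

	/* 2. order the descriptor writes before anything that follows */
	wmb();

	/* 3. read back the last write, as the patch does as a workaround */
	seen = readl(&descs[n - 1].next_pa);

	/* 4. only now write the register that starts the transaction */
	fake_pat_descr = first_pa;

	return seen;
}
```

In the driver itself the register access goes through dmm_write(), and
the first write after the read-back clears PAT_DESCR before the
transaction is started; the sketch collapses that into a single store.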

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/omapdrm/omap_dmm_tiler.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
index 7def04049498..6a0b25e0823f 100644
--- a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
+++ b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
@@ -273,6 +273,17 @@ static int dmm_txn_commit(struct dmm_txn *txn, bool wait)
 	}
 
 	txn->last_pat->next_pa = 0;
+	/* ensure that the written descriptors are visible to DMM */
+	wmb();
+
+	/*
+	 * NOTE: the wmb() above should be enough, but there seems to be a bug
+	 * in OMAP's memory barrier implementation, which in some rare cases may
+	 * cause the writes not to be observable after wmb().
+	 */
+
+	/* read back to ensure the data is in RAM */
+	readl(&txn->last_pat->next_pa);
 
 	/* write to PAT_DESCR to clear out any pending transaction */
 	dmm_write(dmm, 0x0, reg[PAT_DESCR][engine->id]);
-- 
2.17.1

