All of lore.kernel.org
 help / color / mirror / Atom feed
* entering the error case of i2c-designware with a timeout at probe
@ 2017-03-21 12:52 Oliver Neukum
  2017-03-21 12:57 ` Max Staudt
  2017-03-21 13:00 ` Hans de Goede
  0 siblings, 2 replies; 10+ messages in thread
From: Oliver Neukum @ 2017-03-21 12:52 UTC (permalink / raw)
  To: Hans de Goede; +Cc: linux-i2c, mstaudt

Hi Hans,

we found on our test systems with a bit of experimentation,
that actually running into the timeout is bound to hang the whole
system within only a few seconds.
I was wondering whether the error handling needs to be changed.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: entering the error case of i2c-designware with a timeout at probe
  2017-03-21 12:52 entering the error case of i2c-designware with a timeout at probe Oliver Neukum
@ 2017-03-21 12:57 ` Max Staudt
  2017-03-21 13:01   ` Hans de Goede
  2017-03-21 13:00 ` Hans de Goede
  1 sibling, 1 reply; 10+ messages in thread
From: Max Staudt @ 2017-03-21 12:57 UTC (permalink / raw)
  To: Oliver Neukum, Hans de Goede; +Cc: linux-i2c

On 03/21/2017 01:52 PM, Oliver Neukum wrote:
> Hi Hans,
> 
> we found on our test systems with a bit of experimentation,
> that actually running into the timeout is bound to hang the whole
> system within only a few seconds.
> I was wondering whether the error handling needs to be changed.

In other words, whether we should rather wait for lock acquisition,
unconditionally. No timeout, just wait. That's what our hardware
seems to need.

It feels like once the lock has been requested by the Linux driver,
we can't cancel that request and have to actually follow through
with accepting the lock and only giving it back after that.
Resetting the "request" bit to 0 as it is done now doesn't work as
it leads to the hung system sometime soon after that.


Max

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: entering the error case of i2c-designware with a timeout at probe
  2017-03-21 12:52 entering the error case of i2c-designware with a timeout at probe Oliver Neukum
  2017-03-21 12:57 ` Max Staudt
@ 2017-03-21 13:00 ` Hans de Goede
  1 sibling, 0 replies; 10+ messages in thread
From: Hans de Goede @ 2017-03-21 13:00 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-i2c, mstaudt

[-- Attachment #1: Type: text/plain, Size: 1261 bytes --]

Hi,

On 21-03-17 13:52, Oliver Neukum wrote:
> Hi Hans,
>
> we found on our test systems with a bit of experimentation,
> that actually running into the timeout is bound to hang the whole
> system within only a few seconds.

AFAICT running into the timeout may be caused by the bus being
stuck, which seems to happen when the designware controller and
the punit try to access the bus simultaneously, which is exactly
what my patches try to avoid.

But yeah once that happens the system usually needs a hard reset /
powercycle to recover.

I assume this is on Cherry Trail with my full patch-set including
the forcewake fixes ?

One thing you could do is try the attach patches which got
dropped from the set as it did not seem necessary.

> I was wondering whether the error handling needs to be changed.

Are we talking the error case in i2c-designware-baytrail.c here ?

I believe that the error handling there is correct (bail do not
allow an i2c transfer to even be attempted) the problem is more
likely that the punit sometimes does do accesses even when we
hold the semaphore. The trick is to find those accesses (assuming
they are triggered from the kernel) and add mutex protection to
them, as the attached patch tries to do for i915.

Regards,

Hans



[-- Attachment #2: 0001-drm-i915-Acquire-P-Unit-access-when-modifying-P-Unit.patch --]
[-- Type: text/x-patch, Size: 6744 bytes --]

>From ad779e820fadb7c42ea1c609e177b07f98c22a21 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede@redhat.com>
Date: Sun, 1 Jan 2017 13:56:11 +0100
Subject: [PATCH v8] drm/i915: Acquire P-Unit access when modifying P-Unit
 settings

Make sure the P-Unit or the PMIC i2c bus is not in use when we send a
request to the P-Unit by calling iosf_mbi_punit_acquire() / _release()
around P-Unit write accesses.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=155241
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: tagorereddy <tagore.chandan@gmail.com>
---
Changes in v2:
-Spelling: P-Unit, PMIC
-Adjust for iosf_mbi_punit_lock/_unlock to _acquire/_release rename
Changes in v3:
-Make sure IOSF_MBI is builtin if the i915 driver is builtin
-Add comments explaining why calling iosf_mbi_punit_acquire is necessary
---
 drivers/gpu/drm/i915/intel_display.c    |  8 ++++++++
 drivers/gpu/drm/i915/intel_pm.c         | 13 +++++++++++++
 drivers/gpu/drm/i915/intel_runtime_pm.c | 11 +++++++++++
 3 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 88689a0..cc54378 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -47,6 +47,7 @@
 #include <drm/drm_rect.h>
 #include <linux/dma_remapping.h>
 #include <linux/reservation.h>
+#include <asm/iosf_mbi.h>
 
 static bool is_mmio_work(struct intel_flip_work *work)
 {
@@ -6422,6 +6423,9 @@ static void valleyview_set_cdclk(struct drm_device *dev, int cdclk)
 		cmd = 0;
 
 	mutex_lock(&dev_priv->rps.hw_lock);
+	/* Claim the pmic bus on systems where the punit shares the pmic bus */
+	iosf_mbi_punit_acquire();
+
 	val = vlv_punit_read(dev_priv, PUNIT_REG_DSPFREQ);
 	val &= ~DSPFREQGUAR_MASK;
 	val |= (cmd << DSPFREQGUAR_SHIFT);
@@ -6431,6 +6435,7 @@ static void valleyview_set_cdclk(struct drm_device *dev, int cdclk)
 		     50)) {
 		DRM_ERROR("timed out waiting for CDclk change\n");
 	}
+	iosf_mbi_punit_release();
 	mutex_unlock(&dev_priv->rps.hw_lock);
 
 	mutex_lock(&dev_priv->sb_lock);
@@ -6498,6 +6503,8 @@ static void cherryview_set_cdclk(struct drm_device *dev, int cdclk)
 	cmd = DIV_ROUND_CLOSEST(dev_priv->hpll_freq << 1, cdclk) - 1;
 
 	mutex_lock(&dev_priv->rps.hw_lock);
+	/* Claim the pmic bus on systems where the punit shares the pmic bus */
+	iosf_mbi_punit_acquire();
 	val = vlv_punit_read(dev_priv, PUNIT_REG_DSPFREQ);
 	val &= ~DSPFREQGUAR_MASK_CHV;
 	val |= (cmd << DSPFREQGUAR_SHIFT_CHV);
@@ -6507,6 +6514,7 @@ static void cherryview_set_cdclk(struct drm_device *dev, int cdclk)
 		     50)) {
 		DRM_ERROR("timed out waiting for CDclk change\n");
 	}
+	iosf_mbi_punit_release();
 	mutex_unlock(&dev_priv->rps.hw_lock);
 
 	intel_update_cdclk(dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ec16f3d6..d2498b8 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -32,6 +32,7 @@
 #include "../../../platform/x86/intel_ips.h"
 #include <linux/module.h>
 #include <drm/drm_atomic_helper.h>
+#include <asm/iosf_mbi.h>
 
 /**
  * DOC: RC6
@@ -289,6 +290,8 @@ static void chv_set_memory_dvfs(struct drm_i915_private *dev_priv, bool enable)
 	u32 val;
 
 	mutex_lock(&dev_priv->rps.hw_lock);
+	/* Claim the pmic bus on systems where the punit shares the pmic bus */
+	iosf_mbi_punit_acquire();
 
 	val = vlv_punit_read(dev_priv, PUNIT_REG_DDR_SETUP2);
 	if (enable)
@@ -303,6 +306,7 @@ static void chv_set_memory_dvfs(struct drm_i915_private *dev_priv, bool enable)
 		      FORCE_DDR_FREQ_REQ_ACK) == 0, 3))
 		DRM_ERROR("timed out waiting for Punit DDR DVFS request\n");
 
+	iosf_mbi_punit_release();
 	mutex_unlock(&dev_priv->rps.hw_lock);
 }
 
@@ -311,6 +315,8 @@ static void chv_set_memory_pm5(struct drm_i915_private *dev_priv, bool enable)
 	u32 val;
 
 	mutex_lock(&dev_priv->rps.hw_lock);
+	/* Claim the pmic bus on systems where the punit shares the pmic bus */
+	iosf_mbi_punit_acquire();
 
 	val = vlv_punit_read(dev_priv, PUNIT_REG_DSPFREQ);
 	if (enable)
@@ -319,6 +325,7 @@ static void chv_set_memory_pm5(struct drm_i915_private *dev_priv, bool enable)
 		val &= ~DSP_MAXFIFO_PM5_ENABLE;
 	vlv_punit_write(dev_priv, PUNIT_REG_DSPFREQ, val);
 
+	iosf_mbi_punit_release();
 	mutex_unlock(&dev_priv->rps.hw_lock);
 }
 
@@ -4565,6 +4572,8 @@ void vlv_wm_get_hw_state(struct drm_device *dev)
 
 	if (IS_CHERRYVIEW(dev_priv)) {
 		mutex_lock(&dev_priv->rps.hw_lock);
+		/* Claim the pmic on systems where the punit shares its bus */
+		iosf_mbi_punit_acquire();
 
 		val = vlv_punit_read(dev_priv, PUNIT_REG_DSPFREQ);
 		if (val & DSP_MAXFIFO_PM5_ENABLE)
@@ -4594,6 +4603,7 @@ void vlv_wm_get_hw_state(struct drm_device *dev)
 				wm->level = VLV_WM_LEVEL_DDR_DVFS;
 		}
 
+		iosf_mbi_punit_release();
 		mutex_unlock(&dev_priv->rps.hw_lock);
 	}
 
@@ -5000,7 +5010,10 @@ static int valleyview_set_rps(struct drm_i915_private *dev_priv, u8 val)
 	I915_WRITE(GEN6_PMINTRMSK, gen6_rps_pm_mask(dev_priv, val));
 
 	if (val != dev_priv->rps.cur_freq) {
+		/* Claim the pmic on systems where the punit shares its bus */
+		iosf_mbi_punit_acquire();
 		err = vlv_punit_write(dev_priv, PUNIT_REG_GPU_FREQ_REQ, val);
+		iosf_mbi_punit_release();
 		if (err)
 			return err;
 
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 66aa1bb..4704b40 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -28,6 +28,7 @@
 
 #include <linux/pm_runtime.h>
 #include <linux/vgaarb.h>
+#include <asm/iosf_mbi.h>
 
 #include "i915_drv.h"
 #include "intel_drv.h"
@@ -1027,6 +1028,9 @@ static void vlv_set_power_well(struct drm_i915_private *dev_priv,
 	if (COND)
 		goto out;
 
+	/* Claim the pmic bus on systems where the punit shares the pmic bus */
+	iosf_mbi_punit_acquire();
+
 	ctrl = vlv_punit_read(dev_priv, PUNIT_REG_PWRGT_CTRL);
 	ctrl &= ~mask;
 	ctrl |= state;
@@ -1037,6 +1041,8 @@ static void vlv_set_power_well(struct drm_i915_private *dev_priv,
 			  state,
 			  vlv_punit_read(dev_priv, PUNIT_REG_PWRGT_CTRL));
 
+	iosf_mbi_punit_release();
+
 #undef COND
 
 out:
@@ -1643,6 +1649,9 @@ static void chv_set_pipe_power_well(struct drm_i915_private *dev_priv,
 	if (COND)
 		goto out;
 
+	/* Claim the pmic bus on systems where the punit shares the pmic bus */
+	iosf_mbi_punit_acquire();
+
 	ctrl = vlv_punit_read(dev_priv, PUNIT_REG_DSPFREQ);
 	ctrl &= ~DP_SSC_MASK(pipe);
 	ctrl |= enable ? DP_SSC_PWR_ON(pipe) : DP_SSC_PWR_GATE(pipe);
@@ -1653,6 +1662,8 @@ static void chv_set_pipe_power_well(struct drm_i915_private *dev_priv,
 			  state,
 			  vlv_punit_read(dev_priv, PUNIT_REG_DSPFREQ));
 
+	iosf_mbi_punit_release();
+
 #undef COND
 
 out:
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: entering the error case of i2c-designware with a timeout at probe
  2017-03-21 12:57 ` Max Staudt
@ 2017-03-21 13:01   ` Hans de Goede
  2017-03-21 13:36     ` Oliver Neukum
  0 siblings, 1 reply; 10+ messages in thread
From: Hans de Goede @ 2017-03-21 13:01 UTC (permalink / raw)
  To: Max Staudt, Oliver Neukum; +Cc: linux-i2c

Hi,

On 21-03-17 13:57, Max Staudt wrote:
> On 03/21/2017 01:52 PM, Oliver Neukum wrote:
>> Hi Hans,
>>
>> we found on our test systems with a bit of experimentation,
>> that actually running into the timeout is bound to hang the whole
>> system within only a few seconds.
>> I was wondering whether the error handling needs to be changed.
>
> In other words, whether we should rather wait for lock acquisition,
> unconditionally. No timeout, just wait. That's what our hardware
> seems to need.
>
> It feels like once the lock has been requested by the Linux driver,
> we can't cancel that request and have to actually follow through
> with accepting the lock and only giving it back after that.
> Resetting the "request" bit to 0 as it is done now doesn't work as
> it leads to the hung system sometime soon after that.

Hmm, interesting theory. I would say give it a test and if it
helps then maybe increase the timeouts to say 10 seconds or
such, so that e.g. on poweroff we at least report an error
rather then just sitting there ?

Regards,

Hans

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: entering the error case of i2c-designware with a timeout at probe
  2017-03-21 13:01   ` Hans de Goede
@ 2017-03-21 13:36     ` Oliver Neukum
  2017-03-21 13:55       ` Hans de Goede
  0 siblings, 1 reply; 10+ messages in thread
From: Oliver Neukum @ 2017-03-21 13:36 UTC (permalink / raw)
  To: Hans de Goede, Max Staudt; +Cc: linux-i2c

Am Dienstag, den 21.03.2017, 14:01 +0100 schrieb Hans de Goede:
> Hi,
> 
> On 21-03-17 13:57, Max Staudt wrote:
> > 
> > On 03/21/2017 01:52 PM, Oliver Neukum wrote:
> > > 
> > > Hi Hans,
> > > 
> > > we found on our test systems with a bit of experimentation,
> > > that actually running into the timeout is bound to hang the whole
> > > system within only a few seconds.
> > > I was wondering whether the error handling needs to be changed.
> > 
> > In other words, whether we should rather wait for lock acquisition,
> > unconditionally. No timeout, just wait. That's what our hardware
> > seems to need.
> > 
> > It feels like once the lock has been requested by the Linux driver,
> > we can't cancel that request and have to actually follow through
> > with accepting the lock and only giving it back after that.
> > Resetting the "request" bit to 0 as it is done now doesn't work as
> > it leads to the hung system sometime soon after that.
> 
> Hmm, interesting theory. I would say give it a test and if it
> helps then maybe increase the timeouts to say 10 seconds or
> such, so that e.g. on poweroff we at least report an error
> rather then just sitting there ?

I did some testing. Unconditionally taking the error path without waiting
for the semaphore crashes or freezes the machine in about 3/4 of all
tests.
As I said, with a timeout of 500ms, this issue is not seen.
Do you have reliable information that the error handling works
on BayTrail?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: entering the error case of i2c-designware with a timeout at probe
  2017-03-21 13:36     ` Oliver Neukum
@ 2017-03-21 13:55       ` Hans de Goede
  2017-03-21 14:05         ` Oliver Neukum
  0 siblings, 1 reply; 10+ messages in thread
From: Hans de Goede @ 2017-03-21 13:55 UTC (permalink / raw)
  To: Oliver Neukum, Max Staudt; +Cc: linux-i2c

Hi,

On 21-03-17 14:36, Oliver Neukum wrote:
> Am Dienstag, den 21.03.2017, 14:01 +0100 schrieb Hans de Goede:
>> Hi,
>>
>> On 21-03-17 13:57, Max Staudt wrote:
>>>
>>> On 03/21/2017 01:52 PM, Oliver Neukum wrote:
>>>>
>>>> Hi Hans,
>>>>
>>>> we found on our test systems with a bit of experimentation,
>>>> that actually running into the timeout is bound to hang the whole
>>>> system within only a few seconds.
>>>> I was wondering whether the error handling needs to be changed.
>>>
>>> In other words, whether we should rather wait for lock acquisition,
>>> unconditionally. No timeout, just wait. That's what our hardware
>>> seems to need.
>>>
>>> It feels like once the lock has been requested by the Linux driver,
>>> we can't cancel that request and have to actually follow through
>>> with accepting the lock and only giving it back after that.
>>> Resetting the "request" bit to 0 as it is done now doesn't work as
>>> it leads to the hung system sometime soon after that.
>>
>> Hmm, interesting theory. I would say give it a test and if it
>> helps then maybe increase the timeouts to say 10 seconds or
>> such, so that e.g. on poweroff we at least report an error
>> rather then just sitting there ?
>
> I did some testing. Unconditionally taking the error path without waiting
> for the semaphore crashes or freezes the machine in about 3/4 of all
> tests.
> As I said, with a timeout of 500ms, this issue is not seen.

Ah ok, so that patch is enough to fix this ? Then as I already
said I'm fine with that patch, needing long timeouts unfortunately
seems normal when dealing with embedded micro-controllers, I've
seen the same with e.g. ACPI embedded-controllers.

> Do you have reliable information that the error handling works
> on BayTrail?

No, Bay Trail actually is known to not always be stable with Linux.

Regards,

Hans

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: entering the error case of i2c-designware with a timeout at probe
  2017-03-21 13:55       ` Hans de Goede
@ 2017-03-21 14:05         ` Oliver Neukum
  2017-03-21 14:48           ` Hans de Goede
  0 siblings, 1 reply; 10+ messages in thread
From: Oliver Neukum @ 2017-03-21 14:05 UTC (permalink / raw)
  To: Hans de Goede, Max Staudt; +Cc: linux-i2c

Am Dienstag, den 21.03.2017, 14:55 +0100 schrieb Hans de Goede:
> Hi,
> 
> On 21-03-17 14:36, Oliver Neukum wrote:
> > 
> > Am Dienstag, den 21.03.2017, 14:01 +0100 schrieb Hans de Goede:
> > > 
> > > Hi,
> > > 
> > > On 21-03-17 13:57, Max Staudt wrote:

Hello,

> > > > In other words, whether we should rather wait for lock acquisition,
> > > > unconditionally. No timeout, just wait. That's what our hardware
> > > > seems to need.
> > > > 
> > > > It feels like once the lock has been requested by the Linux driver,
> > > > we can't cancel that request and have to actually follow through
> > > > with accepting the lock and only giving it back after that.
> > > > Resetting the "request" bit to 0 as it is done now doesn't work as
> > > > it leads to the hung system sometime soon after that.
> > > 
> > > Hmm, interesting theory. I would say give it a test and if it
> > > helps then maybe increase the timeouts to say 10 seconds or
> > > such, so that e.g. on poweroff we at least report an error
> > > rather then just sitting there ?
> > 
> > I did some testing. Unconditionally taking the error path without waiting
> > for the semaphore crashes or freezes the machine in about 3/4 of all
> > tests.
> > As I said, with a timeout of 500ms, this issue is not seen.
> 
> Ah ok, so that patch is enough to fix this ? Then as I already

Yes, it is enough.

> said I'm fine with that patch, needing long timeouts unfortunately
> seems normal when dealing with embedded micro-controllers, I've
> seen the same with e.g. ACPI embedded-controllers.

I am quite uncomfortable with code in the kernel that will crash
the machine if it ever runs. Yet I am also uncomfortable with code
that would run forever.

> > Do you have reliable information that the error handling works
> > on BayTrail?
> 
> No, Bay Trail actually is known to not always be stable with Linux.

So this code is best effort just in case?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: entering the error case of i2c-designware with a timeout at probe
  2017-03-21 14:05         ` Oliver Neukum
@ 2017-03-21 14:48           ` Hans de Goede
  2017-03-21 15:37             ` Oliver Neukum
  0 siblings, 1 reply; 10+ messages in thread
From: Hans de Goede @ 2017-03-21 14:48 UTC (permalink / raw)
  To: Oliver Neukum, Max Staudt; +Cc: linux-i2c

Hi,

On 21-03-17 15:05, Oliver Neukum wrote:
> Am Dienstag, den 21.03.2017, 14:55 +0100 schrieb Hans de Goede:
>> Hi,
>>
>> On 21-03-17 14:36, Oliver Neukum wrote:
>>>
>>> Am Dienstag, den 21.03.2017, 14:01 +0100 schrieb Hans de Goede:
>>>>
>>>> Hi,
>>>>
>>>> On 21-03-17 13:57, Max Staudt wrote:
>
> Hello,
>
>>>>> In other words, whether we should rather wait for lock acquisition,
>>>>> unconditionally. No timeout, just wait. That's what our hardware
>>>>> seems to need.
>>>>>
>>>>> It feels like once the lock has been requested by the Linux driver,
>>>>> we can't cancel that request and have to actually follow through
>>>>> with accepting the lock and only giving it back after that.
>>>>> Resetting the "request" bit to 0 as it is done now doesn't work as
>>>>> it leads to the hung system sometime soon after that.
>>>>
>>>> Hmm, interesting theory. I would say give it a test and if it
>>>> helps then maybe increase the timeouts to say 10 seconds or
>>>> such, so that e.g. on poweroff we at least report an error
>>>> rather then just sitting there ?
>>>
>>> I did some testing. Unconditionally taking the error path without waiting
>>> for the semaphore crashes or freezes the machine in about 3/4 of all
>>> tests.
>>> As I said, with a timeout of 500ms, this issue is not seen.
>>
>> Ah ok, so that patch is enough to fix this ? Then as I already
>
> Yes, it is enough.
>
>> said I'm fine with that patch, needing long timeouts unfortunately
>> seems normal when dealing with embedded micro-controllers, I've
>> seen the same with e.g. ACPI embedded-controllers.
>
> I am quite uncomfortable with code in the kernel that will crash
> the machine if it ever runs. Yet I am also uncomfortable with code
> that would run forever.

That is exactly how I feel, I did not realize (yet) that taking
the error path would always cause a freeze later, I've been assuming
that the timeout was caused by the bus already being stuck, not
that the timeout would cause the bus to get stuck because a
semaphore request must be followed through on.

If your theory is right we may well want to bump up the timeout
to say 2 or 3 seconds.

>>> Do you have reliable information that the error handling works
>>> on BayTrail?
>>
>> No, Bay Trail actually is known to not always be stable with Linux.
>
> So this code is best effort just in case?

I would more call it work in progress, people are working on
making Baytrail reliable and this code is part of that effort,
but there are still some bugs lurking somewhere.

Regards,

Hans

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: entering the error case of i2c-designware with a timeout at probe
  2017-03-21 14:48           ` Hans de Goede
@ 2017-03-21 15:37             ` Oliver Neukum
  2017-03-21 15:40               ` Hans de Goede
  0 siblings, 1 reply; 10+ messages in thread
From: Oliver Neukum @ 2017-03-21 15:37 UTC (permalink / raw)
  To: Hans de Goede, Max Staudt; +Cc: linux-i2c

Am Dienstag, den 21.03.2017, 15:48 +0100 schrieb Hans de Goede:
> Hi,
> 
> On 21-03-17 15:05, Oliver Neukum wrote:

Hi,

> > I am quite uncomfortable with code in the kernel that will crash
> > the machine if it ever runs. Yet I am also uncomfortable with code
> > that would run forever.
> 
> That is exactly how I feel, I did not realize (yet) that taking
> the error path would always cause a freeze later, I've been assuming
> that the timeout was caused by the bus already being stuck, not
> that the timeout would cause the bus to get stuck because a
> semaphore request must be followed through on.

Those options are not mutually exclusive.

> If your theory is right we may well want to bump up the timeout
> to say 2 or 3 seconds.

We never saw a failure with 500ms. That is pretty solid.
Yet a true error would likely have catastrophic results.
It seems to me that the problem is not when to proceed
to error handling, but how the error is handled.
Do you have any documentation on that?
Should we forcibly set the semaphore to the state that signifies
a successful take over?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: entering the error case of i2c-designware with a timeout at probe
  2017-03-21 15:37             ` Oliver Neukum
@ 2017-03-21 15:40               ` Hans de Goede
  0 siblings, 0 replies; 10+ messages in thread
From: Hans de Goede @ 2017-03-21 15:40 UTC (permalink / raw)
  To: Oliver Neukum, Max Staudt; +Cc: linux-i2c

Hi,

On 21-03-17 16:37, Oliver Neukum wrote:
> Am Dienstag, den 21.03.2017, 15:48 +0100 schrieb Hans de Goede:
>> Hi,
>>
>> On 21-03-17 15:05, Oliver Neukum wrote:
>
> Hi,
>
>>> I am quite uncomfortable with code in the kernel that will crash
>>> the machine if it ever runs. Yet I am also uncomfortable with code
>>> that would run forever.
>>
>> That is exactly how I feel, I did not realize (yet) that taking
>> the error path would always cause a freeze later, I've been assuming
>> that the timeout was caused by the bus already being stuck, not
>> that the timeout would cause the bus to get stuck because a
>> semaphore request must be followed through on.
>
> Those options are not mutually exclusive.
>
>> If your theory is right we may well want to bump up the timeout
>> to say 2 or 3 seconds.
>
> We never saw a failure with 500ms. That is pretty solid.
> Yet a true error would likely have catastrophic results.
> It seems to me that the problem is not when to proceed
> to error handling, but how the error is handled.
> Do you have any documentation on that?

Nope I have no documentation at all, just some experience
from poking at the hw.

[hans@shalem linux]$ scripts/get_maintainer.pl -f drivers/i2c/busses/i2c-designware-baytrail.c
Jarkko Nikula <jarkko.nikula@linux.intel.com> (maintainer:SYNOPSYS DESIGNWARE I2C DRIVER)
Andy Shevchenko <andriy.shevchenko@linux.intel.com> (reviewer:SYNOPSYS DESIGNWARE I2C DRIVER)
Mika Westerberg <mika.westerberg@linux.intel.com> (reviewer:SYNOPSYS DESIGNWARE I2C DRIVER)
Wolfram Sang <wsa@the-dreams.de> (maintainer:I2C SUBSYSTEM)
linux-i2c@vger.kernel.org (open list:SYNOPSYS DESIGNWARE I2C DRIVER)
linux-kernel@vger.kernel.org (open list)

There are 3 persons in there from Intel, you may want to
contact them about this.

> Should we forcibly set the semaphore to the state that signifies
> a successful take over?

Regards,

Hans

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-03-21 15:40 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-21 12:52 entering the error case of i2c-designware with a timeout at probe Oliver Neukum
2017-03-21 12:57 ` Max Staudt
2017-03-21 13:01   ` Hans de Goede
2017-03-21 13:36     ` Oliver Neukum
2017-03-21 13:55       ` Hans de Goede
2017-03-21 14:05         ` Oliver Neukum
2017-03-21 14:48           ` Hans de Goede
2017-03-21 15:37             ` Oliver Neukum
2017-03-21 15:40               ` Hans de Goede
2017-03-21 13:00 ` Hans de Goede

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.