All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brijesh Singh <brijeshkumar.singh@amd.com>
To: Andre Przywara <andre.przywara@arm.com>,
	Hanjun Guo <guohanjun@huawei.com>,
	<linux-arm-kernel@lists.infradead.org>
Cc: <brijeshkumar.singh@amd.com>, <robh+dt@kernel.org>,
	<pawel.moll@arm.com>, <mark.rutland@arm.com>,
	<ijc+devicetree@hellion.org.uk>, <galak@codeaurora.org>,
	<dougthompson@xmission.com>, <bp@alien8.de>,
	<mchehab@osg.samsung.com>, <devicetree@vger.kernel.org>,
	<arnd@arndb.de>, <linux-kernel@vger.kernel.org>,
	<linux-edac@vger.kernel.org>, Huxinwei <huxinwei@huawei.com>
Subject: Re: [PATCH v2] EDAC: Add ARM64 EDAC
Date: Fri, 23 Oct 2015 12:58:29 -0500	[thread overview]
Message-ID: <562A7545.2050306@amd.com> (raw)
In-Reply-To: <562A033A.5050105@arm.com>


> So I checked the x86 code: the driver is always loaded as soon as the
> hardware is there (looking at PCI device IDs from the on-chip
> northbridge, for instance). The trick here is to have the Kconfig option
> defaulting to "=n", so a kernel builder would have to explicitly enable
> this. Android or embedded kernels wouldn't do this, for instance, while
> a server distribution would do.
> If a user doesn't want to be bothered with the driver, there is always
> the possibility of blacklisting the module.
> Setting a system policy is IMHO out of scope for a DT binding.
> 
Will update Kconfig to make it n by default.

>>> * Its possible that other SoC's might handle single-bit and double-bit errors differently compare to 
>>>   Seattle platform. In Seattle platform both errors are handled by firmware but if other SoC 
>>>   wants OS to handle these errors then they might need DT binding to provide the irq numbers etc.
> 
> What do you mean exactly with "firmware handles these errors"?
> What would the firmware do? I guess just logging the error and then
> possibly reset the register? How would this change the driver?
> 
On Seattle platform SoC generates a interrupt on both single bit and double bit error
and that interrupt is handled by firmware, so we don't need to do anything in the driver.
Driver just need to poll registers to log correctable errors (because they do not generate interrupt).
This very driver is doing exactly what we want. DT binding is not required.

But Hanjun's comment on very first patch hinted me that there is possibility that
SoC generate a interrupt on single bit and double bit but firmware does not handle it.
In those cases driver will need be extended to handle interrupt.

I will submit v3 for review with DT binding removed. We can revisit DT binding need in future.

>> I totally agree with you here,  thanks for putting them together.
>> Different SoCs may handle the error in different ways, we need
>> bindings to specialize them, irq number is a good example :)
> 
> But how does this affect this very driver, polling just those two registers?
> Where would the interrupt come into the game here? Where is the proposed
> DT binding for that interrupt?
> 
> AFAICT EL3 firmware handling errors would just hide this information
> from the driver, so if the f/w decides to "handle" uncorrectable ECC
> errors in a fatal way, there is nothing the driver could do anyway, right?
> 
> Can you sketch a concrete example where we would actually need the
> driver to know about the firmware capabilities?
> 
> Cheers,
> Andre.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

WARNING: multiple messages have this Message-ID (diff)
From: Brijesh Singh <brijeshkumar.singh@amd.com>
To: Andre Przywara <andre.przywara@arm.com>,
	Hanjun Guo <guohanjun@huawei.com>,
	linux-arm-kernel@lists.infradead.org
Cc: brijeshkumar.singh@amd.com, robh+dt@kernel.org,
	pawel.moll@arm.com, mark.rutland@arm.com,
	ijc+devicetree@hellion.org.uk, galak@codeaurora.org,
	dougthompson@xmission.com, bp@alien8.de, mchehab@osg.samsung.com,
	devicetree@vger.kernel.org, arnd@arndb.de,
	linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org,
	Huxinwei <huxinwei@huawei.com>
Subject: Re: [PATCH v2] EDAC: Add ARM64 EDAC
Date: Fri, 23 Oct 2015 12:58:29 -0500	[thread overview]
Message-ID: <562A7545.2050306@amd.com> (raw)
In-Reply-To: <562A033A.5050105@arm.com>


> So I checked the x86 code: the driver is always loaded as soon as the
> hardware is there (looking at PCI device IDs from the on-chip
> northbridge, for instance). The trick here is to have the Kconfig option
> defaulting to "=n", so a kernel builder would have to explicitly enable
> this. Android or embedded kernels wouldn't do this, for instance, while
> a server distribution would do.
> If a user doesn't want to be bothered with the driver, there is always
> the possibility of blacklisting the module.
> Setting a system policy is IMHO out of scope for a DT binding.
> 
Will update Kconfig to make it n by default.

>>> * Its possible that other SoC's might handle single-bit and double-bit errors differently compare to 
>>>   Seattle platform. In Seattle platform both errors are handled by firmware but if other SoC 
>>>   wants OS to handle these errors then they might need DT binding to provide the irq numbers etc.
> 
> What do you mean exactly with "firmware handles these errors"?
> What would the firmware do? I guess just logging the error and then
> possibly reset the register? How would this change the driver?
> 
On Seattle platform SoC generates a interrupt on both single bit and double bit error
and that interrupt is handled by firmware, so we don't need to do anything in the driver.
Driver just need to poll registers to log correctable errors (because they do not generate interrupt).
This very driver is doing exactly what we want. DT binding is not required.

But Hanjun's comment on very first patch hinted me that there is possibility that
SoC generate a interrupt on single bit and double bit but firmware does not handle it.
In those cases driver will need be extended to handle interrupt.

I will submit v3 for review with DT binding removed. We can revisit DT binding need in future.

>> I totally agree with you here,  thanks for putting them together.
>> Different SoCs may handle the error in different ways, we need
>> bindings to specialize them, irq number is a good example :)
> 
> But how does this affect this very driver, polling just those two registers?
> Where would the interrupt come into the game here? Where is the proposed
> DT binding for that interrupt?
> 
> AFAICT EL3 firmware handling errors would just hide this information
> from the driver, so if the f/w decides to "handle" uncorrectable ECC
> errors in a fatal way, there is nothing the driver could do anyway, right?
> 
> Can you sketch a concrete example where we would actually need the
> driver to know about the firmware capabilities?
> 
> Cheers,
> Andre.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

WARNING: multiple messages have this Message-ID (diff)
From: brijeshkumar.singh@amd.com (Brijesh Singh)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2] EDAC: Add ARM64 EDAC
Date: Fri, 23 Oct 2015 12:58:29 -0500	[thread overview]
Message-ID: <562A7545.2050306@amd.com> (raw)
In-Reply-To: <562A033A.5050105@arm.com>


> So I checked the x86 code: the driver is always loaded as soon as the
> hardware is there (looking at PCI device IDs from the on-chip
> northbridge, for instance). The trick here is to have the Kconfig option
> defaulting to "=n", so a kernel builder would have to explicitly enable
> this. Android or embedded kernels wouldn't do this, for instance, while
> a server distribution would do.
> If a user doesn't want to be bothered with the driver, there is always
> the possibility of blacklisting the module.
> Setting a system policy is IMHO out of scope for a DT binding.
> 
Will update Kconfig to make it n by default.

>>> * Its possible that other SoC's might handle single-bit and double-bit errors differently compare to 
>>>   Seattle platform. In Seattle platform both errors are handled by firmware but if other SoC 
>>>   wants OS to handle these errors then they might need DT binding to provide the irq numbers etc.
> 
> What do you mean exactly with "firmware handles these errors"?
> What would the firmware do? I guess just logging the error and then
> possibly reset the register? How would this change the driver?
> 
On Seattle platform SoC generates a interrupt on both single bit and double bit error
and that interrupt is handled by firmware, so we don't need to do anything in the driver.
Driver just need to poll registers to log correctable errors (because they do not generate interrupt).
This very driver is doing exactly what we want. DT binding is not required.

But Hanjun's comment on very first patch hinted me that there is possibility that
SoC generate a interrupt on single bit and double bit but firmware does not handle it.
In those cases driver will need be extended to handle interrupt.

I will submit v3 for review with DT binding removed. We can revisit DT binding need in future.

>> I totally agree with you here,  thanks for putting them together.
>> Different SoCs may handle the error in different ways, we need
>> bindings to specialize them, irq number is a good example :)
> 
> But how does this affect this very driver, polling just those two registers?
> Where would the interrupt come into the game here? Where is the proposed
> DT binding for that interrupt?
> 
> AFAICT EL3 firmware handling errors would just hide this information
> from the driver, so if the f/w decides to "handle" uncorrectable ECC
> errors in a fatal way, there is nothing the driver could do anyway, right?
> 
> Can you sketch a concrete example where we would actually need the
> driver to know about the firmware capabilities?
> 
> Cheers,
> Andre.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

  reply	other threads:[~2015-10-23 17:59 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-21 20:41 [PATCH v2] EDAC: Add ARM64 EDAC Brijesh Singh
2015-10-21 20:41 ` Brijesh Singh
2015-10-21 20:41 ` Brijesh Singh
2015-10-21 21:25 ` Mauro Carvalho Chehab
2015-10-21 21:25   ` Mauro Carvalho Chehab
2015-10-21 21:25   ` Mauro Carvalho Chehab
2015-10-22 18:47   ` Brijesh Singh
2015-10-22 18:47     ` Brijesh Singh
2015-10-22 18:47     ` Brijesh Singh
2015-10-21 23:52 ` Andre Przywara
2015-10-21 23:52   ` Andre Przywara
2015-10-21 23:52   ` Andre Przywara
2015-10-22 14:46   ` Brijesh Singh
2015-10-22 14:46     ` Brijesh Singh
2015-10-22 14:46     ` Brijesh Singh
2015-10-23  1:41     ` Hanjun Guo
2015-10-23  1:41       ` Hanjun Guo
2015-10-23  1:41       ` Hanjun Guo
2015-10-23  9:51       ` Andre Przywara
2015-10-23  9:51         ` Andre Przywara
2015-10-23  9:51         ` Andre Przywara
2015-10-23 17:58         ` Brijesh Singh [this message]
2015-10-23 17:58           ` Brijesh Singh
2015-10-23 17:58           ` Brijesh Singh
2015-10-24  2:36           ` Hanjun Guo
2015-10-24  2:36             ` Hanjun Guo
2015-10-24  2:36             ` Hanjun Guo
2015-10-23 16:58 ` Stephen Boyd
2015-10-23 16:58   ` Stephen Boyd
2015-10-26 12:46 ` Mark Rutland
2015-10-26 12:46   ` Mark Rutland
2015-10-26 12:46   ` Mark Rutland
2015-10-23  1:51 Stepan Moskovchenko
2015-10-23  3:07 ` Singh, Brijeshkumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=562A7545.2050306@amd.com \
    --to=brijeshkumar.singh@amd.com \
    --cc=andre.przywara@arm.com \
    --cc=arnd@arndb.de \
    --cc=bp@alien8.de \
    --cc=devicetree@vger.kernel.org \
    --cc=dougthompson@xmission.com \
    --cc=galak@codeaurora.org \
    --cc=guohanjun@huawei.com \
    --cc=huxinwei@huawei.com \
    --cc=ijc+devicetree@hellion.org.uk \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mchehab@osg.samsung.com \
    --cc=pawel.moll@arm.com \
    --cc=robh+dt@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.