All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nfit: add a module parameter to ignore ars errors
@ 2016-01-04 22:34 Vishal Verma
  2016-01-06 17:12 ` Linda Knippers
  0 siblings, 1 reply; 6+ messages in thread
From: Vishal Verma @ 2016-01-04 22:34 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Vishal Verma, linux-acpi, Dan Williams, Jeff Moyer, Linda Knippers

Normally, if a platform does not advertise support for Address Range
Scrub (ARS), we skip it. But if ARS is advertised, it is expected to
always succeed. If it fails, we normally fail initialization at that
point.

Add a module parameter to nfit that lets it ignore ARS failures and
continue with initialization for debugging.

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---

This applies on top of both of the previous error handling series
(badblocks and libnvdimm poison list). The tree at:
https://git.kernel.org/cgit/linux/kernel/git/vishal/nvdimm.git/log/?h=err_handling_latest
has been updated with this patch.

 drivers/acpi/nfit.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
index ad6d8c6..0a152f1 100644
--- a/drivers/acpi/nfit.c
+++ b/drivers/acpi/nfit.c
@@ -34,6 +34,10 @@ static bool force_enable_dimms;
 module_param(force_enable_dimms, bool, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(force_enable_dimms, "Ignore _STA (ACPI DIMM device) status");
 
+static bool ignore_ars;
+module_param(ignore_ars, bool, S_IRUGO|S_IWUSR);
+MODULE_PARM_DESC(ignore_ars, "Ignore ARS (Address Range Scrub) failures");
+
 struct nfit_table_prev {
 	struct list_head spas;
 	struct list_head memdevs;
@@ -1786,7 +1790,10 @@ static int acpi_nfit_register_region(struct acpi_nfit_desc *acpi_desc,
 			dev_err(acpi_desc->dev,
 				"error while performing ARS to find poison: %d\n",
 				rc);
-			return rc;
+			if (ignore_ars)
+				; /* continue initialization */
+			else
+				return rc;
 		}
 		if (!nvdimm_pmem_region_create(nvdimm_bus, ndr_desc))
 			return -ENOMEM;
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] nfit: add a module parameter to ignore ars errors
  2016-01-04 22:34 [PATCH] nfit: add a module parameter to ignore ars errors Vishal Verma
@ 2016-01-06 17:12 ` Linda Knippers
  2016-01-07  3:01   ` Vishal Verma
  0 siblings, 1 reply; 6+ messages in thread
From: Linda Knippers @ 2016-01-06 17:12 UTC (permalink / raw)
  To: Vishal Verma, linux-nvdimm; +Cc: linux-acpi, Dan Williams, Jeff Moyer

On 1/4/2016 5:34 PM, Vishal Verma wrote:
> Normally, if a platform does not advertise support for Address Range
> Scrub (ARS), we skip it. But if ARS is advertised, it is expected to
> always succeed. If it fails, we normally fail initialization at that
> point.
> 
> Add a module parameter to nfit that lets it ignore ARS failures and
> continue with initialization for debugging.

Could ARS be so broken that you might want to just ignore it altogether
and not even make the requests?

-- ljk
> 
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> ---
> 
> This applies on top of both of the previous error handling series
> (badblocks and libnvdimm poison list). The tree at:
> https://git.kernel.org/cgit/linux/kernel/git/vishal/nvdimm.git/log/?h=err_handling_latest
> has been updated with this patch.
> 
>  drivers/acpi/nfit.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
> index ad6d8c6..0a152f1 100644
> --- a/drivers/acpi/nfit.c
> +++ b/drivers/acpi/nfit.c
> @@ -34,6 +34,10 @@ static bool force_enable_dimms;
>  module_param(force_enable_dimms, bool, S_IRUGO|S_IWUSR);
>  MODULE_PARM_DESC(force_enable_dimms, "Ignore _STA (ACPI DIMM device) status");
>  
> +static bool ignore_ars;
> +module_param(ignore_ars, bool, S_IRUGO|S_IWUSR);
> +MODULE_PARM_DESC(ignore_ars, "Ignore ARS (Address Range Scrub) failures");
> +
>  struct nfit_table_prev {
>  	struct list_head spas;
>  	struct list_head memdevs;
> @@ -1786,7 +1790,10 @@ static int acpi_nfit_register_region(struct acpi_nfit_desc *acpi_desc,
>  			dev_err(acpi_desc->dev,
>  				"error while performing ARS to find poison: %d\n",
>  				rc);
> -			return rc;
> +			if (ignore_ars)
> +				; /* continue initialization */
> +			else
> +				return rc;
>  		}
>  		if (!nvdimm_pmem_region_create(nvdimm_bus, ndr_desc))
>  			return -ENOMEM;
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nfit: add a module parameter to ignore ars errors
  2016-01-06 17:12 ` Linda Knippers
@ 2016-01-07  3:01   ` Vishal Verma
  2016-01-07  5:34     ` Dan Williams
  0 siblings, 1 reply; 6+ messages in thread
From: Vishal Verma @ 2016-01-07  3:01 UTC (permalink / raw)
  To: Linda Knippers, Vishal Verma, linux-nvdimm; +Cc: linux-acpi

On Wed, 2016-01-06 at 12:12 -0500, Linda Knippers wrote:
> On 1/4/2016 5:34 PM, Vishal Verma wrote:
> > Normally, if a platform does not advertise support for Address
> > Range
> > Scrub (ARS), we skip it. But if ARS is advertised, it is expected
> > to
> > always succeed. If it fails, we normally fail initialization at
> > that
> > point.
> > 
> > Add a module parameter to nfit that lets it ignore ARS failures and
> > continue with initialization for debugging.
> 
> Could ARS be so broken that you might want to just ignore it
> altogether
> and not even make the requests?
> 

That is a possibility, and I considered it, but I thought it might be
better to see how it fails and then just ignore the errors..
It boils down to how much we trust the firmware, and hopefully if it
advertises ARS as implemented, it should not be completely broken..

Dan, thoughts?


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nfit: add a module parameter to ignore ars errors
  2016-01-07  3:01   ` Vishal Verma
@ 2016-01-07  5:34     ` Dan Williams
  2016-01-07 21:31       ` Linda Knippers
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Williams @ 2016-01-07  5:34 UTC (permalink / raw)
  To: Vishal Verma; +Cc: Linda Knippers, Vishal Verma, linux-nvdimm, Linux ACPI

On Wed, Jan 6, 2016 at 7:01 PM, Vishal Verma <vishal@kernel.org> wrote:
> On Wed, 2016-01-06 at 12:12 -0500, Linda Knippers wrote:
>> On 1/4/2016 5:34 PM, Vishal Verma wrote:
>> > Normally, if a platform does not advertise support for Address
>> > Range
>> > Scrub (ARS), we skip it. But if ARS is advertised, it is expected
>> > to
>> > always succeed. If it fails, we normally fail initialization at
>> > that
>> > point.
>> >
>> > Add a module parameter to nfit that lets it ignore ARS failures and
>> > continue with initialization for debugging.
>>
>> Could ARS be so broken that you might want to just ignore it
>> altogether
>> and not even make the requests?
>>
>
> That is a possibility, and I considered it, but I thought it might be
> better to see how it fails and then just ignore the errors..
> It boils down to how much we trust the firmware, and hopefully if it
> advertises ARS as implemented, it should not be completely broken..
>
> Dan, thoughts?
>

Hmm, once we add plumbing for bad block clearing / setting we'll have
the tools to workaround firmware with untrusted ars results.  i.e.
just manually correct false positive / negative entries.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nfit: add a module parameter to ignore ars errors
  2016-01-07  5:34     ` Dan Williams
@ 2016-01-07 21:31       ` Linda Knippers
  2016-01-07 22:07         ` Dan Williams
  0 siblings, 1 reply; 6+ messages in thread
From: Linda Knippers @ 2016-01-07 21:31 UTC (permalink / raw)
  To: Dan Williams, Vishal Verma; +Cc: Vishal Verma, linux-nvdimm, Linux ACPI

On 1/7/2016 12:34 AM, Dan Williams wrote:
> On Wed, Jan 6, 2016 at 7:01 PM, Vishal Verma <vishal@kernel.org> wrote:
>> On Wed, 2016-01-06 at 12:12 -0500, Linda Knippers wrote:
>>> On 1/4/2016 5:34 PM, Vishal Verma wrote:
>>>> Normally, if a platform does not advertise support for Address
>>>> Range
>>>> Scrub (ARS), we skip it. But if ARS is advertised, it is expected
>>>> to
>>>> always succeed. If it fails, we normally fail initialization at
>>>> that
>>>> point.
>>>>
>>>> Add a module parameter to nfit that lets it ignore ARS failures and
>>>> continue with initialization for debugging.
>>>
>>> Could ARS be so broken that you might want to just ignore it
>>> altogether
>>> and not even make the requests?
>>>
>>
>> That is a possibility, and I considered it, but I thought it might be
>> better to see how it fails and then just ignore the errors..
>> It boils down to how much we trust the firmware, and hopefully if it
>> advertises ARS as implemented, it should not be completely broken..
>>
>> Dan, thoughts?
>>
> 
> Hmm, once we add plumbing for bad block clearing / setting we'll have
> the tools to workaround firmware with untrusted ars results.  i.e.
> just manually correct false positive / negative entries.

I was more worried about places where the code is looping waiting
for commands to complete and what happens with buggy firmware
but I've now commented on that patch.  Related to the parameter,
if we think we need to account for buggy firmware, we could be
vulnerable in more places this.

-- ljk

> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nfit: add a module parameter to ignore ars errors
  2016-01-07 21:31       ` Linda Knippers
@ 2016-01-07 22:07         ` Dan Williams
  0 siblings, 0 replies; 6+ messages in thread
From: Dan Williams @ 2016-01-07 22:07 UTC (permalink / raw)
  To: Linda Knippers; +Cc: Vishal Verma, Vishal Verma, linux-nvdimm, Linux ACPI

On Thu, Jan 7, 2016 at 1:31 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
> On 1/7/2016 12:34 AM, Dan Williams wrote:
>> On Wed, Jan 6, 2016 at 7:01 PM, Vishal Verma <vishal@kernel.org> wrote:
>>> On Wed, 2016-01-06 at 12:12 -0500, Linda Knippers wrote:
>>>> On 1/4/2016 5:34 PM, Vishal Verma wrote:
>>>>> Normally, if a platform does not advertise support for Address
>>>>> Range
>>>>> Scrub (ARS), we skip it. But if ARS is advertised, it is expected
>>>>> to
>>>>> always succeed. If it fails, we normally fail initialization at
>>>>> that
>>>>> point.
>>>>>
>>>>> Add a module parameter to nfit that lets it ignore ARS failures and
>>>>> continue with initialization for debugging.
>>>>
>>>> Could ARS be so broken that you might want to just ignore it
>>>> altogether
>>>> and not even make the requests?
>>>>
>>>
>>> That is a possibility, and I considered it, but I thought it might be
>>> better to see how it fails and then just ignore the errors..
>>> It boils down to how much we trust the firmware, and hopefully if it
>>> advertises ARS as implemented, it should not be completely broken..
>>>
>>> Dan, thoughts?
>>>
>>
>> Hmm, once we add plumbing for bad block clearing / setting we'll have
>> the tools to workaround firmware with untrusted ars results.  i.e.
>> just manually correct false positive / negative entries.
>
> I was more worried about places where the code is looping waiting
> for commands to complete and what happens with buggy firmware
> but I've now commented on that patch.  Related to the parameter,
> if we think we need to account for buggy firmware, we could be
> vulnerable in more places this.

Yes, lets wait and see rather than pre-emptively working around
potentially bad firmware.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-01-07 22:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-04 22:34 [PATCH] nfit: add a module parameter to ignore ars errors Vishal Verma
2016-01-06 17:12 ` Linda Knippers
2016-01-07  3:01   ` Vishal Verma
2016-01-07  5:34     ` Dan Williams
2016-01-07 21:31       ` Linda Knippers
2016-01-07 22:07         ` Dan Williams

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.