All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] apci, nfit: DSM improvements
@ 2017-02-13 16:27 Linda Knippers
  2017-02-13 16:27 ` [PATCH 1/3] Allow all supported HPE DSM functions to be called Linda Knippers
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Linda Knippers @ 2017-02-13 16:27 UTC (permalink / raw)
  To: linux-nvdimm, dan.j.williams

The first two patches in this series are motivated by feedback
from distribution developers, test engineers, and management tool
developers.  We need the ability to test and support Linux and 
management tools on systems that support more than one DSM family
for NVDIMM-N.  

We also need the ability to test and support new functions without
requiring users to update their kernel.  These changes will also
facilate development as we move toward DSM standardization.
The ability to restrict DSM functions to the currently documented
set is still in place.

The third patch cleans up a nit.

Linda Knippers (3):
  Allow all supported HPE DSM functions to be called
  Allow specifying a default DSM family
  Remove unnecessary newline

 drivers/acpi/nfit/core.c | 34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

-- 
1.8.3.1

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-13 16:27 [PATCH 0/3] apci, nfit: DSM improvements Linda Knippers
@ 2017-02-13 16:27 ` Linda Knippers
  2017-02-13 17:28   ` Dan Williams
  2017-02-13 16:27 ` [PATCH 2/3] Allow specifying a default DSM family Linda Knippers
  2017-02-13 16:27 ` [PATCH 3/3] Remove unnecessary newline Linda Knippers
  2 siblings, 1 reply; 25+ messages in thread
From: Linda Knippers @ 2017-02-13 16:27 UTC (permalink / raw)
  To: linux-nvdimm, dan.j.williams

As it is today, we can't enable or test new functions in firmware without
changing the kernel.

With this patch we allow function 0 for the HPE DSM families,
as is already allowed with the MS family. We now only restrict
the functions to the currently documented set if the module parameter
"disable_vendor_specific" is set.  Since the module parameter
description says "Limit commands to the publicly specified set",
this approach seems correct.

Signed-off-by: Linda Knippers <linda.knippers@hpe.com>
---
 drivers/acpi/nfit/core.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 7361d00..b80d4d9 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1407,11 +1407,13 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
 		if (disable_vendor_specific)
 			dsm_mask &= ~(1 << ND_CMD_VENDOR);
 	} else if (nfit_mem->family == NVDIMM_FAMILY_HPE1) {
-		dsm_mask = 0x1c3c76;
+		dsm_mask = 0xffffffff;
+		if (disable_vendor_specific)
+			dsm_mask &= ~(0x1c3c77);
 	} else if (nfit_mem->family == NVDIMM_FAMILY_HPE2) {
-		dsm_mask = 0x1fe;
+		dsm_mask = 0xffffffff;
 		if (disable_vendor_specific)
-			dsm_mask &= ~(1 << 8);
+			dsm_mask &= ~(0x0ff);
 	} else if (nfit_mem->family == NVDIMM_FAMILY_MSFT) {
 		dsm_mask = 0xffffffff;
 	} else {
-- 
1.8.3.1

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/3] Allow specifying a default DSM family
  2017-02-13 16:27 [PATCH 0/3] apci, nfit: DSM improvements Linda Knippers
  2017-02-13 16:27 ` [PATCH 1/3] Allow all supported HPE DSM functions to be called Linda Knippers
@ 2017-02-13 16:27 ` Linda Knippers
  2017-02-13 16:27 ` [PATCH 3/3] Remove unnecessary newline Linda Knippers
  2 siblings, 0 replies; 25+ messages in thread
From: Linda Knippers @ 2017-02-13 16:27 UTC (permalink / raw)
  To: linux-nvdimm, dan.j.williams

Provide the ability to request a default DSM family. If it is not
supported, then fall back to the normal discovery order.

This is helpful for testing platforms that support multiple DSM families.
It will also allow administrators to request the DSM family that their
management tools support, which may not be the first one found using
the current discovery order.

Signed-off-by: Linda Knippers <linda.knippers@hpe.com>
---
 drivers/acpi/nfit/core.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index b80d4d9..e38773f 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -51,6 +51,12 @@
 MODULE_PARM_DESC(disable_vendor_specific,
 		"Limit commands to the publicly specified set\n");
 
+static int default_dsm_family = -1;
+module_param(default_dsm_family, int, S_IRUGO);
+MODULE_PARM_DESC(default_dsm_family,
+		"Try this DSM type first when identifying NVDIMM family");
+
+
 LIST_HEAD(acpi_descs);
 DEFINE_MUTEX(acpi_desc_lock);
 
@@ -1367,7 +1373,7 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
 	struct device *dev = acpi_desc->dev;
 	unsigned long dsm_mask;
 	const u8 *uuid;
-	int i;
+	int i = -1;
 
 	/* nfit test assumes 1:1 relationship between commands and dsms */
 	nfit_mem->dsm_mask = acpi_desc->dimm_cmd_force_en;
@@ -1395,10 +1401,23 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
 	 * Until standardization materializes we need to consider 4
 	 * different command sets.  Note, that checking for function0 (bit0)
 	 * tells us if any commands are reachable through this uuid.
+	 * First check for a requested default.
 	 */
-	for (i = NVDIMM_FAMILY_INTEL; i <= NVDIMM_FAMILY_MSFT; i++)
-		if (acpi_check_dsm(adev_dimm->handle, to_nfit_uuid(i), 1, 1))
-			break;
+	if (default_dsm_family >= NVDIMM_FAMILY_INTEL &&
+			default_dsm_family <= NVDIMM_FAMILY_MSFT) {
+		if (acpi_check_dsm(adev_dimm->handle,
+				to_nfit_uuid(default_dsm_family), 1, 1))
+			i = default_dsm_family;
+		else
+			dev_dbg(dev, "default_dsm_family %d not supported\n",
+				default_dsm_family);
+	}
+	if (i == -1) {
+		for (i = NVDIMM_FAMILY_INTEL; i <= NVDIMM_FAMILY_MSFT; i++)
+			if (acpi_check_dsm(adev_dimm->handle, to_nfit_uuid(i),
+					1, 1))
+				break;
+	}
 
 	/* limit the supported commands to those that are publicly documented */
 	nfit_mem->family = i;
-- 
1.8.3.1

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/3] Remove unnecessary newline
  2017-02-13 16:27 [PATCH 0/3] apci, nfit: DSM improvements Linda Knippers
  2017-02-13 16:27 ` [PATCH 1/3] Allow all supported HPE DSM functions to be called Linda Knippers
  2017-02-13 16:27 ` [PATCH 2/3] Allow specifying a default DSM family Linda Knippers
@ 2017-02-13 16:27 ` Linda Knippers
  2017-02-13 17:32   ` Dan Williams
  2 siblings, 1 reply; 25+ messages in thread
From: Linda Knippers @ 2017-02-13 16:27 UTC (permalink / raw)
  To: linux-nvdimm, dan.j.williams

Newline in MODULE_PARM_DESC messes up the modprobe output so remove it.

Signed-off-by: Linda Knippers <linda.knippers@hpe.com>
---
 drivers/acpi/nfit/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index e38773f..6c522c8 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -49,7 +49,7 @@
 static bool disable_vendor_specific;
 module_param(disable_vendor_specific, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_vendor_specific,
-		"Limit commands to the publicly specified set\n");
+		"Limit commands to the publicly specified set");
 
 static int default_dsm_family = -1;
 module_param(default_dsm_family, int, S_IRUGO);
-- 
1.8.3.1

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-13 16:27 ` [PATCH 1/3] Allow all supported HPE DSM functions to be called Linda Knippers
@ 2017-02-13 17:28   ` Dan Williams
  2017-02-13 18:17     ` Linda Knippers
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2017-02-13 17:28 UTC (permalink / raw)
  To: Linda Knippers; +Cc: linux-nvdimm

On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
> As it is today, we can't enable or test new functions in firmware without
> changing the kernel.
>
> With this patch we allow function 0 for the HPE DSM families,
> as is already allowed with the MS family. We now only restrict
> the functions to the currently documented set if the module parameter
> "disable_vendor_specific" is set.  Since the module parameter
> description says "Limit commands to the publicly specified set",
> this approach seems correct.
>

My concern is that this makes the kernel less strict by default. Given
this is only a test capability lets add a module option to override
the default dsm_mask. That would also seem to combine well with the
new 'default_dsm_family' option.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] Remove unnecessary newline
  2017-02-13 16:27 ` [PATCH 3/3] Remove unnecessary newline Linda Knippers
@ 2017-02-13 17:32   ` Dan Williams
  2017-02-13 18:02     ` Linda Knippers
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2017-02-13 17:32 UTC (permalink / raw)
  To: Linda Knippers; +Cc: linux-nvdimm

On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
> Newline in MODULE_PARM_DESC messes up the modprobe output so remove it.

Yes, looks good. When this "messes up" does that mean the output less
pretty, or do you know if it breaks scripts parsing the output? Just
want to be clear on the effects of this change.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] Remove unnecessary newline
  2017-02-13 17:32   ` Dan Williams
@ 2017-02-13 18:02     ` Linda Knippers
  0 siblings, 0 replies; 25+ messages in thread
From: Linda Knippers @ 2017-02-13 18:02 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

On 02/13/2017 12:32 PM, Dan Williams wrote:
> On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>> Newline in MODULE_PARM_DESC messes up the modprobe output so remove it.
> 
> Yes, looks good. When this "messes up" does that mean the output less
> pretty, or do you know if it breaks scripts parsing the output? Just
> want to be clear on the effects of this change.

If you do a 'modinfo nfit', it puts "(bool)" on a separate line.
That's unlike all the other parameters.  If someone is parsing the output
of modinfo, they may not get the type.

Without my patch, you get this:

parm:           force_enable_dimms:Ignore _STA (ACPI DIMM device) status (bool)
parm:           scrub_timeout:Initial scrub timeout in seconds (uint)
parm:           scrub_overflow_abort:Number of times we overflow ARS results before abort (uint)
parm:           disable_vendor_specific:Limit commands to the publicly specified set
(bool)
parm:           default_dsm_family:Try this DSM type first when identifying NVDIMM family (int)

instead of this:

parm:           force_enable_dimms:Ignore _STA (ACPI DIMM device) status (bool)
parm:           scrub_timeout:Initial scrub timeout in seconds (uint)
parm:           scrub_overflow_abort:Number of times we overflow ARS results before abort (uint)
parm:           disable_vendor_specific:Limit commands to the publicly specified set (bool)
parm:           default_dsm_family:Try this DSM type first when identifying NVDIMM family (int)


-- ljk
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-13 17:28   ` Dan Williams
@ 2017-02-13 18:17     ` Linda Knippers
  2017-02-13 18:30       ` Dan Williams
  0 siblings, 1 reply; 25+ messages in thread
From: Linda Knippers @ 2017-02-13 18:17 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

On 02/13/2017 12:28 PM, Dan Williams wrote:
> On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>> As it is today, we can't enable or test new functions in firmware without
>> changing the kernel.
>>
>> With this patch we allow function 0 for the HPE DSM families,
>> as is already allowed with the MS family. We now only restrict
>> the functions to the currently documented set if the module parameter
>> "disable_vendor_specific" is set.  Since the module parameter
>> description says "Limit commands to the publicly specified set",
>> this approach seems correct.
>>
> 
> My concern is that this makes the kernel less strict by default. Given
> this is only a test capability lets add a module option to override
> the default dsm_mask. 

This part isn't strictly a test capability.  It's also to allow older
kernels to be supported with newer NVDIMM hardware, firmware, and management tools.
We shouldn't need a customer to update their production kernel just to support
an NVDIMM management tool.

As for less secure by default, the default is to allow undocumented
functions of the "vendor specific" type.  How is the really different?

-- ljk

> That would also seem to combine well with the
> new 'default_dsm_family' option.


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-13 18:17     ` Linda Knippers
@ 2017-02-13 18:30       ` Dan Williams
  2017-02-15 23:19         ` Linda Knippers
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2017-02-13 18:30 UTC (permalink / raw)
  To: Linda Knippers; +Cc: linux-nvdimm

On Mon, Feb 13, 2017 at 10:17 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
> On 02/13/2017 12:28 PM, Dan Williams wrote:
>> On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>> As it is today, we can't enable or test new functions in firmware without
>>> changing the kernel.
>>>
>>> With this patch we allow function 0 for the HPE DSM families,
>>> as is already allowed with the MS family. We now only restrict
>>> the functions to the currently documented set if the module parameter
>>> "disable_vendor_specific" is set.  Since the module parameter
>>> description says "Limit commands to the publicly specified set",
>>> this approach seems correct.
>>>
>>
>> My concern is that this makes the kernel less strict by default. Given
>> this is only a test capability lets add a module option to override
>> the default dsm_mask.
>
> This part isn't strictly a test capability.  It's also to allow older
> kernels to be supported with newer NVDIMM hardware, firmware, and management tools.
> We shouldn't need a customer to update their production kernel just to support
> an NVDIMM management tool.
>
> As for less secure by default, the default is to allow undocumented
> functions of the "vendor specific" type.  How is the really different?
>

It's about pushing back on undocumented BIOS interfaces as much as
possible. The kernel has the documented set enabled by default,
including the documented vendor-specifc function code. There is the
option to disable the vendor-specific tunnel to restrict the kernel to
only allowing commands with publicly documented effects. With a new
"default_dsm_mask" parameter the default kernel policy can be
overridden by the system owner.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-13 18:30       ` Dan Williams
@ 2017-02-15 23:19         ` Linda Knippers
  2017-02-16  0:29           ` Dan Williams
  0 siblings, 1 reply; 25+ messages in thread
From: Linda Knippers @ 2017-02-15 23:19 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

On 02/13/2017 01:30 PM, Dan Williams wrote:
> On Mon, Feb 13, 2017 at 10:17 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>> On 02/13/2017 12:28 PM, Dan Williams wrote:
>>> On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>> As it is today, we can't enable or test new functions in firmware without
>>>> changing the kernel.
>>>>
>>>> With this patch we allow function 0 for the HPE DSM families,
>>>> as is already allowed with the MS family. We now only restrict
>>>> the functions to the currently documented set if the module parameter
>>>> "disable_vendor_specific" is set.  Since the module parameter
>>>> description says "Limit commands to the publicly specified set",
>>>> this approach seems correct.
>>>>
>>>
>>> My concern is that this makes the kernel less strict by default. Given
>>> this is only a test capability lets add a module option to override
>>> the default dsm_mask.
>>
>> This part isn't strictly a test capability.  It's also to allow older
>> kernels to be supported with newer NVDIMM hardware, firmware, and management tools.
>> We shouldn't need a customer to update their production kernel just to support
>> an NVDIMM management tool.
>>
>> As for less secure by default, the default is to allow undocumented
>> functions of the "vendor specific" type.  How is the really different?
>>
> 
> It's about pushing back on undocumented BIOS interfaces as much as
> possible. 

I'm not looking to propagate a bunch of undocumented interfaces.  I'm
looking to avoid having distros release new kernels and customers update
their systems because there's a new documented interface, especially when the only
thing that needs to change is the mask.  As you've experienced, the standardization
process takes a long time and I'm anticipating that between now and
eventual standardization, there will be at least one new DSM that's needed.
I'm sure we'll document it too, but that won't help a customer running an
older kernel.

> The kernel has the documented set enabled by default,
> including the documented vendor-specifc function code. 

Ok, so if I update my document to say that undefined values are vendor-specific,
we're good?  Or shall I define the unused values that way explicitly?

> There is the option to disable the vendor-specific tunnel to restrict the kernel to
> only allowing commands with publicly documented effects. 

My patch respects that option.

> With a new "default_dsm_mask" parameter the default kernel policy can be
> overridden by the system owner.

It can, although it means customers would have to add kernel parameters, which
they try to avoid.  I assume that's why the module parameter for a stricter
policy isn't the default since customers using Intel management tools would need to
add a parameter to allow the use of the vendor-specific function.

I'm not trying to be difficult here, but I really don't see the difference
between a documented function with undocumented behavior and an
undocumented function with undocumented behavior.

-- ljk


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-15 23:19         ` Linda Knippers
@ 2017-02-16  0:29           ` Dan Williams
  2017-02-16  0:45             ` Linda Knippers
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2017-02-16  0:29 UTC (permalink / raw)
  To: Linda Knippers; +Cc: linux-nvdimm

On Wed, Feb 15, 2017 at 3:19 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
> On 02/13/2017 01:30 PM, Dan Williams wrote:
>> On Mon, Feb 13, 2017 at 10:17 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>> On 02/13/2017 12:28 PM, Dan Williams wrote:
>>>> On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>>> As it is today, we can't enable or test new functions in firmware without
>>>>> changing the kernel.
>>>>>
>>>>> With this patch we allow function 0 for the HPE DSM families,
>>>>> as is already allowed with the MS family. We now only restrict
>>>>> the functions to the currently documented set if the module parameter
>>>>> "disable_vendor_specific" is set.  Since the module parameter
>>>>> description says "Limit commands to the publicly specified set",
>>>>> this approach seems correct.
>>>>>
>>>>
>>>> My concern is that this makes the kernel less strict by default. Given
>>>> this is only a test capability lets add a module option to override
>>>> the default dsm_mask.
>>>
>>> This part isn't strictly a test capability.  It's also to allow older
>>> kernels to be supported with newer NVDIMM hardware, firmware, and management tools.
>>> We shouldn't need a customer to update their production kernel just to support
>>> an NVDIMM management tool.
>>>
>>> As for less secure by default, the default is to allow undocumented
>>> functions of the "vendor specific" type.  How is the really different?
>>>
>>
>> It's about pushing back on undocumented BIOS interfaces as much as
>> possible.
>
> I'm not looking to propagate a bunch of undocumented interfaces.  I'm
> looking to avoid having distros release new kernels and customers update
> their systems because there's a new documented interface, especially when the only
> thing that needs to change is the mask.  As you've experienced, the standardization
> process takes a long time and I'm anticipating that between now and
> eventual standardization, there will be at least one new DSM that's needed.
> I'm sure we'll document it too, but that won't help a customer running an
> older kernel.
>
>> The kernel has the documented set enabled by default,
>> including the documented vendor-specifc function code.
>
> Ok, so if I update my document to say that undefined values are vendor-specific,
> we're good?  Or shall I define the unused values that way explicitly?
>
>> There is the option to disable the vendor-specific tunnel to restrict the kernel to
>> only allowing commands with publicly documented effects.
>
> My patch respects that option.
>
>> With a new "default_dsm_mask" parameter the default kernel policy can be
>> overridden by the system owner.
>
> It can, although it means customers would have to add kernel parameters, which
> they try to avoid.  I assume that's why the module parameter for a stricter
> policy isn't the default since customers using Intel management tools would need to
> add a parameter to allow the use of the vendor-specific function.
>
> I'm not trying to be difficult here, but I really don't see the difference
> between a documented function with undocumented behavior and an
> undocumented function with undocumented behavior.

The difference is that it is slightly more painful to require a kernel
module override. Disabling these commands by default has some effect
of pushing back on the creation of new interfaces. Customers are
already prepared to need new kernels for new enabling. It's a similar
coordination we do for new device-ids for drivers, but we still allow
for overriding the default via the "new_id" interface in sysfs.

Look at the reaction we got from the community the last time this
subject was brought up [1]. That reaction stems from the historical
divergence of storage management interfaces requiring Linux to have a
vendor-specific toolkit per device. Requiring a kernel change to make
a command supported by default is a good thing because it affords
ongoing opportunities to find commonalities between vendors and
support common tooling.

I hope the need for the vendor-specific tunnels goes away over time
and Linux can generically support the most common management tasks
with generic infrastructure.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2016-April/005493.html
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16  0:29           ` Dan Williams
@ 2017-02-16  0:45             ` Linda Knippers
  2017-02-16  2:03               ` Dan Williams
  0 siblings, 1 reply; 25+ messages in thread
From: Linda Knippers @ 2017-02-16  0:45 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

On 02/15/2017 07:29 PM, Dan Williams wrote:
> On Wed, Feb 15, 2017 at 3:19 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>> On 02/13/2017 01:30 PM, Dan Williams wrote:
>>> On Mon, Feb 13, 2017 at 10:17 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>> On 02/13/2017 12:28 PM, Dan Williams wrote:
>>>>> On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>>>> As it is today, we can't enable or test new functions in firmware without
>>>>>> changing the kernel.
>>>>>>
>>>>>> With this patch we allow function 0 for the HPE DSM families,
>>>>>> as is already allowed with the MS family. We now only restrict
>>>>>> the functions to the currently documented set if the module parameter
>>>>>> "disable_vendor_specific" is set.  Since the module parameter
>>>>>> description says "Limit commands to the publicly specified set",
>>>>>> this approach seems correct.
>>>>>>
>>>>>
>>>>> My concern is that this makes the kernel less strict by default. Given
>>>>> this is only a test capability lets add a module option to override
>>>>> the default dsm_mask.
>>>>
>>>> This part isn't strictly a test capability.  It's also to allow older
>>>> kernels to be supported with newer NVDIMM hardware, firmware, and management tools.
>>>> We shouldn't need a customer to update their production kernel just to support
>>>> an NVDIMM management tool.
>>>>
>>>> As for less secure by default, the default is to allow undocumented
>>>> functions of the "vendor specific" type.  How is the really different?
>>>>
>>>
>>> It's about pushing back on undocumented BIOS interfaces as much as
>>> possible.
>>
>> I'm not looking to propagate a bunch of undocumented interfaces.  I'm
>> looking to avoid having distros release new kernels and customers update
>> their systems because there's a new documented interface, especially when the only
>> thing that needs to change is the mask.  As you've experienced, the standardization
>> process takes a long time and I'm anticipating that between now and
>> eventual standardization, there will be at least one new DSM that's needed.
>> I'm sure we'll document it too, but that won't help a customer running an
>> older kernel.
>>
>>> The kernel has the documented set enabled by default,
>>> including the documented vendor-specifc function code.
>>
>> Ok, so if I update my document to say that undefined values are vendor-specific,
>> we're good?  Or shall I define the unused values that way explicitly?
>>
>>> There is the option to disable the vendor-specific tunnel to restrict the kernel to
>>> only allowing commands with publicly documented effects.
>>
>> My patch respects that option.
>>
>>> With a new "default_dsm_mask" parameter the default kernel policy can be
>>> overridden by the system owner.
>>
>> It can, although it means customers would have to add kernel parameters, which
>> they try to avoid.  I assume that's why the module parameter for a stricter
>> policy isn't the default since customers using Intel management tools would need to
>> add a parameter to allow the use of the vendor-specific function.
>>
>> I'm not trying to be difficult here, but I really don't see the difference
>> between a documented function with undocumented behavior and an
>> undocumented function with undocumented behavior.
> 
> The difference is that it is slightly more painful to require a kernel
> module override. Disabling these commands by default has some effect
> of pushing back on the creation of new interfaces. 

Intel's vendor-specific function, which can expose new interfaces,
requires no override.

> Customers are
> already prepared to need new kernels for new enabling. 

This may not be new enabling.  It could be hardware they already have
but there is a FW update or a management feature or some error recovery
that gets turned on.

> It's a similar coordination we do for new device-ids for drivers, but we still allow
> for overriding the default via the "new_id" interface in sysfs.
> 
> Look at the reaction we got from the community the last time this
> subject was brought up [1]. 

I actually think we handled that objection in the next few messages.

> That reaction stems from the historical
> divergence of storage management interfaces requiring Linux to have a
> vendor-specific toolkit per device. Requiring a kernel change to make
> a command supported by default is a good thing because it affords
> ongoing opportunities to find commonalities between vendors and
> support common tooling.

A good thing for who?  Not for the distro or the customer.  I don't
see how this is different than Intel adding new features through your generic
vendor-specific function, which would not require a kernel spin.

> I hope the need for the vendor-specific tunnels goes away over time
> and Linux can generically support the most common management tasks
> with generic infrastructure.

We totally agree.  In case you're wondering, we don't have a new
DSM queued up just waiting to be exposed by this patch.  However, cases
have come up where we have considered the need for a new DSM function
outside of pure test environments.  So far we have come up with other
approaches but at some point, the right approach will be to add a new DSM
function and I'd really like to not have to spin the kernel and all that
goes along with that when it happens. It's not good for customers.

-- ljk
> 
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2016-April/005493.html
> 

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16  0:45             ` Linda Knippers
@ 2017-02-16  2:03               ` Dan Williams
  2017-02-16 18:51                 ` Linda Knippers
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2017-02-16  2:03 UTC (permalink / raw)
  To: Linda Knippers; +Cc: linux-nvdimm

On Wed, Feb 15, 2017 at 4:45 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
> On 02/15/2017 07:29 PM, Dan Williams wrote:
>> On Wed, Feb 15, 2017 at 3:19 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>> On 02/13/2017 01:30 PM, Dan Williams wrote:
>>>> On Mon, Feb 13, 2017 at 10:17 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>>> On 02/13/2017 12:28 PM, Dan Williams wrote:
>>>>>> On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>>>>> As it is today, we can't enable or test new functions in firmware without
>>>>>>> changing the kernel.
>>>>>>>
>>>>>>> With this patch we allow function 0 for the HPE DSM families,
>>>>>>> as is already allowed with the MS family. We now only restrict
>>>>>>> the functions to the currently documented set if the module parameter
>>>>>>> "disable_vendor_specific" is set.  Since the module parameter
>>>>>>> description says "Limit commands to the publicly specified set",
>>>>>>> this approach seems correct.
>>>>>>>
>>>>>>
>>>>>> My concern is that this makes the kernel less strict by default. Given
>>>>>> this is only a test capability lets add a module option to override
>>>>>> the default dsm_mask.
>>>>>
>>>>> This part isn't strictly a test capability.  It's also to allow older
>>>>> kernels to be supported with newer NVDIMM hardware, firmware, and management tools.
>>>>> We shouldn't need a customer to update their production kernel just to support
>>>>> an NVDIMM management tool.
>>>>>
>>>>> As for less secure by default, the default is to allow undocumented
>>>>> functions of the "vendor specific" type.  How is the really different?
>>>>>
>>>>
>>>> It's about pushing back on undocumented BIOS interfaces as much as
>>>> possible.
>>>
>>> I'm not looking to propagate a bunch of undocumented interfaces.  I'm
>>> looking to avoid having distros release new kernels and customers update
>>> their systems because there's a new documented interface, especially when the only
>>> thing that needs to change is the mask.  As you've experienced, the standardization
>>> process takes a long time and I'm anticipating that between now and
>>> eventual standardization, there will be at least one new DSM that's needed.
>>> I'm sure we'll document it too, but that won't help a customer running an
>>> older kernel.
>>>
>>>> The kernel has the documented set enabled by default,
>>>> including the documented vendor-specifc function code.
>>>
>>> Ok, so if I update my document to say that undefined values are vendor-specific,
>>> we're good?  Or shall I define the unused values that way explicitly?
>>>
>>>> There is the option to disable the vendor-specific tunnel to restrict the kernel to
>>>> only allowing commands with publicly documented effects.
>>>
>>> My patch respects that option.
>>>
>>>> With a new "default_dsm_mask" parameter the default kernel policy can be
>>>> overridden by the system owner.
>>>
>>> It can, although it means customers would have to add kernel parameters, which
>>> they try to avoid.  I assume that's why the module parameter for a stricter
>>> policy isn't the default since customers using Intel management tools would need to
>>> add a parameter to allow the use of the vendor-specific function.
>>>
>>> I'm not trying to be difficult here, but I really don't see the difference
>>> between a documented function with undocumented behavior and an
>>> undocumented function with undocumented behavior.
>>
>> The difference is that it is slightly more painful to require a kernel
>> module override. Disabling these commands by default has some effect
>> of pushing back on the creation of new interfaces.
>
> Intel's vendor-specific function, which can expose new interfaces,
> requires no override.
>

Right, and NVDIMM_FAMILY_HPE2 has one too, and we have the kernel
option to disable all vendor specific passthroughs if an environment
only wants to run commands with publicly documented effects.

There's also nothing on the kernel side stopping any vendor from
implementing the "HPE2" or "INTEL" vendor specific interfaces in their
BIOS, the interfaces are not exclusive.

>> Customers are
>> already prepared to need new kernels for new enabling.
>
> This may not be new enabling.  It could be hardware they already have
> but there is a FW update or a management feature or some error recovery
> that gets turned on.
>

I consider new platform functionality hardware or firmware as new enabling.

>> It's a similar coordination we do for new device-ids for drivers, but we still allow
>> for overriding the default via the "new_id" interface in sysfs.
>>
>> Look at the reaction we got from the community the last time this
>> subject was brought up [1].
>
> I actually think we handled that objection in the next few messages.
>

I think we placated it with the strict checking, but the fact that the
kernel continues to have 4 nvdimm-family-types is still a point of
contention.

>> That reaction stems from the historical
>> divergence of storage management interfaces requiring Linux to have a
>> vendor-specific toolkit per device. Requiring a kernel change to make
>> a command supported by default is a good thing because it affords
>> ongoing opportunities to find commonalities between vendors and
>> support common tooling.
>
> A good thing for who?  Not for the distro or the customer.  I don't
> see how this is different than Intel adding new features through your generic
> vendor-specific function, which would not require a kernel spin.

A distro would rather carry one utility that handles multiple vendors.
The kernel would rather carry support code for one generic interface
than multiple interfaces.  A kernel spin is not required with the
proposed module parameter.

>> I hope the need for the vendor-specific tunnels goes away over time
>> and Linux can generically support the most common management tasks
>> with generic infrastructure.
>
> We totally agree.  In case you're wondering, we don't have a new
> DSM queued up just waiting to be exposed by this patch.  However, cases
> have come up where we have considered the need for a new DSM function
> outside of pure test environments.  So far we have come up with other
> approaches but at some point, the right approach will be to add a new DSM
> function and I'd really like to not have to spin the kernel and all that
> goes along with that when it happens. It's not good for customers.
>

I don't understand why a module option is unacceptable when that is
the proposed solution to the kernel picking the wrong "default"
family.

In fact, thinking about it further, I'd be more open to a patch that
allows overriding a DIMM's family via sysfs. That way we can
potentially have cross usage of these different interfaces and it's
less awkward for tooling to use than a module option.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16  2:03               ` Dan Williams
@ 2017-02-16 18:51                 ` Linda Knippers
  2017-02-16 19:35                   ` Dan Williams
  0 siblings, 1 reply; 25+ messages in thread
From: Linda Knippers @ 2017-02-16 18:51 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm



On 2/15/2017 9:03 PM, Dan Williams wrote:
> On Wed, Feb 15, 2017 at 4:45 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>> On 02/15/2017 07:29 PM, Dan Williams wrote:
>>> On Wed, Feb 15, 2017 at 3:19 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>> On 02/13/2017 01:30 PM, Dan Williams wrote:
>>>>> On Mon, Feb 13, 2017 at 10:17 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>>>> On 02/13/2017 12:28 PM, Dan Williams wrote:
>>>>>>> On Mon, Feb 13, 2017 at 8:27 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>>>>>> As it is today, we can't enable or test new functions in firmware without
>>>>>>>> changing the kernel.
>>>>>>>>
>>>>>>>> With this patch we allow function 0 for the HPE DSM families,
>>>>>>>> as is already allowed with the MS family. We now only restrict
>>>>>>>> the functions to the currently documented set if the module parameter
>>>>>>>> "disable_vendor_specific" is set.  Since the module parameter
>>>>>>>> description says "Limit commands to the publicly specified set",
>>>>>>>> this approach seems correct.
>>>>>>>>
>>>>>>>
>>>>>>> My concern is that this makes the kernel less strict by default. Given
>>>>>>> this is only a test capability lets add a module option to override
>>>>>>> the default dsm_mask.
>>>>>>
>>>>>> This part isn't strictly a test capability.  It's also to allow older
>>>>>> kernels to be supported with newer NVDIMM hardware, firmware, and management tools.
>>>>>> We shouldn't need a customer to update their production kernel just to support
>>>>>> an NVDIMM management tool.
>>>>>>
>>>>>> As for less secure by default, the default is to allow undocumented
>>>>>> functions of the "vendor specific" type.  How is the really different?
>>>>>>
>>>>>
>>>>> It's about pushing back on undocumented BIOS interfaces as much as
>>>>> possible.
>>>>
>>>> I'm not looking to propagate a bunch of undocumented interfaces.  I'm
>>>> looking to avoid having distros release new kernels and customers update
>>>> their systems because there's a new documented interface, especially when the only
>>>> thing that needs to change is the mask.  As you've experienced, the standardization
>>>> process takes a long time and I'm anticipating that between now and
>>>> eventual standardization, there will be at least one new DSM that's needed.
>>>> I'm sure we'll document it too, but that won't help a customer running an
>>>> older kernel.
>>>>
>>>>> The kernel has the documented set enabled by default,
>>>>> including the documented vendor-specifc function code.
>>>>
>>>> Ok, so if I update my document to say that undefined values are vendor-specific,
>>>> we're good?  Or shall I define the unused values that way explicitly?
>>>>
>>>>> There is the option to disable the vendor-specific tunnel to restrict the kernel to
>>>>> only allowing commands with publicly documented effects.
>>>>
>>>> My patch respects that option.
>>>>
>>>>> With a new "default_dsm_mask" parameter the default kernel policy can be
>>>>> overridden by the system owner.
>>>>
>>>> It can, although it means customers would have to add kernel parameters, which
>>>> they try to avoid.  I assume that's why the module parameter for a stricter
>>>> policy isn't the default since customers using Intel management tools would need to
>>>> add a parameter to allow the use of the vendor-specific function.
>>>>
>>>> I'm not trying to be difficult here, but I really don't see the difference
>>>> between a documented function with undocumented behavior and an
>>>> undocumented function with undocumented behavior.
>>>
>>> The difference is that it is slightly more painful to require a kernel
>>> module override. Disabling these commands by default has some effect
>>> of pushing back on the creation of new interfaces.
>>
>> Intel's vendor-specific function, which can expose new interfaces,
>> requires no override.
>>
>
> Right, and NVDIMM_FAMILY_HPE2 has one too, and we have the kernel
> option to disable all vendor specific passthroughs if an environment
> only wants to run commands with publicly documented effects.

Ok, I can add some to the HPE1 family too.  :-)

> There's also nothing on the kernel side stopping any vendor from
> implementing the "HPE2" or "INTEL" vendor specific interfaces in their
> BIOS, the interfaces are not exclusive.

I'm sure that many vendors will implement the INTEL interfaces when
there are devices using the 0x0201 RFIC.  However, those aren't sufficient
for NVDIMM-N devices.

As for HPE2, it may be short lived.  I'm not sure it's in the wild and
if it's not, perhaps we can remove it.  Still checking.

>>> Customers are
>>> already prepared to need new kernels for new enabling.
>>
>> This may not be new enabling.  It could be hardware they already have
>> but there is a FW update or a management feature or some error recovery
>> that gets turned on.
>>
>
> I consider new platform functionality hardware or firmware as new enabling.

A customer may not see it that way.

>>> It's a similar coordination we do for new device-ids for drivers, but we still allow
>>> for overriding the default via the "new_id" interface in sysfs.
>>>
>>> Look at the reaction we got from the community the last time this
>>> subject was brought up [1].
>>
>> I actually think we handled that objection in the next few messages.
>>
>
> I think we placated it with the strict checking, but the fact that the
> kernel continues to have 4 nvdimm-family-types is still a point of
> contention.

We will likely end up supporting a 5th one as we move to a standard.

>>> That reaction stems from the historical
>>> divergence of storage management interfaces requiring Linux to have a
>>> vendor-specific toolkit per device. Requiring a kernel change to make
>>> a command supported by default is a good thing because it affords
>>> ongoing opportunities to find commonalities between vendors and
>>> support common tooling.
>>
>> A good thing for who?  Not for the distro or the customer.  I don't
>> see how this is different than Intel adding new features through your generic
>> vendor-specific function, which would not require a kernel spin.
>
> A distro would rather carry one utility that handles multiple vendors.
> The kernel would rather carry support code for one generic interface
> than multiple interfaces.

We'd all like that - but we don't have that yet.

> A kernel spin is not required with the proposed module parameter.

True, but configuring boot options and a reboot are required.

>>> I hope the need for the vendor-specific tunnels goes away over time
>>> and Linux can generically support the most common management tasks
>>> with generic infrastructure.
>>
>> We totally agree.  In case you're wondering, we don't have a new
>> DSM queued up just waiting to be exposed by this patch.  However, cases
>> have come up where we have considered the need for a new DSM function
>> outside of pure test environments.  So far we have come up with other
>> approaches but at some point, the right approach will be to add a new DSM
>> function and I'd really like to not have to spin the kernel and all that
>> goes along with that when it happens. It's not good for customers.
>>
>
> I don't understand why a module option is unacceptable when that is
> the proposed solution to the kernel picking the wrong "default"
> family.

The difference is that picking the default family is primarily for
testing while the other case it's not.  Testers do all sorts of things
that customers won't like.

> In fact, thinking about it further, I'd be more open to a patch that
> allows overriding a DIMM's family via sysfs.

Sorry if this is a dumb question but will that work at boot time?  Today
there are DSMs that are called during initialization.  There could also
be DSMs called based on whether NFIT device flag bit 4 is set, although
we don't currently do that.

> That way we can
> potentially have cross usage of these different interfaces and it's
> less awkward for tooling to use than a module option.

For testing purposes, being able to switch DSM families without a reboot
would be really nice.  You could even expose the GUID and allow it, the
family, and the dsm mask to be overridden. Would be helpful when testing
that new standard DSM we all aspire to.

This approach would touch a lot more code and require a lot more testing
than my relatively simple patches because today these things are configured
at boot time rather than run time.  I wonder about ordering and consistency
checking.  For example, if someone changed the family would you automatically 
change the mask or is that two operations?  I assume you'd check
that the family is actually supported for the device?  With sysfs, different
devices could have different families, where with the module parameter option
there's one default that's applied to all devices for which that family is
supported.

At this point I'd probably prefer the simplicity of the module parameter
because I know I can do it and test it.  If the only way you'll take the
patch is to add a second parameter then I'll do that, but I still don't
see the point.

-- ljk
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 18:51                 ` Linda Knippers
@ 2017-02-16 19:35                   ` Dan Williams
  2017-02-16 20:13                     ` Linda Knippers
  2017-02-19 16:28                     ` Boaz Harrosh
  0 siblings, 2 replies; 25+ messages in thread
From: Dan Williams @ 2017-02-16 19:35 UTC (permalink / raw)
  To: Linda Knippers; +Cc: linux-nvdimm

On Thu, Feb 16, 2017 at 10:51 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>> Right, and NVDIMM_FAMILY_HPE2 has one too, and we have the kernel
>> option to disable all vendor specific passthroughs if an environment
>> only wants to run commands with publicly documented effects.
>
>
> Ok, I can add some to the HPE1 family too.  :-)

Sure, and then the pushback would be "no, just use HPE2 or INTEL". We
have two kernel-supported methods for sending an opaque
vendor-specific command. Why would the kernel need to support a third?

>> There's also nothing on the kernel side stopping any vendor from
>> implementing the "HPE2" or "INTEL" vendor specific interfaces in their
>> BIOS, the interfaces are not exclusive.
>
>
> I'm sure that many vendors will implement the INTEL interfaces when
> there are devices using the 0x0201 RFIC.  However, those aren't sufficient
> for NVDIMM-N devices.
>
> As for HPE2, it may be short lived.  I'm not sure it's in the wild and
> if it's not, perhaps we can remove it.  Still checking.

This is an example of why the kernel does not add kernel enabling for
future "maybe" cases. The fact that this patch wants to allow the
kernel to maybe support a command that doesn't exist yet is a clear
reason to say no.

>>>> Customers are
>>>> already prepared to need new kernels for new enabling.
>>>
>>>
>>> This may not be new enabling.  It could be hardware they already have
>>> but there is a FW update or a management feature or some error recovery
>>> that gets turned on.
>>>
>>
>> I consider new platform functionality hardware or firmware as new
>> enabling.
>
>
> A customer may not see it that way.
>
>>>> It's a similar coordination we do for new device-ids for drivers, but we
>>>> still allow
>>>> for overriding the default via the "new_id" interface in sysfs.
>>>>
>>>> Look at the reaction we got from the community the last time this
>>>> subject was brought up [1].
>>>
>>>
>>> I actually think we handled that objection in the next few messages.
>>>
>>
>> I think we placated it with the strict checking, but the fact that the
>> kernel continues to have 4 nvdimm-family-types is still a point of
>> contention.
>
>
> We will likely end up supporting a 5th one as we move to a standard.
>
>>>> That reaction stems from the historical
>>>> divergence of storage management interfaces requiring Linux to have a
>>>> vendor-specific toolkit per device. Requiring a kernel change to make
>>>> a command supported by default is a good thing because it affords
>>>> ongoing opportunities to find commonalities between vendors and
>>>> support common tooling.
>>>
>>>
>>> A good thing for who?  Not for the distro or the customer.  I don't
>>> see how this is different than Intel adding new features through your
>>> generic
>>> vendor-specific function, which would not require a kernel spin.
>>
>>
>> A distro would rather carry one utility that handles multiple vendors.
>> The kernel would rather carry support code for one generic interface
>> than multiple interfaces.
>
>
> We'd all like that - but we don't have that yet.
>
>> A kernel spin is not required with the proposed module parameter.
>
>
> True, but configuring boot options and a reboot are required.

Module options are specified at module load time so you don't
necessarily need a reboot. However, I'm now of the opinion that
allowing the family to be changed is a more complete solution to both
problems.

>>>> I hope the need for the vendor-specific tunnels goes away over time
>>>> and Linux can generically support the most common management tasks
>>>> with generic infrastructure.
>>>
>>>
>>> We totally agree.  In case you're wondering, we don't have a new
>>> DSM queued up just waiting to be exposed by this patch.  However, cases
>>> have come up where we have considered the need for a new DSM function
>>> outside of pure test environments.  So far we have come up with other
>>> approaches but at some point, the right approach will be to add a new DSM
>>> function and I'd really like to not have to spin the kernel and all that
>>> goes along with that when it happens. It's not good for customers.
>>>
>>
>> I don't understand why a module option is unacceptable when that is
>> the proposed solution to the kernel picking the wrong "default"
>> family.
>
>
> The difference is that picking the default family is primarily for
> testing while the other case it's not.  Testers do all sorts of things
> that customers won't like.
>
>> In fact, thinking about it further, I'd be more open to a patch that
>> allows overriding a DIMM's family via sysfs.
>
>
> Sorry if this is a dumb question but will that work at boot time?  Today
> there are DSMs that are called during initialization.

The only DIMM-level DSMs that are called during initialization are
label management, those aren't the DSMs we are talking about here.

> There could also
> be DSMs called based on whether NFIT device flag bit 4 is set, although
> we don't currently do that.

Sorry, I'm not sure to which NFIT bit this is referring? I'm not sure
it matters if the kernel is never going to initiate and consume the
result of the DSM like it does label data.

>> That way we can
>> potentially have cross usage of these different interfaces and it's
>> less awkward for tooling to use than a module option.
>
>
> For testing purposes, being able to switch DSM families without a reboot
> would be really nice.  You could even expose the GUID and allow it, the
> family, and the dsm mask to be overridden. Would be helpful when testing
> that new standard DSM we all aspire to.
>
> This approach would touch a lot more code and require a lot more testing
> than my relatively simple patches because today these things are configured
> at boot time rather than run time.  I wonder about ordering and consistency
> checking.  For example, if someone changed the family would you
> automatically change the mask or is that two operations?  I assume you'd
> check
> that the family is actually supported for the device?  With sysfs, different
> devices could have different families, where with the module parameter
> option
> there's one default that's applied to all devices for which that family is
> supported.
>
> At this point I'd probably prefer the simplicity of the module parameter
> because I know I can do it and test it.  If the only way you'll take the
> patch is to add a second parameter then I'll do that, but I still don't
> see the point.

Let's do the work to allow the family to be switched so that tooling
can override the kernel default, because it does seem valid for a DIMM
to support multiple DSM-family types and that gives more than one path
to use a vendor specific passthrough.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 19:35                   ` Dan Williams
@ 2017-02-16 20:13                     ` Linda Knippers
  2017-02-16 20:48                       ` Dan Williams
  2017-02-19 16:28                     ` Boaz Harrosh
  1 sibling, 1 reply; 25+ messages in thread
From: Linda Knippers @ 2017-02-16 20:13 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

On 02/16/2017 02:35 PM, Dan Williams wrote:
> On Thu, Feb 16, 2017 at 10:51 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>> Right, and NVDIMM_FAMILY_HPE2 has one too, and we have the kernel
>>> option to disable all vendor specific passthroughs if an environment
>>> only wants to run commands with publicly documented effects.
>>
>>
>> Ok, I can add some to the HPE1 family too.  :-)
> 
> Sure, and then the pushback would be "no, just use HPE2 or INTEL". We
> have two kernel-supported methods for sending an opaque
> vendor-specific command. Why would the kernel need to support a third?

I can't use Intel's because I have an NVDIMM-N and your spec is not
for NVDIMM-N.

>>> There's also nothing on the kernel side stopping any vendor from
>>> implementing the "HPE2" or "INTEL" vendor specific interfaces in their
>>> BIOS, the interfaces are not exclusive.
>>
>>
>> I'm sure that many vendors will implement the INTEL interfaces when
>> there are devices using the 0x0201 RFIC.  However, those aren't sufficient
>> for NVDIMM-N devices.
>>
>> As for HPE2, it may be short lived.  I'm not sure it's in the wild and
>> if it's not, perhaps we can remove it.  Still checking.
> 
> This is an example of why the kernel does not add kernel enabling for
> future "maybe" cases. The fact that this patch wants to allow the
> kernel to maybe support a command that doesn't exist yet is a clear
> reason to say no.

The kernel often has code for devices that aren't available yet,
like NVDIMMs with 0x201. Sometimes things change or eventually get deprecated.
The Intel DSM spec has changed over time as well. But as I said, I'm still checking.

> 
>>>>> Customers are
>>>>> already prepared to need new kernels for new enabling.
>>>>
>>>>
>>>> This may not be new enabling.  It could be hardware they already have
>>>> but there is a FW update or a management feature or some error recovery
>>>> that gets turned on.
>>>>
>>>
>>> I consider new platform functionality hardware or firmware as new
>>> enabling.
>>
>>
>> A customer may not see it that way.
>>
>>>>> It's a similar coordination we do for new device-ids for drivers, but we
>>>>> still allow
>>>>> for overriding the default via the "new_id" interface in sysfs.
>>>>>
>>>>> Look at the reaction we got from the community the last time this
>>>>> subject was brought up [1].
>>>>
>>>>
>>>> I actually think we handled that objection in the next few messages.
>>>>
>>>
>>> I think we placated it with the strict checking, but the fact that the
>>> kernel continues to have 4 nvdimm-family-types is still a point of
>>> contention.
>>
>>
>> We will likely end up supporting a 5th one as we move to a standard.
>>
>>>>> That reaction stems from the historical
>>>>> divergence of storage management interfaces requiring Linux to have a
>>>>> vendor-specific toolkit per device. Requiring a kernel change to make
>>>>> a command supported by default is a good thing because it affords
>>>>> ongoing opportunities to find commonalities between vendors and
>>>>> support common tooling.
>>>>
>>>>
>>>> A good thing for who?  Not for the distro or the customer.  I don't
>>>> see how this is different than Intel adding new features through your
>>>> generic
>>>> vendor-specific function, which would not require a kernel spin.
>>>
>>>
>>> A distro would rather carry one utility that handles multiple vendors.
>>> The kernel would rather carry support code for one generic interface
>>> than multiple interfaces.
>>
>>
>> We'd all like that - but we don't have that yet.
>>
>>> A kernel spin is not required with the proposed module parameter.
>>
>>
>> True, but configuring boot options and a reboot are required.
> 
> Module options are specified at module load time so you don't
> necessarily need a reboot. 

Reloading nfit has been a problem for me with dependencies between
other modules but perhaps I've just not gotten the sequence right.
Does it work for you?

> However, I'm now of the opinion that
> allowing the family to be changed is a more complete solution to both
> problems.
> 
>>>>> I hope the need for the vendor-specific tunnels goes away over time
>>>>> and Linux can generically support the most common management tasks
>>>>> with generic infrastructure.
>>>>
>>>>
>>>> We totally agree.  In case you're wondering, we don't have a new
>>>> DSM queued up just waiting to be exposed by this patch.  However, cases
>>>> have come up where we have considered the need for a new DSM function
>>>> outside of pure test environments.  So far we have come up with other
>>>> approaches but at some point, the right approach will be to add a new DSM
>>>> function and I'd really like to not have to spin the kernel and all that
>>>> goes along with that when it happens. It's not good for customers.
>>>>
>>>
>>> I don't understand why a module option is unacceptable when that is
>>> the proposed solution to the kernel picking the wrong "default"
>>> family.
>>
>>
>> The difference is that picking the default family is primarily for
>> testing while the other case it's not.  Testers do all sorts of things
>> that customers won't like.
>>
>>> In fact, thinking about it further, I'd be more open to a patch that
>>> allows overriding a DIMM's family via sysfs.
>>
>>
>> Sorry if this is a dumb question but will that work at boot time?  Today
>> there are DSMs that are called during initialization.
> 
> The only DIMM-level DSMs that are called during initialization are
> label management, those aren't the DSMs we are talking about here.
> 
>> There could also
>> be DSMs called based on whether NFIT device flag bit 4 is set, although
>> we don't currently do that.
> 
> Sorry, I'm not sure to which NFIT bit this is referring? I'm not sure
> it matters if the kernel is never going to initiate and consume the
> result of the DSM like it does label data.

There's a state flag that says there was a health status change prior
to the OSPM handoff.  We could eventually want to call that to know more
about whether the NVDIMM is viable for use as there is more information there
than in the basic state flags.  We don't do that today though, just
thinking about the future.

> 
>>> That way we can
>>> potentially have cross usage of these different interfaces and it's
>>> less awkward for tooling to use than a module option.
>>
>>
>> For testing purposes, being able to switch DSM families without a reboot
>> would be really nice.  You could even expose the GUID and allow it, the
>> family, and the dsm mask to be overridden. Would be helpful when testing
>> that new standard DSM we all aspire to.
>>
>> This approach would touch a lot more code and require a lot more testing
>> than my relatively simple patches because today these things are configured
>> at boot time rather than run time.  I wonder about ordering and consistency
>> checking.  For example, if someone changed the family would you
>> automatically change the mask or is that two operations?  I assume you'd
>> check
>> that the family is actually supported for the device?  With sysfs, different
>> devices could have different families, where with the module parameter
>> option
>> there's one default that's applied to all devices for which that family is
>> supported.
>>
>> At this point I'd probably prefer the simplicity of the module parameter
>> because I know I can do it and test it.  If the only way you'll take the
>> patch is to add a second parameter then I'll do that, but I still don't
>> see the point.
> 
> Let's do the work to allow the family to be switched so that tooling
> can override the kernel default, because it does seem valid for a DIMM
> to support multiple DSM-family types and that gives more than one path
> to use a vendor specific passthrough.

I am fairly certain that for production use we wouldn't expect management
tools to use more than one DSM family at a time. Tools using DSMs
from one family wouldn't switch to using a new family for one operation
and then switch back for another operation.

We haven't talked about the 4th DSM family and what their plans are but
I won't be shocked if they create new functions.  That family is very
hardware specific and hardware specs also evolve.

-- ljk

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 20:13                     ` Linda Knippers
@ 2017-02-16 20:48                       ` Dan Williams
  2017-02-16 21:07                         ` Dan Williams
  2017-02-16 22:33                         ` Linda Knippers
  0 siblings, 2 replies; 25+ messages in thread
From: Dan Williams @ 2017-02-16 20:48 UTC (permalink / raw)
  To: Linda Knippers; +Cc: linux-nvdimm

On Thu, Feb 16, 2017 at 12:13 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
> On 02/16/2017 02:35 PM, Dan Williams wrote:
>> On Thu, Feb 16, 2017 at 10:51 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>> Right, and NVDIMM_FAMILY_HPE2 has one too, and we have the kernel
>>>> option to disable all vendor specific passthroughs if an environment
>>>> only wants to run commands with publicly documented effects.
>>>
>>>
>>> Ok, I can add some to the HPE1 family too.  :-)
>>
>> Sure, and then the pushback would be "no, just use HPE2 or INTEL". We
>> have two kernel-supported methods for sending an opaque
>> vendor-specific command. Why would the kernel need to support a third?
>
> I can't use Intel's because I have an NVDIMM-N and your spec is not
> for NVDIMM-N.

I don't understand this comment in the context of a vendor specific
tunnel. There's nothing NVDIMM-N specific about the DSM payload
format. I'm saying override the default family temporarily just to use
the vendor-specific tunnel, not implement the full "INTEL" definition.
This can be used as a stop-gap when you find you need to support a new
command on a legacy kernel.

>>>> There's also nothing on the kernel side stopping any vendor from
>>>> implementing the "HPE2" or "INTEL" vendor specific interfaces in their
>>>> BIOS, the interfaces are not exclusive.
>>>
>>>
>>> I'm sure that many vendors will implement the INTEL interfaces when
>>> there are devices using the 0x0201 RFIC.  However, those aren't sufficient
>>> for NVDIMM-N devices.
>>>
>>> As for HPE2, it may be short lived.  I'm not sure it's in the wild and
>>> if it's not, perhaps we can remove it.  Still checking.
>>
>> This is an example of why the kernel does not add kernel enabling for
>> future "maybe" cases. The fact that this patch wants to allow the
>> kernel to maybe support a command that doesn't exist yet is a clear
>> reason to say no.
>
> The kernel often has code for devices that aren't available yet,
> like NVDIMMs with 0x201. Sometimes things change or eventually get deprecated.
> The Intel DSM spec has changed over time as well. But as I said, I'm still checking.

I'm not sure that's a good example because block-apertures are
described by ACPI, and the kernel has no specific enabling around the
format-interface-code. Whereas this patch to expand the dsm_mask is
planning ahead for DSMs that may not ever be defined.

[..]
>> Module options are specified at module load time so you don't
>> necessarily need a reboot.
>
> Reloading nfit has been a problem for me with dependencies between
> other modules but perhaps I've just not gotten the sequence right.
> Does it work for you?

Yes:

    ndctl disable-region all
    modprobe -r nfit

It's best to unmount any filesystems mounted on nvdimm namespaces
before disabling regions. There's an ongoing effort to make if safe
[1], but not all filesystems are prepared for their backing device to
disappear.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2016-January/003797.html

>> However, I'm now of the opinion that
>> allowing the family to be changed is a more complete solution to both
>> problems.
>>
>>>>>> I hope the need for the vendor-specific tunnels goes away over time
>>>>>> and Linux can generically support the most common management tasks
>>>>>> with generic infrastructure.
>>>>>
>>>>>
>>>>> We totally agree.  In case you're wondering, we don't have a new
>>>>> DSM queued up just waiting to be exposed by this patch.  However, cases
>>>>> have come up where we have considered the need for a new DSM function
>>>>> outside of pure test environments.  So far we have come up with other
>>>>> approaches but at some point, the right approach will be to add a new DSM
>>>>> function and I'd really like to not have to spin the kernel and all that
>>>>> goes along with that when it happens. It's not good for customers.
>>>>>
>>>>
>>>> I don't understand why a module option is unacceptable when that is
>>>> the proposed solution to the kernel picking the wrong "default"
>>>> family.
>>>
>>>
>>> The difference is that picking the default family is primarily for
>>> testing while the other case it's not.  Testers do all sorts of things
>>> that customers won't like.
>>>
>>>> In fact, thinking about it further, I'd be more open to a patch that
>>>> allows overriding a DIMM's family via sysfs.
>>>
>>>
>>> Sorry if this is a dumb question but will that work at boot time?  Today
>>> there are DSMs that are called during initialization.
>>
>> The only DIMM-level DSMs that are called during initialization are
>> label management, those aren't the DSMs we are talking about here.
>>
>>> There could also
>>> be DSMs called based on whether NFIT device flag bit 4 is set, although
>>> we don't currently do that.
>>
>> Sorry, I'm not sure to which NFIT bit this is referring? I'm not sure
>> it matters if the kernel is never going to initiate and consume the
>> result of the DSM like it does label data.
>
> There's a state flag that says there was a health status change prior
> to the OSPM handoff.  We could eventually want to call that to know more
> about whether the NVDIMM is viable for use as there is more information there
> than in the basic state flags.  We don't do that today though, just
> thinking about the future.

The kernel passes that status and any health-event notification to
userspace. The kernel doesn't consume it directly because the health
status payload is not standardized across vendors, and it keeps the
kernel focused on the mechanism while the policy of what to do with a
health event is a userspace concern. This is the same scheme the
kernel has for handling health status alerts from disks.

>>>> That way we can
>>>> potentially have cross usage of these different interfaces and it's
>>>> less awkward for tooling to use than a module option.
>>>
>>>
>>> For testing purposes, being able to switch DSM families without a reboot
>>> would be really nice.  You could even expose the GUID and allow it, the
>>> family, and the dsm mask to be overridden. Would be helpful when testing
>>> that new standard DSM we all aspire to.
>>>
>>> This approach would touch a lot more code and require a lot more testing
>>> than my relatively simple patches because today these things are configured
>>> at boot time rather than run time.  I wonder about ordering and consistency
>>> checking.  For example, if someone changed the family would you
>>> automatically change the mask or is that two operations?  I assume you'd
>>> check
>>> that the family is actually supported for the device?  With sysfs, different
>>> devices could have different families, where with the module parameter
>>> option
>>> there's one default that's applied to all devices for which that family is
>>> supported.
>>>
>>> At this point I'd probably prefer the simplicity of the module parameter
>>> because I know I can do it and test it.  If the only way you'll take the
>>> patch is to add a second parameter then I'll do that, but I still don't
>>> see the point.
>>
>> Let's do the work to allow the family to be switched so that tooling
>> can override the kernel default, because it does seem valid for a DIMM
>> to support multiple DSM-family types and that gives more than one path
>> to use a vendor specific passthrough.
>
> I am fairly certain that for production use we wouldn't expect management
> tools to use more than one DSM family at a time. Tools using DSMs
> from one family wouldn't switch to using a new family for one operation
> and then switch back for another operation.

If the kernel makes it safe I don't see why this is a blocker,
especially if the goal is to reduce interface proliferation over time.

> We haven't talked about the 4th DSM family and what their plans are but
> I won't be shocked if they create new functions.  That family is very
> hardware specific and hardware specs also evolve.

Yes, and we can add support for new functions when and if they define
them, not in advance.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 20:48                       ` Dan Williams
@ 2017-02-16 21:07                         ` Dan Williams
  2017-02-16 22:40                           ` Linda Knippers
  2017-02-16 22:33                         ` Linda Knippers
  1 sibling, 1 reply; 25+ messages in thread
From: Dan Williams @ 2017-02-16 21:07 UTC (permalink / raw)
  To: Linda Knippers; +Cc: linux-nvdimm

On Thu, Feb 16, 2017 at 12:48 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Thu, Feb 16, 2017 at 12:13 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
[..]
>> I am fairly certain that for production use we wouldn't expect management
>> tools to use more than one DSM family at a time. Tools using DSMs
>> from one family wouldn't switch to using a new family for one operation
>> and then switch back for another operation.
>
> If the kernel makes it safe I don't see why this is a blocker,
> especially if the goal is to reduce interface proliferation over time.

In fact, we can make it even simpler than dynamically changing family,
just have ND_IOCTL_VENDOR always route to uuid
""4309ac30-0d11-11e4-9191-0800200c9a66" and function number 9. Then
that becomes the de facto method for Linux to issue a vendor-specific
command for any DIMM.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 20:48                       ` Dan Williams
  2017-02-16 21:07                         ` Dan Williams
@ 2017-02-16 22:33                         ` Linda Knippers
  1 sibling, 0 replies; 25+ messages in thread
From: Linda Knippers @ 2017-02-16 22:33 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm



On 2/16/2017 3:48 PM, Dan Williams wrote:
> On Thu, Feb 16, 2017 at 12:13 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>> On 02/16/2017 02:35 PM, Dan Williams wrote:
>>> On Thu, Feb 16, 2017 at 10:51 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>>> Right, and NVDIMM_FAMILY_HPE2 has one too, and we have the kernel
>>>>> option to disable all vendor specific passthroughs if an environment
>>>>> only wants to run commands with publicly documented effects.
>>>>
>>>>
>>>> Ok, I can add some to the HPE1 family too.  :-)
>>>
>>> Sure, and then the pushback would be "no, just use HPE2 or INTEL". We
>>> have two kernel-supported methods for sending an opaque
>>> vendor-specific command. Why would the kernel need to support a third?
>>
>> I can't use Intel's because I have an NVDIMM-N and your spec is not
>> for NVDIMM-N.
>
> I don't understand this comment in the context of a vendor specific
> tunnel. There's nothing NVDIMM-N specific about the DSM payload
> format. I'm saying override the default family temporarily just to use
> the vendor-specific tunnel, not implement the full "INTEL" definition.
> This can be used as a stop-gap when you find you need to support a new
> command on a legacy kernel.

That presumes that the platform FW supports the INTEL DSM.  Today's
platforms that support NVDIMM-N do not.  It is quite likely that
even when platform do support NVDIMMs with RFIC 0x201, FW wouldn't
expose the DSM family if the hardware isn't actually installed.

>>>>> There's also nothing on the kernel side stopping any vendor from
>>>>> implementing the "HPE2" or "INTEL" vendor specific interfaces in their
>>>>> BIOS, the interfaces are not exclusive.
>>>>
>>>>
>>>> I'm sure that many vendors will implement the INTEL interfaces when
>>>> there are devices using the 0x0201 RFIC.  However, those aren't sufficient
>>>> for NVDIMM-N devices.
>>>>
>>>> As for HPE2, it may be short lived.  I'm not sure it's in the wild and
>>>> if it's not, perhaps we can remove it.  Still checking.
>>>
>>> This is an example of why the kernel does not add kernel enabling for
>>> future "maybe" cases. The fact that this patch wants to allow the
>>> kernel to maybe support a command that doesn't exist yet is a clear
>>> reason to say no.
>>
>> The kernel often has code for devices that aren't available yet,
>> like NVDIMMs with 0x201. Sometimes things change or eventually get deprecated.
>> The Intel DSM spec has changed over time as well. But as I said, I'm still checking.
>
> I'm not sure that's a good example because block-apertures are
> described by ACPI, and the kernel has no specific enabling around the
> format-interface-code. Whereas this patch to expand the dsm_mask is
> planning ahead for DSMs that may not ever be defined.

My point was that things change.  They sometimes change before hardware is
shipping.  That's why your spec is up to version 1.3 and still changing.
I'm just trying to acknowledge that change happens and plan for it.

BTW, we do have HPE2 in the wild so never mind about removing it.
>
> [..]
>>> Module options are specified at module load time so you don't
>>> necessarily need a reboot.
>>
>> Reloading nfit has been a problem for me with dependencies between
>> other modules but perhaps I've just not gotten the sequence right.
>> Does it work for you?
>
> Yes:
>
>     ndctl disable-region all
>     modprobe -r nfit

Thanks!

> It's best to unmount any filesystems mounted on nvdimm namespaces
> before disabling regions. There's an ongoing effort to make if safe
> [1], but not all filesystems are prepared for their backing device to
> disappear.
>
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2016-January/003797.html
>
>>> However, I'm now of the opinion that
>>> allowing the family to be changed is a more complete solution to both
>>> problems.
>>>
>>>>>>> I hope the need for the vendor-specific tunnels goes away over time
>>>>>>> and Linux can generically support the most common management tasks
>>>>>>> with generic infrastructure.
>>>>>>
>>>>>>
>>>>>> We totally agree.  In case you're wondering, we don't have a new
>>>>>> DSM queued up just waiting to be exposed by this patch.  However, cases
>>>>>> have come up where we have considered the need for a new DSM function
>>>>>> outside of pure test environments.  So far we have come up with other
>>>>>> approaches but at some point, the right approach will be to add a new DSM
>>>>>> function and I'd really like to not have to spin the kernel and all that
>>>>>> goes along with that when it happens. It's not good for customers.
>>>>>>
>>>>>
>>>>> I don't understand why a module option is unacceptable when that is
>>>>> the proposed solution to the kernel picking the wrong "default"
>>>>> family.
>>>>
>>>>
>>>> The difference is that picking the default family is primarily for
>>>> testing while the other case it's not.  Testers do all sorts of things
>>>> that customers won't like.
>>>>
>>>>> In fact, thinking about it further, I'd be more open to a patch that
>>>>> allows overriding a DIMM's family via sysfs.
>>>>
>>>>
>>>> Sorry if this is a dumb question but will that work at boot time?  Today
>>>> there are DSMs that are called during initialization.
>>>
>>> The only DIMM-level DSMs that are called during initialization are
>>> label management, those aren't the DSMs we are talking about here.
>>>
>>>> There could also
>>>> be DSMs called based on whether NFIT device flag bit 4 is set, although
>>>> we don't currently do that.
>>>
>>> Sorry, I'm not sure to which NFIT bit this is referring? I'm not sure
>>> it matters if the kernel is never going to initiate and consume the
>>> result of the DSM like it does label data.
>>
>> There's a state flag that says there was a health status change prior
>> to the OSPM handoff.  We could eventually want to call that to know more
>> about whether the NVDIMM is viable for use as there is more information there
>> than in the basic state flags.  We don't do that today though, just
>> thinking about the future.
>
> The kernel passes that status and any health-event notification to
> userspace. The kernel doesn't consume it directly because the health
> status payload is not standardized across vendors, and it keeps the
> kernel focused on the mechanism while the policy of what to do with a
> health event is a userspace concern. This is the same scheme the
> kernel has for handling health status alerts from disks.
>
>>>>> That way we can
>>>>> potentially have cross usage of these different interfaces and it's
>>>>> less awkward for tooling to use than a module option.
>>>>
>>>>
>>>> For testing purposes, being able to switch DSM families without a reboot
>>>> would be really nice.  You could even expose the GUID and allow it, the
>>>> family, and the dsm mask to be overridden. Would be helpful when testing
>>>> that new standard DSM we all aspire to.
>>>>
>>>> This approach would touch a lot more code and require a lot more testing
>>>> than my relatively simple patches because today these things are configured
>>>> at boot time rather than run time.  I wonder about ordering and consistency
>>>> checking.  For example, if someone changed the family would you
>>>> automatically change the mask or is that two operations?  I assume you'd
>>>> check
>>>> that the family is actually supported for the device?  With sysfs, different
>>>> devices could have different families, where with the module parameter
>>>> option
>>>> there's one default that's applied to all devices for which that family is
>>>> supported.
>>>>
>>>> At this point I'd probably prefer the simplicity of the module parameter
>>>> because I know I can do it and test it.  If the only way you'll take the
>>>> patch is to add a second parameter then I'll do that, but I still don't
>>>> see the point.
>>>
>>> Let's do the work to allow the family to be switched so that tooling
>>> can override the kernel default, because it does seem valid for a DIMM
>>> to support multiple DSM-family types and that gives more than one path
>>> to use a vendor specific passthrough.
>>
>> I am fairly certain that for production use we wouldn't expect management
>> tools to use more than one DSM family at a time. Tools using DSMs
>> from one family wouldn't switch to using a new family for one operation
>> and then switch back for another operation.
>
> If the kernel makes it safe I don't see why this is a blocker,
> especially if the goal is to reduce interface proliferation over time.

See my earlier comments.  It would require more than the kernel
making it safe.  It gets even more complicated when a platform supports
an 0x201 NVDIMM and an NVDIMM-N at the same time.

We all share the goal of reducing interface proliferation.  We disagree
on the enforcement while we define more complete standards.

>> We haven't talked about the 4th DSM family and what their plans are but
>> I won't be shocked if they create new functions.  That family is very
>> hardware specific and hardware specs also evolve.
>
> Yes, and we can add support for new functions when and if they define
> them, not in advance.

The problem is timing.  New FW and tools can be in customers hands much
more quickly than an OS release.  If a vendor documents a function today
and you pull in a patch adding the bit tomorrow, it could still be a year
or more before that patch makes it into the distro that the customer is running.
That is the fundamental issue that I'm trying to address.

--ljk
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 21:07                         ` Dan Williams
@ 2017-02-16 22:40                           ` Linda Knippers
  2017-02-16 22:43                             ` Dan Williams
  0 siblings, 1 reply; 25+ messages in thread
From: Linda Knippers @ 2017-02-16 22:40 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm



On 2/16/2017 4:07 PM, Dan Williams wrote:
> On Thu, Feb 16, 2017 at 12:48 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> On Thu, Feb 16, 2017 at 12:13 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
> [..]
>>> I am fairly certain that for production use we wouldn't expect management
>>> tools to use more than one DSM family at a time. Tools using DSMs
>>> from one family wouldn't switch to using a new family for one operation
>>> and then switch back for another operation.
>>
>> If the kernel makes it safe I don't see why this is a blocker,
>> especially if the goal is to reduce interface proliferation over time.
>
> In fact, we can make it even simpler than dynamically changing family,
> just have ND_IOCTL_VENDOR always route to uuid
> ""4309ac30-0d11-11e4-9191-0800200c9a66" and function number 9. Then
> that becomes the de facto method for Linux to issue a vendor-specific
> command for any DIMM.

Several problems.
That family is documented for RFIC 0x0201.
No NVDIMM-N platform supports that DSM family.
When platforms do support 0x0201, some will also support NVDIMM-N.  Since 
opcodes are not documented, how do would they not step on each other?

I don't think that works at all.

-- ljk


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 22:40                           ` Linda Knippers
@ 2017-02-16 22:43                             ` Dan Williams
  2017-02-16 22:47                               ` Linda Knippers
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2017-02-16 22:43 UTC (permalink / raw)
  To: Linda Knippers; +Cc: linux-nvdimm

On Thu, Feb 16, 2017 at 2:40 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>
>
> On 2/16/2017 4:07 PM, Dan Williams wrote:
>>
>> On Thu, Feb 16, 2017 at 12:48 PM, Dan Williams <dan.j.williams@intel.com>
>> wrote:
>>>
>>> On Thu, Feb 16, 2017 at 12:13 PM, Linda Knippers <linda.knippers@hpe.com>
>>> wrote:
>>
>> [..]
>>>>
>>>> I am fairly certain that for production use we wouldn't expect
>>>> management
>>>> tools to use more than one DSM family at a time. Tools using DSMs
>>>> from one family wouldn't switch to using a new family for one operation
>>>> and then switch back for another operation.
>>>
>>>
>>> If the kernel makes it safe I don't see why this is a blocker,
>>> especially if the goal is to reduce interface proliferation over time.
>>
>>
>> In fact, we can make it even simpler than dynamically changing family,
>> just have ND_IOCTL_VENDOR always route to uuid
>> ""4309ac30-0d11-11e4-9191-0800200c9a66" and function number 9. Then
>> that becomes the de facto method for Linux to issue a vendor-specific
>> command for any DIMM.
>
>
> Several problems.
> That family is documented for RFIC 0x0201.
> No NVDIMM-N platform supports that DSM family.
> When platforms do support 0x0201, some will also support NVDIMM-N.  Since
> opcodes are not documented, how do would they not step on each other?
>
> I don't think that works at all.
>

I'm confused there is no correlation between FIC and DSM family, at
least not one Linux enforces.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 22:43                             ` Dan Williams
@ 2017-02-16 22:47                               ` Linda Knippers
  2017-02-16 22:49                                 ` Dan Williams
  0 siblings, 1 reply; 25+ messages in thread
From: Linda Knippers @ 2017-02-16 22:47 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm



On 2/16/2017 5:43 PM, Dan Williams wrote:
> On Thu, Feb 16, 2017 at 2:40 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>
>>
>> On 2/16/2017 4:07 PM, Dan Williams wrote:
>>>
>>> On Thu, Feb 16, 2017 at 12:48 PM, Dan Williams <dan.j.williams@intel.com>
>>> wrote:
>>>>
>>>> On Thu, Feb 16, 2017 at 12:13 PM, Linda Knippers <linda.knippers@hpe.com>
>>>> wrote:
>>>
>>> [..]
>>>>>
>>>>> I am fairly certain that for production use we wouldn't expect
>>>>> management
>>>>> tools to use more than one DSM family at a time. Tools using DSMs
>>>>> from one family wouldn't switch to using a new family for one operation
>>>>> and then switch back for another operation.
>>>>
>>>>
>>>> If the kernel makes it safe I don't see why this is a blocker,
>>>> especially if the goal is to reduce interface proliferation over time.
>>>
>>>
>>> In fact, we can make it even simpler than dynamically changing family,
>>> just have ND_IOCTL_VENDOR always route to uuid
>>> ""4309ac30-0d11-11e4-9191-0800200c9a66" and function number 9. Then
>>> that becomes the de facto method for Linux to issue a vendor-specific
>>> command for any DIMM.
>>
>>
>> Several problems.
>> That family is documented for RFIC 0x0201.
>> No NVDIMM-N platform supports that DSM family.
>> When platforms do support 0x0201, some will also support NVDIMM-N.  Since
>> opcodes are not documented, how do would they not step on each other?
>>
>> I don't think that works at all.
>>
>
> I'm confused there is no correlation between FIC and DSM family,

Page 7 of version 1.3 of your DSM spec says:

Platforms that have the _DSM interface implemented, as outlined in this section, 
can support a NVDIMM region with Region Format Interface Code (RFIC) of 0x0201.

Other DSM documents reference RFIC 0x0101.

> at least not one Linux enforces.

That is true, but FW developers aren't being as loose.

-- ljk
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 22:47                               ` Linda Knippers
@ 2017-02-16 22:49                                 ` Dan Williams
  2017-02-16 22:52                                   ` Linda Knippers
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2017-02-16 22:49 UTC (permalink / raw)
  To: Linda Knippers; +Cc: linux-nvdimm

On Thu, Feb 16, 2017 at 2:47 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>> Several problems.
>>> That family is documented for RFIC 0x0201.
>>> No NVDIMM-N platform supports that DSM family.
>>> When platforms do support 0x0201, some will also support NVDIMM-N.  Since
>>> opcodes are not documented, how do would they not step on each other?
>>>
>>> I don't think that works at all.
>>>
>>
>> I'm confused there is no correlation between FIC and DSM family,
>
>
> Page 7 of version 1.3 of your DSM spec says:
>
> Platforms that have the _DSM interface implemented, as outlined in this
> section, can support a NVDIMM region with Region Format Interface Code
> (RFIC) of 0x0201.
>
> Other DSM documents reference RFIC 0x0101.
>
>> at least not one Linux enforces.
>
>
> That is true, but FW developers aren't being as loose.

Yes, they are. We have 3 0x101 DSM families.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 22:49                                 ` Dan Williams
@ 2017-02-16 22:52                                   ` Linda Knippers
  0 siblings, 0 replies; 25+ messages in thread
From: Linda Knippers @ 2017-02-16 22:52 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm



On 2/16/2017 5:49 PM, Dan Williams wrote:
> On Thu, Feb 16, 2017 at 2:47 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>> Several problems.
>>>> That family is documented for RFIC 0x0201.
>>>> No NVDIMM-N platform supports that DSM family.
>>>> When platforms do support 0x0201, some will also support NVDIMM-N.  Since
>>>> opcodes are not documented, how do would they not step on each other?
>>>>
>>>> I don't think that works at all.
>>>>
>>>
>>> I'm confused there is no correlation between FIC and DSM family,
>>
>>
>> Page 7 of version 1.3 of your DSM spec says:
>>
>> Platforms that have the _DSM interface implemented, as outlined in this
>> section, can support a NVDIMM region with Region Format Interface Code
>> (RFIC) of 0x0201.
>>
>> Other DSM documents reference RFIC 0x0101.
>>
>>> at least not one Linux enforces.
>>
>>
>> That is true, but FW developers aren't being as loose.
>
> Yes, they are. We have 3 0x101 DSM families.

And they're all implemented for 0x101 devices.

-- ljk
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called
  2017-02-16 19:35                   ` Dan Williams
  2017-02-16 20:13                     ` Linda Knippers
@ 2017-02-19 16:28                     ` Boaz Harrosh
  1 sibling, 0 replies; 25+ messages in thread
From: Boaz Harrosh @ 2017-02-19 16:28 UTC (permalink / raw)
  To: Dan Williams, Linda Knippers; +Cc: linux-nvdimm

On 02/16/2017 09:35 PM, Dan Williams wrote:
> On Thu, Feb 16, 2017 at 10:51 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
<>
>> True, but configuring boot options and a reboot are required.
> 
> Module options are specified at module load time so you don't
> necessarily need a reboot. However, I'm now of the opinion that
> allowing the family to be changed is a more complete solution to both
> problems.
<>
> 
> Let's do the work to allow the family to be switched so that tooling
> can override the kernel default, because it does seem valid for a DIMM
> to support multiple DSM-family types and that gives more than one path
> to use a vendor specific passthrough.

BTW:
Any module-parameter is already accessible via sysfs. And can be made
writeable. If you need to run a small function on modification, that
is easy enough, just as any sysfs, give the parameter a function-pointer.

Even if you do an hot event, I'd rather it be as a module-parameter,
because of the already established path in sysfs.

Just my $0.017
Boaz

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2017-02-19 16:28 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-13 16:27 [PATCH 0/3] apci, nfit: DSM improvements Linda Knippers
2017-02-13 16:27 ` [PATCH 1/3] Allow all supported HPE DSM functions to be called Linda Knippers
2017-02-13 17:28   ` Dan Williams
2017-02-13 18:17     ` Linda Knippers
2017-02-13 18:30       ` Dan Williams
2017-02-15 23:19         ` Linda Knippers
2017-02-16  0:29           ` Dan Williams
2017-02-16  0:45             ` Linda Knippers
2017-02-16  2:03               ` Dan Williams
2017-02-16 18:51                 ` Linda Knippers
2017-02-16 19:35                   ` Dan Williams
2017-02-16 20:13                     ` Linda Knippers
2017-02-16 20:48                       ` Dan Williams
2017-02-16 21:07                         ` Dan Williams
2017-02-16 22:40                           ` Linda Knippers
2017-02-16 22:43                             ` Dan Williams
2017-02-16 22:47                               ` Linda Knippers
2017-02-16 22:49                                 ` Dan Williams
2017-02-16 22:52                                   ` Linda Knippers
2017-02-16 22:33                         ` Linda Knippers
2017-02-19 16:28                     ` Boaz Harrosh
2017-02-13 16:27 ` [PATCH 2/3] Allow specifying a default DSM family Linda Knippers
2017-02-13 16:27 ` [PATCH 3/3] Remove unnecessary newline Linda Knippers
2017-02-13 17:32   ` Dan Williams
2017-02-13 18:02     ` Linda Knippers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.