* [PATCH v2] hwmon: (pmbus/ibm-cffps) Add clear_faults debugfs entry @ 2022-03-11 18:10 Brandon Wyman 2022-03-14 4:36 ` Guenter Roeck 0 siblings, 1 reply; 7+ messages in thread From: Brandon Wyman @ 2022-03-11 18:10 UTC (permalink / raw) To: Joel Stanley, openbmc, Eddie James, Guenter Roeck, Jean Delvare, linux-hwmon, linux-kernel Cc: Brandon Wyman Add a clear_faults write-only debugfs entry for the ibm-cffps device driver. Certain IBM power supplies require clearing some latched faults in order to indicate that the fault has indeed been observed/noticed. Signed-off-by: Brandon Wyman <bjwyman@gmail.com> --- V1 -> V2: Explain why this change is needed drivers/hwmon/pmbus/ibm-cffps.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/hwmon/pmbus/ibm-cffps.c b/drivers/hwmon/pmbus/ibm-cffps.c index e3294a1a54bb..3f02dde02a4b 100644 --- a/drivers/hwmon/pmbus/ibm-cffps.c +++ b/drivers/hwmon/pmbus/ibm-cffps.c @@ -67,6 +67,7 @@ enum { CFFPS_DEBUGFS_CCIN, CFFPS_DEBUGFS_FW, CFFPS_DEBUGFS_ON_OFF_CONFIG, + CFFPS_DEBUGFS_CLEAR_FAULTS, CFFPS_DEBUGFS_NUM_ENTRIES }; @@ -274,6 +275,13 @@ static ssize_t ibm_cffps_debugfs_write(struct file *file, if (rc) return rc; + rc = 1; + break; + case CFFPS_DEBUGFS_CLEAR_FAULTS: + rc = i2c_smbus_write_byte(psu->client, PMBUS_CLEAR_FAULTS); + if (rc < 0) + return rc; + rc = 1; break; default: @@ -607,6 +615,9 @@ static int ibm_cffps_probe(struct i2c_client *client) debugfs_create_file("on_off_config", 0644, ibm_cffps_dir, &psu->debugfs_entries[CFFPS_DEBUGFS_ON_OFF_CONFIG], &ibm_cffps_fops); + debugfs_create_file("clear_faults", 0200, ibm_cffps_dir, + &psu->debugfs_entries[CFFPS_DEBUGFS_CLEAR_FAULTS], + &ibm_cffps_fops); return 0; } -- 2.25.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2] hwmon: (pmbus/ibm-cffps) Add clear_faults debugfs entry 2022-03-11 18:10 [PATCH v2] hwmon: (pmbus/ibm-cffps) Add clear_faults debugfs entry Brandon Wyman @ 2022-03-14 4:36 ` Guenter Roeck 2022-03-16 20:03 ` Brandon Wyman 0 siblings, 1 reply; 7+ messages in thread From: Guenter Roeck @ 2022-03-14 4:36 UTC (permalink / raw) To: Brandon Wyman, Joel Stanley, openbmc, Eddie James, Jean Delvare, linux-hwmon, linux-kernel On 3/11/22 10:10, Brandon Wyman wrote: > Add a clear_faults write-only debugfs entry for the ibm-cffps device > driver. > > Certain IBM power supplies require clearing some latched faults in order > to indicate that the fault has indeed been observed/noticed. > That is insufficient, sorry. Please provide the affected power supplies as well as the affected faults, and confirm that the problem still exists in v5.17-rc6 or later kernels - or, more specifically, in any kernel which includes commit 35f165f08950 ("hwmon: (pmbus) Clear pmbus fault/warning bits after read"). Thanks, Guenter > Signed-off-by: Brandon Wyman <bjwyman@gmail.com> > --- > V1 -> V2: Explain why this change is needed > > drivers/hwmon/pmbus/ibm-cffps.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/hwmon/pmbus/ibm-cffps.c b/drivers/hwmon/pmbus/ibm-cffps.c > index e3294a1a54bb..3f02dde02a4b 100644 > --- a/drivers/hwmon/pmbus/ibm-cffps.c > +++ b/drivers/hwmon/pmbus/ibm-cffps.c > @@ -67,6 +67,7 @@ enum { > CFFPS_DEBUGFS_CCIN, > CFFPS_DEBUGFS_FW, > CFFPS_DEBUGFS_ON_OFF_CONFIG, > + CFFPS_DEBUGFS_CLEAR_FAULTS, > CFFPS_DEBUGFS_NUM_ENTRIES > }; > > @@ -274,6 +275,13 @@ static ssize_t ibm_cffps_debugfs_write(struct file *file, > if (rc) > return rc; > > + rc = 1; > + break; > + case CFFPS_DEBUGFS_CLEAR_FAULTS: > + rc = i2c_smbus_write_byte(psu->client, PMBUS_CLEAR_FAULTS); > + if (rc < 0) > + return rc; > + > rc = 1; > break; > default: > @@ -607,6 +615,9 @@ static int ibm_cffps_probe(struct i2c_client *client) > debugfs_create_file("on_off_config", 0644, ibm_cffps_dir, > &psu->debugfs_entries[CFFPS_DEBUGFS_ON_OFF_CONFIG], > &ibm_cffps_fops); > + debugfs_create_file("clear_faults", 0200, ibm_cffps_dir, > + &psu->debugfs_entries[CFFPS_DEBUGFS_CLEAR_FAULTS], > + &ibm_cffps_fops); > > return 0; > } ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] hwmon: (pmbus/ibm-cffps) Add clear_faults debugfs entry 2022-03-14 4:36 ` Guenter Roeck @ 2022-03-16 20:03 ` Brandon Wyman 2022-03-16 20:11 ` Guenter Roeck 0 siblings, 1 reply; 7+ messages in thread From: Brandon Wyman @ 2022-03-16 20:03 UTC (permalink / raw) To: Guenter Roeck Cc: linux-hwmon, Jean Delvare, OpenBMC Maillist, Eddie James, linux-kernel, Joel Stanley On Sun, Mar 13, 2022 at 11:36 PM Guenter Roeck <linux@roeck-us.net> wrote: > > On 3/11/22 10:10, Brandon Wyman wrote: > > Add a clear_faults write-only debugfs entry for the ibm-cffps device > > driver. > > > > Certain IBM power supplies require clearing some latched faults in order > > to indicate that the fault has indeed been observed/noticed. > > > > That is insufficient, sorry. Please provide the affected power supplies as > well as the affected faults, and confirm that the problem still exists > in v5.17-rc6 or later kernels - or, more specifically, in any kernel which > includes commit 35f165f08950 ("hwmon: (pmbus) Clear pmbus fault/warning > bits after read"). > > Thanks, > Guenter Sorry for the delay in responding. I did some testing with commit 35f165f08950. I could not get that code to send the CLEAR_FAULTS command to the power supplies. I can update the commit message to be more specific about which power supplies need this CLEAR_FAULTS sent, and which faults. It is observed with the 1600W power supplies (2B1E model). The faults that latch are the VIN_UV and INPUT faults in the STATUS_WORD. The corresponding STATUS_INPUT fault bits are VIN_UV_FAULT and Unit is Off. > > > Signed-off-by: Brandon Wyman <bjwyman@gmail.com> > > --- > > V1 -> V2: Explain why this change is needed > > > > drivers/hwmon/pmbus/ibm-cffps.c | 11 +++++++++++ > > 1 file changed, 11 insertions(+) > > > > diff --git a/drivers/hwmon/pmbus/ibm-cffps.c b/drivers/hwmon/pmbus/ibm-cffps.c > > index e3294a1a54bb..3f02dde02a4b 100644 > > --- a/drivers/hwmon/pmbus/ibm-cffps.c > > +++ b/drivers/hwmon/pmbus/ibm-cffps.c > > @@ -67,6 +67,7 @@ enum { > > CFFPS_DEBUGFS_CCIN, > > CFFPS_DEBUGFS_FW, > > CFFPS_DEBUGFS_ON_OFF_CONFIG, > > + CFFPS_DEBUGFS_CLEAR_FAULTS, > > CFFPS_DEBUGFS_NUM_ENTRIES > > }; > > > > @@ -274,6 +275,13 @@ static ssize_t ibm_cffps_debugfs_write(struct file *file, > > if (rc) > > return rc; > > > > + rc = 1; > > + break; > > + case CFFPS_DEBUGFS_CLEAR_FAULTS: > > + rc = i2c_smbus_write_byte(psu->client, PMBUS_CLEAR_FAULTS); > > + if (rc < 0) > > + return rc; > > + > > rc = 1; > > break; > > default: > > @@ -607,6 +615,9 @@ static int ibm_cffps_probe(struct i2c_client *client) > > debugfs_create_file("on_off_config", 0644, ibm_cffps_dir, > > &psu->debugfs_entries[CFFPS_DEBUGFS_ON_OFF_CONFIG], > > &ibm_cffps_fops); > > + debugfs_create_file("clear_faults", 0200, ibm_cffps_dir, > > + &psu->debugfs_entries[CFFPS_DEBUGFS_CLEAR_FAULTS], > > + &ibm_cffps_fops); > > > > return 0; > > } > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] hwmon: (pmbus/ibm-cffps) Add clear_faults debugfs entry 2022-03-16 20:03 ` Brandon Wyman @ 2022-03-16 20:11 ` Guenter Roeck 2022-03-17 16:12 ` Brandon Wyman 0 siblings, 1 reply; 7+ messages in thread From: Guenter Roeck @ 2022-03-16 20:11 UTC (permalink / raw) To: Brandon Wyman Cc: linux-hwmon, Jean Delvare, OpenBMC Maillist, Eddie James, linux-kernel, Joel Stanley On 3/16/22 13:03, Brandon Wyman wrote: > On Sun, Mar 13, 2022 at 11:36 PM Guenter Roeck <linux@roeck-us.net> wrote: >> >> On 3/11/22 10:10, Brandon Wyman wrote: >>> Add a clear_faults write-only debugfs entry for the ibm-cffps device >>> driver. >>> >>> Certain IBM power supplies require clearing some latched faults in order >>> to indicate that the fault has indeed been observed/noticed. >>> >> >> That is insufficient, sorry. Please provide the affected power supplies as >> well as the affected faults, and confirm that the problem still exists >> in v5.17-rc6 or later kernels - or, more specifically, in any kernel which >> includes commit 35f165f08950 ("hwmon: (pmbus) Clear pmbus fault/warning >> bits after read"). >> >> Thanks, >> Guenter > > Sorry for the delay in responding. I did some testing with commit > 35f165f08950. I could not get that code to send the CLEAR_FAULTS > command to the power supplies. > > I can update the commit message to be more specific about which power > supplies need this CLEAR_FAULTS sent, and which faults. It is observed > with the 1600W power supplies (2B1E model). The faults that latch are > the VIN_UV and INPUT faults in the STATUS_WORD. The corresponding > STATUS_INPUT fault bits are VIN_UV_FAULT and Unit is Off. > The point is that the respective fault bits should be reset when the corresponding alarm attributes are read. This isn't about executing a CLEAR_FAULTS command, but about selectively resetting fault bits while ensuring that faults are reported at least once. Executing CLEAR_FAULTS is a big hammer. With the patch I pointed to in place, input (and other) faults should be reset after the corresponding alarm attributes are read, assuming that the condition no longer exists. If that does not happen, we should fix the problem instead of deploying the big hammer. Thanks, Guenter >> >>> Signed-off-by: Brandon Wyman <bjwyman@gmail.com> >>> --- >>> V1 -> V2: Explain why this change is needed >>> >>> drivers/hwmon/pmbus/ibm-cffps.c | 11 +++++++++++ >>> 1 file changed, 11 insertions(+) >>> >>> diff --git a/drivers/hwmon/pmbus/ibm-cffps.c b/drivers/hwmon/pmbus/ibm-cffps.c >>> index e3294a1a54bb..3f02dde02a4b 100644 >>> --- a/drivers/hwmon/pmbus/ibm-cffps.c >>> +++ b/drivers/hwmon/pmbus/ibm-cffps.c >>> @@ -67,6 +67,7 @@ enum { >>> CFFPS_DEBUGFS_CCIN, >>> CFFPS_DEBUGFS_FW, >>> CFFPS_DEBUGFS_ON_OFF_CONFIG, >>> + CFFPS_DEBUGFS_CLEAR_FAULTS, >>> CFFPS_DEBUGFS_NUM_ENTRIES >>> }; >>> >>> @@ -274,6 +275,13 @@ static ssize_t ibm_cffps_debugfs_write(struct file *file, >>> if (rc) >>> return rc; >>> >>> + rc = 1; >>> + break; >>> + case CFFPS_DEBUGFS_CLEAR_FAULTS: >>> + rc = i2c_smbus_write_byte(psu->client, PMBUS_CLEAR_FAULTS); >>> + if (rc < 0) >>> + return rc; >>> + >>> rc = 1; >>> break; >>> default: >>> @@ -607,6 +615,9 @@ static int ibm_cffps_probe(struct i2c_client *client) >>> debugfs_create_file("on_off_config", 0644, ibm_cffps_dir, >>> &psu->debugfs_entries[CFFPS_DEBUGFS_ON_OFF_CONFIG], >>> &ibm_cffps_fops); >>> + debugfs_create_file("clear_faults", 0200, ibm_cffps_dir, >>> + &psu->debugfs_entries[CFFPS_DEBUGFS_CLEAR_FAULTS], >>> + &ibm_cffps_fops); >>> >>> return 0; >>> } >> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] hwmon: (pmbus/ibm-cffps) Add clear_faults debugfs entry 2022-03-16 20:11 ` Guenter Roeck @ 2022-03-17 16:12 ` Brandon Wyman 2022-03-17 18:50 ` Guenter Roeck 0 siblings, 1 reply; 7+ messages in thread From: Brandon Wyman @ 2022-03-17 16:12 UTC (permalink / raw) To: Guenter Roeck Cc: linux-hwmon, Jean Delvare, OpenBMC Maillist, Eddie James, linux-kernel, Joel Stanley On Wed, Mar 16, 2022 at 3:14 PM Guenter Roeck <linux@roeck-us.net> wrote: > > On 3/16/22 13:03, Brandon Wyman wrote: > > On Sun, Mar 13, 2022 at 11:36 PM Guenter Roeck <linux@roeck-us.net> wrote: > >> > >> On 3/11/22 10:10, Brandon Wyman wrote: > >>> Add a clear_faults write-only debugfs entry for the ibm-cffps device > >>> driver. > >>> > >>> Certain IBM power supplies require clearing some latched faults in order > >>> to indicate that the fault has indeed been observed/noticed. > >>> > >> > >> That is insufficient, sorry. Please provide the affected power supplies as > >> well as the affected faults, and confirm that the problem still exists > >> in v5.17-rc6 or later kernels - or, more specifically, in any kernel which > >> includes commit 35f165f08950 ("hwmon: (pmbus) Clear pmbus fault/warning > >> bits after read"). > >> > >> Thanks, > >> Guenter > > > > Sorry for the delay in responding. I did some testing with commit > > 35f165f08950. I could not get that code to send the CLEAR_FAULTS > > command to the power supplies. > > > > I can update the commit message to be more specific about which power > > supplies need this CLEAR_FAULTS sent, and which faults. It is observed > > with the 1600W power supplies (2B1E model). The faults that latch are > > the VIN_UV and INPUT faults in the STATUS_WORD. The corresponding > > STATUS_INPUT fault bits are VIN_UV_FAULT and Unit is Off. > > > > The point is that the respective fault bits should be reset when the > corresponding alarm attributes are read. This isn't about executing > a CLEAR_FAULTS command, but about selectively resetting fault bits > while ensuring that faults are reported at least once. Executing > CLEAR_FAULTS is a big hammer. > > With the patch I pointed to in place, input (and other) faults should > be reset after the corresponding alarm attributes are read, assuming > that the condition no longer exists. If that does not happen, we should > fix the problem instead of deploying the big hammer. > > Thanks, > Guenter Okay, I see what you are pointing out there. I had been mostly looking at the "files" in the debugfs paths. Those do not end up running through that pmbus_get_boolean() function, so the individual fault clearing was not being attempted. The fault I was interested in appears to be associated with in1_lcrti_alarm. Reading that will give me a 1 if there is a VIN_UV fault, and then it sends 0x10 to STATUS_INPUT. That clears out VIN_UV, but the STATUS_INPUT command was returning 0x18. Nothing appears to handle clearing BIT(3), that 0x08 mask. Should there be some kind of define for BIT(3) over in pmbus.h? Something like PB_VOLTAGE_OFF? Somehow we need something using that in sbit of the attributes. I had a quick hack that just OR'ed BIT(3) with BIT(4) for that PB_VOLTAGE_UV_FAULT. That resulted in a clear of both bits in STATUS_INPUT, and the faults clearing in STATUS_WORD. It is not clear if there should be a separate alarm for that "Unit Off For Insufficient Input Voltage", or if the one for in1_lcrit_alarm could just be the two bits OR'ed into one mask. I can send a patch with a proposal on how to fix this one bit not getting cleared. > > >> > >>> Signed-off-by: Brandon Wyman <bjwyman@gmail.com> > >>> --- > >>> V1 -> V2: Explain why this change is needed > >>> > >>> drivers/hwmon/pmbus/ibm-cffps.c | 11 +++++++++++ > >>> 1 file changed, 11 insertions(+) > >>> > >>> diff --git a/drivers/hwmon/pmbus/ibm-cffps.c b/drivers/hwmon/pmbus/ibm-cffps.c > >>> index e3294a1a54bb..3f02dde02a4b 100644 > >>> --- a/drivers/hwmon/pmbus/ibm-cffps.c > >>> +++ b/drivers/hwmon/pmbus/ibm-cffps.c > >>> @@ -67,6 +67,7 @@ enum { > >>> CFFPS_DEBUGFS_CCIN, > >>> CFFPS_DEBUGFS_FW, > >>> CFFPS_DEBUGFS_ON_OFF_CONFIG, > >>> + CFFPS_DEBUGFS_CLEAR_FAULTS, > >>> CFFPS_DEBUGFS_NUM_ENTRIES > >>> }; > >>> > >>> @@ -274,6 +275,13 @@ static ssize_t ibm_cffps_debugfs_write(struct file *file, > >>> if (rc) > >>> return rc; > >>> > >>> + rc = 1; > >>> + break; > >>> + case CFFPS_DEBUGFS_CLEAR_FAULTS: > >>> + rc = i2c_smbus_write_byte(psu->client, PMBUS_CLEAR_FAULTS); > >>> + if (rc < 0) > >>> + return rc; > >>> + > >>> rc = 1; > >>> break; > >>> default: > >>> @@ -607,6 +615,9 @@ static int ibm_cffps_probe(struct i2c_client *client) > >>> debugfs_create_file("on_off_config", 0644, ibm_cffps_dir, > >>> &psu->debugfs_entries[CFFPS_DEBUGFS_ON_OFF_CONFIG], > >>> &ibm_cffps_fops); > >>> + debugfs_create_file("clear_faults", 0200, ibm_cffps_dir, > >>> + &psu->debugfs_entries[CFFPS_DEBUGFS_CLEAR_FAULTS], > >>> + &ibm_cffps_fops); > >>> > >>> return 0; > >>> } > >> > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] hwmon: (pmbus/ibm-cffps) Add clear_faults debugfs entry 2022-03-17 16:12 ` Brandon Wyman @ 2022-03-17 18:50 ` Guenter Roeck 2022-03-17 23:30 ` Brandon Wyman 0 siblings, 1 reply; 7+ messages in thread From: Guenter Roeck @ 2022-03-17 18:50 UTC (permalink / raw) To: Brandon Wyman Cc: linux-hwmon, Jean Delvare, OpenBMC Maillist, Eddie James, linux-kernel, Joel Stanley On 3/17/22 09:12, Brandon Wyman wrote: > On Wed, Mar 16, 2022 at 3:14 PM Guenter Roeck <linux@roeck-us.net> wrote: >> >> On 3/16/22 13:03, Brandon Wyman wrote: >>> On Sun, Mar 13, 2022 at 11:36 PM Guenter Roeck <linux@roeck-us.net> wrote: >>>> >>>> On 3/11/22 10:10, Brandon Wyman wrote: >>>>> Add a clear_faults write-only debugfs entry for the ibm-cffps device >>>>> driver. >>>>> >>>>> Certain IBM power supplies require clearing some latched faults in order >>>>> to indicate that the fault has indeed been observed/noticed. >>>>> >>>> >>>> That is insufficient, sorry. Please provide the affected power supplies as >>>> well as the affected faults, and confirm that the problem still exists >>>> in v5.17-rc6 or later kernels - or, more specifically, in any kernel which >>>> includes commit 35f165f08950 ("hwmon: (pmbus) Clear pmbus fault/warning >>>> bits after read"). >>>> >>>> Thanks, >>>> Guenter >>> >>> Sorry for the delay in responding. I did some testing with commit >>> 35f165f08950. I could not get that code to send the CLEAR_FAULTS >>> command to the power supplies. >>> >>> I can update the commit message to be more specific about which power >>> supplies need this CLEAR_FAULTS sent, and which faults. It is observed >>> with the 1600W power supplies (2B1E model). The faults that latch are >>> the VIN_UV and INPUT faults in the STATUS_WORD. The corresponding >>> STATUS_INPUT fault bits are VIN_UV_FAULT and Unit is Off. >>> >> >> The point is that the respective fault bits should be reset when the >> corresponding alarm attributes are read. This isn't about executing >> a CLEAR_FAULTS command, but about selectively resetting fault bits >> while ensuring that faults are reported at least once. Executing >> CLEAR_FAULTS is a big hammer. >> >> With the patch I pointed to in place, input (and other) faults should >> be reset after the corresponding alarm attributes are read, assuming >> that the condition no longer exists. If that does not happen, we should >> fix the problem instead of deploying the big hammer. >> >> Thanks, >> Guenter > > Okay, I see what you are pointing out there. I had been mostly looking > at the "files" in the debugfs paths. Those do not end up running > through that pmbus_get_boolean() function, so the individual fault > clearing was not being attempted. The fault I was interested in > appears to be associated with in1_lcrti_alarm. Reading that will give > me a 1 if there is a VIN_UV fault, and then it sends 0x10 to > STATUS_INPUT. That clears out VIN_UV, but the STATUS_INPUT command was > returning 0x18. Nothing appears to handle clearing BIT(3), that 0x08 > mask. > > Should there be some kind of define for BIT(3) over in pmbus.h? > Something like PB_VOLTAGE_OFF? Somehow we need something using that in > sbit of the attributes. I had a quick hack that just OR'ed BIT(3) with > BIT(4) for that PB_VOLTAGE_UV_FAULT. That resulted in a clear of both > bits in STATUS_INPUT, and the faults clearing in STATUS_WORD. > > It is not clear if there should be a separate alarm for that "Unit Off > For Insufficient Input Voltage", or if the one for in1_lcrit_alarm > could just be the two bits OR'ed into one mask. I can send a patch > with a proposal on how to fix this one bit not getting cleared. > We don't have a separate standard attribute. I think the best approach would be to add a mask for bit 3 and or that mask for lcrit in vin_limit_attrs with PB_VOLTAGE_UV_FAULT. I'd suggest to name the define something like PB_VOLTAGE_VIN_OFF or PB_VOLTAGE_VIN_FAULT to clarify that the bit applies to the input. Thanks, Guenter ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2] hwmon: (pmbus/ibm-cffps) Add clear_faults debugfs entry 2022-03-17 18:50 ` Guenter Roeck @ 2022-03-17 23:30 ` Brandon Wyman 0 siblings, 0 replies; 7+ messages in thread From: Brandon Wyman @ 2022-03-17 23:30 UTC (permalink / raw) To: Guenter Roeck Cc: linux-hwmon, Jean Delvare, OpenBMC Maillist, Eddie James, linux-kernel, Joel Stanley On Thu, Mar 17, 2022 at 1:50 PM Guenter Roeck <linux@roeck-us.net> wrote: > > On 3/17/22 09:12, Brandon Wyman wrote: > > On Wed, Mar 16, 2022 at 3:14 PM Guenter Roeck <linux@roeck-us.net> wrote: > >> > >> On 3/16/22 13:03, Brandon Wyman wrote: > >>> On Sun, Mar 13, 2022 at 11:36 PM Guenter Roeck <linux@roeck-us.net> wrote: > >>>> > >>>> On 3/11/22 10:10, Brandon Wyman wrote: > >>>>> Add a clear_faults write-only debugfs entry for the ibm-cffps device > >>>>> driver. > >>>>> > >>>>> Certain IBM power supplies require clearing some latched faults in order > >>>>> to indicate that the fault has indeed been observed/noticed. > >>>>> > >>>> > >>>> That is insufficient, sorry. Please provide the affected power supplies as > >>>> well as the affected faults, and confirm that the problem still exists > >>>> in v5.17-rc6 or later kernels - or, more specifically, in any kernel which > >>>> includes commit 35f165f08950 ("hwmon: (pmbus) Clear pmbus fault/warning > >>>> bits after read"). > >>>> > >>>> Thanks, > >>>> Guenter > >>> > >>> Sorry for the delay in responding. I did some testing with commit > >>> 35f165f08950. I could not get that code to send the CLEAR_FAULTS > >>> command to the power supplies. > >>> > >>> I can update the commit message to be more specific about which power > >>> supplies need this CLEAR_FAULTS sent, and which faults. It is observed > >>> with the 1600W power supplies (2B1E model). The faults that latch are > >>> the VIN_UV and INPUT faults in the STATUS_WORD. The corresponding > >>> STATUS_INPUT fault bits are VIN_UV_FAULT and Unit is Off. > >>> > >> > >> The point is that the respective fault bits should be reset when the > >> corresponding alarm attributes are read. This isn't about executing > >> a CLEAR_FAULTS command, but about selectively resetting fault bits > >> while ensuring that faults are reported at least once. Executing > >> CLEAR_FAULTS is a big hammer. > >> > >> With the patch I pointed to in place, input (and other) faults should > >> be reset after the corresponding alarm attributes are read, assuming > >> that the condition no longer exists. If that does not happen, we should > >> fix the problem instead of deploying the big hammer. > >> > >> Thanks, > >> Guenter > > > > Okay, I see what you are pointing out there. I had been mostly looking > > at the "files" in the debugfs paths. Those do not end up running > > through that pmbus_get_boolean() function, so the individual fault > > clearing was not being attempted. The fault I was interested in > > appears to be associated with in1_lcrti_alarm. Reading that will give > > me a 1 if there is a VIN_UV fault, and then it sends 0x10 to > > STATUS_INPUT. That clears out VIN_UV, but the STATUS_INPUT command was > > returning 0x18. Nothing appears to handle clearing BIT(3), that 0x08 > > mask. > > > > Should there be some kind of define for BIT(3) over in pmbus.h? > > Something like PB_VOLTAGE_OFF? Somehow we need something using that in > > sbit of the attributes. I had a quick hack that just OR'ed BIT(3) with > > BIT(4) for that PB_VOLTAGE_UV_FAULT. That resulted in a clear of both > > bits in STATUS_INPUT, and the faults clearing in STATUS_WORD. > > > > It is not clear if there should be a separate alarm for that "Unit Off > > For Insufficient Input Voltage", or if the one for in1_lcrit_alarm > > could just be the two bits OR'ed into one mask. I can send a patch > > with a proposal on how to fix this one bit not getting cleared. > > > > We don't have a separate standard attribute. I think the best approach > would be to add a mask for bit 3 and or that mask for lcrit in > vin_limit_attrs with PB_VOLTAGE_UV_FAULT. I'd suggest to name the > define something like PB_VOLTAGE_VIN_OFF or PB_VOLTAGE_VIN_FAULT > to clarify that the bit applies to the input. Done. See: https://lore.kernel.org/linux-hwmon/20220317232123.2103592-1-bjwyman@gmail.com/T/#u > > Thanks, > Guenter ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2022-03-17 23:31 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-03-11 18:10 [PATCH v2] hwmon: (pmbus/ibm-cffps) Add clear_faults debugfs entry Brandon Wyman 2022-03-14 4:36 ` Guenter Roeck 2022-03-16 20:03 ` Brandon Wyman 2022-03-16 20:11 ` Guenter Roeck 2022-03-17 16:12 ` Brandon Wyman 2022-03-17 18:50 ` Guenter Roeck 2022-03-17 23:30 ` Brandon Wyman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).