All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nvme-host: fix the updating of the firmware version
@ 2024-01-18 11:48 Maurizio Lombardi
  2024-01-18 12:31 ` Daniel Wagner
  2024-02-02  1:48 ` Keith Busch
  0 siblings, 2 replies; 6+ messages in thread
From: Maurizio Lombardi @ 2024-01-18 11:48 UTC (permalink / raw)
  To: kbusch; +Cc: dwagner, hch, Niklas.Cassel, chaitanyak, linux-nvme

The original code didn't update the firmware version if the
"next slot" of the AFI register isn't zero or if the
"current slot" field is zero; in those cases it assumed
that a reset was needed.

However, the NVMe specification doesn't exclude the possibility that
the "next slot" value is equal to the "current slot" value,
meaning that the same firmware slot will be activated after performing
a controller level reset; in this case a reset is clearly not
necessary and we can safely update the firmware version.

Modify the code so the kernel will report that a Controller Level Reset
is needed only in the following cases:

1) If the "current slot" field is zero. This is invalid and means that
   something is wrong, a reset is needed.

or

2) if the "next slot" field isn't zero AND it's not equal to the
   "current slot" value. This means that at the next reset a different
   firmware slot will be activated.

Fixes: 983a338b96c8 ("nvme: update firmware version after commit")
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 drivers/nvme/host/core.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0af612387083..44b0f665d49d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4139,6 +4139,7 @@ static bool nvme_ctrl_pp_status(struct nvme_ctrl *ctrl)
 static void nvme_get_fw_slot_info(struct nvme_ctrl *ctrl)
 {
 	struct nvme_fw_slot_info_log *log;
+	u8 next_fw_slot, cur_fw_slot;
 
 	log = kmalloc(sizeof(*log), GFP_KERNEL);
 	if (!log)
@@ -4150,13 +4151,15 @@ static void nvme_get_fw_slot_info(struct nvme_ctrl *ctrl)
 		goto out_free_log;
 	}
 
-	if (log->afi & 0x70 || !(log->afi & 0x7)) {
+	cur_fw_slot = log->afi & 0x7;
+	next_fw_slot = (log->afi & 0x70) >> 4;
+	if (!cur_fw_slot || (next_fw_slot && (cur_fw_slot != next_fw_slot))) {
 		dev_info(ctrl->device,
 			 "Firmware is activated after next Controller Level Reset\n");
 		goto out_free_log;
 	}
 
-	memcpy(ctrl->subsys->firmware_rev, &log->frs[(log->afi & 0x7) - 1],
+	memcpy(ctrl->subsys->firmware_rev, &log->frs[cur_fw_slot - 1],
 		sizeof(ctrl->subsys->firmware_rev));
 
 out_free_log:
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] nvme-host: fix the updating of the firmware version
  2024-01-18 11:48 [PATCH] nvme-host: fix the updating of the firmware version Maurizio Lombardi
@ 2024-01-18 12:31 ` Daniel Wagner
  2024-01-18 13:21   ` Maurizio Lombardi
  2024-02-02  1:48 ` Keith Busch
  1 sibling, 1 reply; 6+ messages in thread
From: Daniel Wagner @ 2024-01-18 12:31 UTC (permalink / raw)
  To: Maurizio Lombardi; +Cc: kbusch, hch, Niklas.Cassel, chaitanyak, linux-nvme

On Thu, Jan 18, 2024 at 12:48:54PM +0100, Maurizio Lombardi wrote:
> The original code didn't update the firmware version if the
> "next slot" of the AFI register isn't zero or if the
> "current slot" field is zero; in those cases it assumed
> that a reset was needed.
> 
> However, the NVMe specification doesn't exclude the possibility that
> the "next slot" value is equal to the "current slot" value,
> meaning that the same firmware slot will be activated after performing
> a controller level reset; in this case a reset is clearly not
> necessary and we can safely update the firmware version.

Is this not an undefined behavior if it is not in the spec? Just
saying this is not so clear to me.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nvme-host: fix the updating of the firmware version
  2024-01-18 12:31 ` Daniel Wagner
@ 2024-01-18 13:21   ` Maurizio Lombardi
  2024-01-19  8:42     ` Daniel Wagner
  0 siblings, 1 reply; 6+ messages in thread
From: Maurizio Lombardi @ 2024-01-18 13:21 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: kbusch, hch, Niklas.Cassel, chaitanyak, linux-nvme

čt 18. 1. 2024 v 13:31 odesílatel Daniel Wagner <dwagner@suse.de> napsal:
>
> Is this not an undefined behavior if it is not in the spec? Just
> saying this is not so clear to me.
>

I don't think it is relevant in this case.
The spec just says that bits 2:0 are the active firmware slot and
bits 6:4 are the next firmware slot, it doesn't say that
they must have different values.

What the spec leaves undefined is what does it mean a "current fw slot"
with value zero, which is invalid.

By the way, I submitted this patch because we have a customer that complained
that after a firmware update he noticed a discrepancy between the fw
version reported
by the kernel and the one reported by nvme-cli.
He says that this patch fixed the error.

Maurizio



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Re: [PATCH] nvme-host: fix the updating of the firmware version
  2024-01-18 13:21   ` Maurizio Lombardi
@ 2024-01-19  8:42     ` Daniel Wagner
  2024-01-19 12:08       ` Maurizio Lombardi
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Wagner @ 2024-01-19  8:42 UTC (permalink / raw)
  To: Maurizio Lombardi; +Cc: kbusch, hch, Niklas.Cassel, chaitanyak, linux-nvme

On Thu, Jan 18, 2024 at 02:21:03PM +0100, Maurizio Lombardi wrote:
> čt 18. 1. 2024 v 13:31 odesílatel Daniel Wagner <dwagner@suse.de> napsal:
> >
> > Is this not an undefined behavior if it is not in the spec? Just
> > saying this is not so clear to me.
> >
> 
> I don't think it is relevant in this case.
> The spec just says that bits 2:0 are the active firmware slot and
> bits 6:4 are the next firmware slot, it doesn't say that
> they must have different values.

I see.

> What the spec leaves undefined is what does it mean a "current fw slot"
> with value zero, which is invalid.

The valid range is 1-7 for the slot ids, that rules out the value
zero IMO. Yeah, I agree not a thing at all.

> By the way, I submitted this patch because we have a customer that complained
> that after a firmware update he noticed a discrepancy between the fw
> version reported
> by the kernel and the one reported by nvme-cli.
> He says that this patch fixed the error.

I suspected this but it was not mentioned in the commit message. Good to
know that it fixes something. In this case,

Reviewed-by: Daniel Wagner <dwagner@suse.de>

Thanks,
Daniel


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Re: [PATCH] nvme-host: fix the updating of the firmware version
  2024-01-19  8:42     ` Daniel Wagner
@ 2024-01-19 12:08       ` Maurizio Lombardi
  0 siblings, 0 replies; 6+ messages in thread
From: Maurizio Lombardi @ 2024-01-19 12:08 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: kbusch, hch, Niklas.Cassel, chaitanyak, linux-nvme

pá 19. 1. 2024 v 9:42 odesílatel Daniel Wagner <dwagner@suse.de> napsal:
>
> On Thu, Jan 18, 2024 at 02:21:03PM +0100, Maurizio Lombardi wrote:
> > čt 18. 1. 2024 v 13:31 odesílatel Daniel Wagner <dwagner@suse.de> napsal:
> > >
>
> > What the spec leaves undefined is what does it mean a "current fw slot"
> > with value zero, which is invalid.
>
> The valid range is 1-7 for the slot ids, that rules out the value
> zero IMO. Yeah, I agree not a thing at all.

Yeah I was just pointing to the difference here with the "next slot", the value
zero in that case is valid and defined as "The controller doesn't
provide this info".

> I suspected this but it was not mentioned in the commit message. Good to
> know that it fixes something. In this case,
>
> Reviewed-by: Daniel Wagner <dwagner@suse.de>

Thanks.

Maurizio



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nvme-host: fix the updating of the firmware version
  2024-01-18 11:48 [PATCH] nvme-host: fix the updating of the firmware version Maurizio Lombardi
  2024-01-18 12:31 ` Daniel Wagner
@ 2024-02-02  1:48 ` Keith Busch
  1 sibling, 0 replies; 6+ messages in thread
From: Keith Busch @ 2024-02-02  1:48 UTC (permalink / raw)
  To: Maurizio Lombardi; +Cc: dwagner, hch, Niklas.Cassel, chaitanyak, linux-nvme

On Thu, Jan 18, 2024 at 12:48:54PM +0100, Maurizio Lombardi wrote:
> The original code didn't update the firmware version if the
> "next slot" of the AFI register isn't zero or if the
> "current slot" field is zero; in those cases it assumed
> that a reset was needed.
> 
> However, the NVMe specification doesn't exclude the possibility that
> the "next slot" value is equal to the "current slot" value,
> meaning that the same firmware slot will be activated after performing
> a controller level reset; in this case a reset is clearly not
> necessary and we can safely update the firmware version.
> 
> Modify the code so the kernel will report that a Controller Level Reset
> is needed only in the following cases:
> 
> 1) If the "current slot" field is zero. This is invalid and means that
>    something is wrong, a reset is needed.
> 
> or
> 
> 2) if the "next slot" field isn't zero AND it's not equal to the
>    "current slot" value. This means that at the next reset a different
>    firmware slot will be activated.
> 
> Fixes: 983a338b96c8 ("nvme: update firmware version after commit")
> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>

Sorry I missed this one for this weeks pull request. It's now applied
for nvme-6.8 as the first commit for the next round.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-02-02  1:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-18 11:48 [PATCH] nvme-host: fix the updating of the firmware version Maurizio Lombardi
2024-01-18 12:31 ` Daniel Wagner
2024-01-18 13:21   ` Maurizio Lombardi
2024-01-19  8:42     ` Daniel Wagner
2024-01-19 12:08       ` Maurizio Lombardi
2024-02-02  1:48 ` Keith Busch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.