All of lore.kernel.org
 help / color / mirror / Atom feed
* spi-nor: excessive locking in spi_nor_erase()
@ 2024-04-15 15:55 Joakim Tjernlund
  2024-04-16 15:44 ` Pratyush Yadav
  0 siblings, 1 reply; 7+ messages in thread
From: Joakim Tjernlund @ 2024-04-15 15:55 UTC (permalink / raw)
  To: linux-mtd

static int spi_nor_erase(struct mtd_info *mtd, struct erase_info *instr)
{
...
Lock here
	ret = spi_nor_lock_and_prep(nor);
	if (ret)
		return ret;
....
Then we have:
	} else if (spi_nor_has_uniform_erase(nor)) {
		while (len) {
			ret = spi_nor_write_enable(nor);
			if (ret)
				goto erase_err;

			ret = spi_nor_erase_sector(nor, addr);
			if (ret)
				goto erase_err;

			ret = spi_nor_wait_till_ready(nor);
			if (ret)
				goto erase_err;

			addr += mtd->erasesize;
			len -= mtd->erasesize;
		}

	/* erase multiple sectors */
	} else {
		ret = spi_nor_erase_multi_sectors(nor, addr, len);
		if (ret)
			goto erase_err;
	}
....
erase_err:
unlock here
	spi_nor_unlock_and_unprep(nor);

	return ret;
}

So erase locks the flash for the whole erase op which can include many sectors and I don't see
any Erase Suspend handling in spi-nor like the cfi_cmdset_0001/cfi_cmdset_0002 has.

Locking can be a challenge but this seems over the top, anyone working/looking into improving this?

    Jocke
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: spi-nor: excessive locking in spi_nor_erase()
  2024-04-15 15:55 spi-nor: excessive locking in spi_nor_erase() Joakim Tjernlund
@ 2024-04-16 15:44 ` Pratyush Yadav
  2024-04-16 18:03   ` Joakim Tjernlund
  0 siblings, 1 reply; 7+ messages in thread
From: Pratyush Yadav @ 2024-04-16 15:44 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: linux-mtd

Hi,

On Mon, Apr 15 2024, Joakim Tjernlund wrote:

> static int spi_nor_erase(struct mtd_info *mtd, struct erase_info *instr)
> {
> ...
> Lock here
> 	ret = spi_nor_lock_and_prep(nor);
> 	if (ret)
> 		return ret;
> ....
> Then we have:
> 	} else if (spi_nor_has_uniform_erase(nor)) {
> 		while (len) {
> 			ret = spi_nor_write_enable(nor);
> 			if (ret)
> 				goto erase_err;
>
> 			ret = spi_nor_erase_sector(nor, addr);
> 			if (ret)
> 				goto erase_err;
>
> 			ret = spi_nor_wait_till_ready(nor);
> 			if (ret)
> 				goto erase_err;
>
> 			addr += mtd->erasesize;
> 			len -= mtd->erasesize;
> 		}
>
> 	/* erase multiple sectors */
> 	} else {
> 		ret = spi_nor_erase_multi_sectors(nor, addr, len);
> 		if (ret)
> 			goto erase_err;
> 	}
> ....
> erase_err:
> unlock here
> 	spi_nor_unlock_and_unprep(nor);
>
> 	return ret;
> }

Firstly, seems like you are looking at an older version of the driver.
spi_nor_erase() looks a bit different now, though works pretty much the
same in practice. See [0].

>
> So erase locks the flash for the whole erase op which can include many sectors and I don't see
> any Erase Suspend handling in spi-nor like the cfi_cmdset_0001/cfi_cmdset_0002 has.
>
> Locking can be a challenge but this seems over the top, anyone working/looking into improving this?

I think you should lead with what the problem you see with this and what
you think can be improved. Why do you think the locking is "over the
top"? What should be improved?

Most flashes can't do any operations (except poll status register) when
a program or erase operation is in progress. So the locking here makes
sense to me. We do not want to let other tasks attempt any other
operations until the erase is done. The only exception is some
multi-bank flashes that can do erase on one bank and reads on other
banks. Miquel's series [1] takes care of those (though I do not see any
flashes using that feature yet).

[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/mtd/spi-nor/core.c#n1789
[1] https://lore.kernel.org/linux-mtd/20230328154105.448540-1-miquel.raynal@bootlin.com/

-- 
Regards,
Pratyush Yadav

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: spi-nor: excessive locking in spi_nor_erase()
  2024-04-16 15:44 ` Pratyush Yadav
@ 2024-04-16 18:03   ` Joakim Tjernlund
  2024-04-17 12:42     ` Pratyush Yadav
  0 siblings, 1 reply; 7+ messages in thread
From: Joakim Tjernlund @ 2024-04-16 18:03 UTC (permalink / raw)
  To: pratyush; +Cc: linux-mtd

On Tue, 2024-04-16 at 17:44 +0200, Pratyush Yadav wrote:
> Hi,

Hi Pratyush, thanks for replying. Se below

> 
> On Mon, Apr 15 2024, Joakim Tjernlund wrote:
> 
> > static int spi_nor_erase(struct mtd_info *mtd, struct erase_info *instr)
> > {
> > ...
> > Lock here
> > 	ret = spi_nor_lock_and_prep(nor);
> > 	if (ret)
> > 		return ret;
> > ....
> > Then we have:
> > 	} else if (spi_nor_has_uniform_erase(nor)) {
> > 		while (len) {
> > 			ret = spi_nor_write_enable(nor);
> > 			if (ret)
> > 				goto erase_err;
> > 
> > 			ret = spi_nor_erase_sector(nor, addr);
> > 			if (ret)
> > 				goto erase_err;
> > 
> > 			ret = spi_nor_wait_till_ready(nor);
> > 			if (ret)
> > 				goto erase_err;
> > 
> > 			addr += mtd->erasesize;
> > 			len -= mtd->erasesize;
> > 		}
> > 
> > 	/* erase multiple sectors */
> > 	} else {
> > 		ret = spi_nor_erase_multi_sectors(nor, addr, len);
> > 		if (ret)
> > 			goto erase_err;
> > 	}
> > ....
> > erase_err:
> > unlock here
> > 	spi_nor_unlock_and_unprep(nor);
> > 
> > 	return ret;
> > }
> 
> Firstly, seems like you are looking at an older version of the driver.
> spi_nor_erase() looks a bit different now, though works pretty much the
> same in practice. See [0].

Ah, yes. I am on 5.15.x, not ready to move forward on this product yet.

> 
> > 
> > So erase locks the flash for the whole erase op which can include many sectors and I don't see
> > any Erase Suspend handling in spi-nor like the cfi_cmdset_0001/cfi_cmdset_0002 has.
> > 
> > Locking can be a challenge but this seems over the top, anyone working/looking into improving this?
> 
> I think you should lead with what the problem you see with this and what
> you think can be improved. Why do you think the locking is "over the
> top"? What should be improved?

Sure, we have been using NOR flash as RFS(JFFS2) for many years now and it
is important that RFS is responsive at all times. Up til recently we used parallel NOR
using cfi_cmdset_0001/cfi_cmdset_0002 which has worked great.

QSPI is different beast and its driver not mature for RFS use. Consider when our
sw update starts to erase Image B partition, then the flash becomes completely locked for other
processes to use, one cannot even connect with ssh while sw update is ongoing or do an ls.

ATM I am testing this(which makes the system much more responsive:
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index bcaae9c71d90..62fe9cb97139 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -1635,6 +1635,13 @@ static int spi_nor_erase_multi_sectors(struct spi_nor *nor, u64 addr, u32 len)
                        dev_vdbg(nor->dev, "erase_cmd->size = 0x%08x, erase_cmd->opcode = 0x%02x, erase_cmd->count = %u\n",
                                 cmd->size, cmd->opcode, cmd->count);
 
+                       spi_nor_unlock_and_unprep(nor);
+                       //cond_resched();
+                       schedule();
+                       ret = spi_nor_lock_and_prep(nor);
+                       if (ret)
+                               goto destroy_erase_cmd_list;
+
                        ret = spi_nor_write_enable(nor);
                        if (ret)
                                goto destroy_erase_cmd_list;
@@ -1721,6 +1728,13 @@ static int spi_nor_erase(struct mtd_info *mtd, struct erase_info *instr)
        /* "sector"-at-a-time erase */
        } else if (spi_nor_has_uniform_erase(nor)) {
                while (len) {
+                       spi_nor_unlock_and_unprep(nor);
+                       //cond_resched();
+                       schedule();
+                       ret = spi_nor_lock_and_prep(nor);
+                       if (ret)
+                               return ret;
+
                        ret = spi_nor_write_enable(nor);
                        if (ret)
                                goto erase_err;
@@ -1987,6 +2001,9 @@ static int spi_nor_write(struct mtd_info *mtd, loff_t to, size_t len,
                ssize_t written;
                loff_t addr = to + i;
 
+               spi_nor_unlock_and_unprep(nor);
+               //cond_resched();
+               schedule();
                /*
                 * If page_size is a power of two, the offset can be quickly
                 * calculated with an AND operation. On the other cases we
@@ -2005,6 +2022,10 @@ static int spi_nor_write(struct mtd_info *mtd, loff_t to, size_t len,
 
                addr = spi_nor_convert_addr(nor, addr);
 
+               ret = spi_nor_lock_and_prep(nor);
+               if (ret)
+                       return ret;
+
                ret = spi_nor_write_enable(nor);
                if (ret)
                        goto write_err;

This is s good first step but not enough, one need to support Erase Suspend(which the vast majority of flash support)
to get full responsiveness(see cfi_cmdset_0001/cfi_cmdset_0002)

> 
> Most flashes can't do any operations (except poll status register) when

Not quite, see above comment on Erase Suspend

> a program or erase operation is in progress. So the locking here makes
> sense to me. We do not want to let other tasks attempt any other
> operations until the erase is done. The only exception is some
> multi-bank flashes that can do erase on one bank and reads on other
> banks. Miquel's series [1] takes care of those (though I do not see any
> flashes using that feature yet).
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/mtd/spi-nor/core.c#n1789
> [1] https://lore.kernel.org/linux-mtd/20230328154105.448540-1-miquel.raynal@bootlin.com/
> 

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: spi-nor: excessive locking in spi_nor_erase()
  2024-04-16 18:03   ` Joakim Tjernlund
@ 2024-04-17 12:42     ` Pratyush Yadav
  2024-04-17 12:54       ` Joakim Tjernlund
  0 siblings, 1 reply; 7+ messages in thread
From: Pratyush Yadav @ 2024-04-17 12:42 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: pratyush, linux-mtd

On Tue, Apr 16 2024, Joakim Tjernlund wrote:

> On Tue, 2024-04-16 at 17:44 +0200, Pratyush Yadav wrote:
>> Hi,
>
> Hi Pratyush, thanks for replying. Se below
>
>> 
[...]
>> > 
>> > So erase locks the flash for the whole erase op which can include many sectors and I don't see
>> > any Erase Suspend handling in spi-nor like the cfi_cmdset_0001/cfi_cmdset_0002 has.
>> > 
>> > Locking can be a challenge but this seems over the top, anyone working/looking into improving this?
>> 
>> I think you should lead with what the problem you see with this and what
>> you think can be improved. Why do you think the locking is "over the
>> top"? What should be improved?
>
> Sure, we have been using NOR flash as RFS(JFFS2) for many years now and it
> is important that RFS is responsive at all times. Up til recently we used parallel NOR
> using cfi_cmdset_0001/cfi_cmdset_0002 which has worked great.
>
> QSPI is different beast and its driver not mature for RFS use. Consider when our
> sw update starts to erase Image B partition, then the flash becomes completely locked for other
> processes to use, one cannot even connect with ssh while sw update is ongoing or do an ls.

I see. I assume the rootfs is on the NOR flash. While a big erase
(spanning multiple sectors) is running, other tasks do still get to run
(since spi_nor_wait_till_ready() calls cond_resched()) but any task that
tries to access the rootfs would have to freeze since the lock is held.

Dropping and re-acquiring the lock after erasing each sector or
programming each page would let readers make progress in between. This
shouldn't be too difficult to implement I reckon. You already mostly do
it in your patch below.

Doing erase or program suspend would only make sense for flashes with
larger sectors since even one sector erase would take a long time. For
those, I suppose we could give reads priority over program or erase
operations. When a read comes in, it suspends the erase, does the read,
and then resumes it. This would be a little bit more complex to
implement.

I would suggest you try the former first and see if it already gives you
the results you need. From your patch below, it seems it does. So
perhaps just cleaning it up and turning it onto a proper patch would do
the job for you.

>
> ATM I am testing this(which makes the system much more responsive:
> diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
> index bcaae9c71d90..62fe9cb97139 100644
> --- a/drivers/mtd/spi-nor/core.c
> +++ b/drivers/mtd/spi-nor/core.c
> @@ -1635,6 +1635,13 @@ static int spi_nor_erase_multi_sectors(struct spi_nor *nor, u64 addr, u32 len)
>                         dev_vdbg(nor->dev, "erase_cmd->size = 0x%08x, erase_cmd->opcode = 0x%02x, erase_cmd->count = %u\n",
>                                  cmd->size, cmd->opcode, cmd->count);
>  
> +                       spi_nor_unlock_and_unprep(nor);
> +                       //cond_resched();
> +                       schedule();
> +                       ret = spi_nor_lock_and_prep(nor);
> +                       if (ret)
> +                               goto destroy_erase_cmd_list;
> +
>                         ret = spi_nor_write_enable(nor);
>                         if (ret)
>                                 goto destroy_erase_cmd_list;
> @@ -1721,6 +1728,13 @@ static int spi_nor_erase(struct mtd_info *mtd, struct erase_info *instr)
>         /* "sector"-at-a-time erase */
>         } else if (spi_nor_has_uniform_erase(nor)) {
>                 while (len) {
> +                       spi_nor_unlock_and_unprep(nor);
> +                       //cond_resched();
> +                       schedule();
> +                       ret = spi_nor_lock_and_prep(nor);
> +                       if (ret)
> +                               return ret;
> +
>                         ret = spi_nor_write_enable(nor);
>                         if (ret)
>                                 goto erase_err;
> @@ -1987,6 +2001,9 @@ static int spi_nor_write(struct mtd_info *mtd, loff_t to, size_t len,
>                 ssize_t written;
>                 loff_t addr = to + i;
>  
> +               spi_nor_unlock_and_unprep(nor);
> +               //cond_resched();
> +               schedule();
>                 /*
>                  * If page_size is a power of two, the offset can be quickly
>                  * calculated with an AND operation. On the other cases we
> @@ -2005,6 +2022,10 @@ static int spi_nor_write(struct mtd_info *mtd, loff_t to, size_t len,
>  
>                 addr = spi_nor_convert_addr(nor, addr);
>  
> +               ret = spi_nor_lock_and_prep(nor);
> +               if (ret)
> +                       return ret;
> +
>                 ret = spi_nor_write_enable(nor);
>                 if (ret)
>                         goto write_err;
>
> This is s good first step but not enough, one need to support Erase Suspend(which the vast majority of flash support)
> to get full responsiveness(see cfi_cmdset_0001/cfi_cmdset_0002)
>
>> 
>> Most flashes can't do any operations (except poll status register) when
>
> Not quite, see above comment on Erase Suspend

Yes, I meant in the absence of a suspend operation, which SPI NOR does
not currently do.

>
>> a program or erase operation is in progress. So the locking here makes
>> sense to me. We do not want to let other tasks attempt any other
>> operations until the erase is done. The only exception is some
>> multi-bank flashes that can do erase on one bank and reads on other
>> banks. Miquel's series [1] takes care of those (though I do not see any
>> flashes using that feature yet).
>> 
>> [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/mtd/spi-nor/core.c#n1789
>> [1] https://lore.kernel.org/linux-mtd/20230328154105.448540-1-miquel.raynal@bootlin.com/
>> 
>

-- 
Regards,
Pratyush Yadav

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: spi-nor: excessive locking in spi_nor_erase()
  2024-04-17 12:42     ` Pratyush Yadav
@ 2024-04-17 12:54       ` Joakim Tjernlund
  2024-04-19 12:58         ` Pratyush Yadav
  0 siblings, 1 reply; 7+ messages in thread
From: Joakim Tjernlund @ 2024-04-17 12:54 UTC (permalink / raw)
  To: pratyush; +Cc: linux-mtd

On Wed, 2024-04-17 at 14:42 +0200, Pratyush Yadav wrote:
> On Tue, Apr 16 2024, Joakim Tjernlund wrote:
> 
> > On Tue, 2024-04-16 at 17:44 +0200, Pratyush Yadav wrote:
> > > Hi,
> > 
> > Hi Pratyush, thanks for replying. Se below
> > 
> > > 
> [...]
> > > > 
> > > > So erase locks the flash for the whole erase op which can include many sectors and I don't see
> > > > any Erase Suspend handling in spi-nor like the cfi_cmdset_0001/cfi_cmdset_0002 has.
> > > > 
> > > > Locking can be a challenge but this seems over the top, anyone working/looking into improving this?
> > > 
> > > I think you should lead with what the problem you see with this and what
> > > you think can be improved. Why do you think the locking is "over the
> > > top"? What should be improved?
> > 
> > Sure, we have been using NOR flash as RFS(JFFS2) for many years now and it
> > is important that RFS is responsive at all times. Up til recently we used parallel NOR
> > using cfi_cmdset_0001/cfi_cmdset_0002 which has worked great.
> > 
> > QSPI is different beast and its driver not mature for RFS use. Consider when our
> > sw update starts to erase Image B partition, then the flash becomes completely locked for other
> > processes to use, one cannot even connect with ssh while sw update is ongoing or do an ls.
> 
> I see. I assume the rootfs is on the NOR flash. While a big erase

Yes, same flash chip.

> (spanning multiple sectors) is running, other tasks do still get to run
> (since spi_nor_wait_till_ready() calls cond_resched()) but any task that
> tries to access the rootfs would have to freeze since the lock is held.
> 
> Dropping and re-acquiring the lock after erasing each sector or
> programming each page would let readers make progress in between. This
> shouldn't be too difficult to implement I reckon. You already mostly do
> it in your patch below.

Yes, that is simple.

> 
> Doing erase or program suspend would only make sense for flashes with
> larger sectors since even one sector erase would take a long time. For
> those, I suppose we could give reads priority over program or erase
> operations. When a read comes in, it suspends the erase, does the read,
> and then resumes it. This would be a little bit more complex to
> implement.

Right, a bit harder but I think it is needed like cfi_cmdset_0001 do: suspend erase to do read/write
Don't think cfi_cmdset suspends writes, that seems overkill.
> 
> I would suggest you try the former first and see if it already gives you
> the results you need. From your patch below, it seems it does. So
> perhaps just cleaning it up and turning it onto a proper patch would do
> the job for you.

Tests with my patch shows vastly improved responsivity but there is still some
way to go. Consider that an erase takes some 0.20-0.25 secs for us in best case in
which apps are blocked. We do get some complaints from the app taking too long to complete
some tasks during erase. 

> 
> > 
> > ATM I am testing this(which makes the system much more responsive:
> > diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
> > index bcaae9c71d90..62fe9cb97139 100644
> > --- a/drivers/mtd/spi-nor/core.c
> > +++ b/drivers/mtd/spi-nor/core.c
> > @@ -1635,6 +1635,13 @@ static int spi_nor_erase_multi_sectors(struct spi_nor *nor, u64 addr, u32 len)
> >                         dev_vdbg(nor->dev, "erase_cmd->size = 0x%08x, erase_cmd->opcode = 0x%02x, erase_cmd->count = %u\n",
> >                                  cmd->size, cmd->opcode, cmd->count);
> >  
> > +                       spi_nor_unlock_and_unprep(nor);
> > +                       //cond_resched();
> > +                       schedule();
> > +                       ret = spi_nor_lock_and_prep(nor);
> > +                       if (ret)
> > +                               goto destroy_erase_cmd_list;
> > +
> >                         ret = spi_nor_write_enable(nor);
> >                         if (ret)
> >                                 goto destroy_erase_cmd_list;
> > @@ -1721,6 +1728,13 @@ static int spi_nor_erase(struct mtd_info *mtd, struct erase_info *instr)
> >         /* "sector"-at-a-time erase */
> >         } else if (spi_nor_has_uniform_erase(nor)) {
> >                 while (len) {
> > +                       spi_nor_unlock_and_unprep(nor);
> > +                       //cond_resched();
> > +                       schedule();
> > +                       ret = spi_nor_lock_and_prep(nor);
> > +                       if (ret)
> > +                               return ret;
> > +
> >                         ret = spi_nor_write_enable(nor);
> >                         if (ret)
> >                                 goto erase_err;
> > @@ -1987,6 +2001,9 @@ static int spi_nor_write(struct mtd_info *mtd, loff_t to, size_t len,
> >                 ssize_t written;
> >                 loff_t addr = to + i;
> >  
> > +               spi_nor_unlock_and_unprep(nor);
> > +               //cond_resched();
> > +               schedule();
> >                 /*
> >                  * If page_size is a power of two, the offset can be quickly
> >                  * calculated with an AND operation. On the other cases we
> > @@ -2005,6 +2022,10 @@ static int spi_nor_write(struct mtd_info *mtd, loff_t to, size_t len,
> >  
> >                 addr = spi_nor_convert_addr(nor, addr);
> >  
> > +               ret = spi_nor_lock_and_prep(nor);
> > +               if (ret)
> > +                       return ret;
> > +
> >                 ret = spi_nor_write_enable(nor);
> >                 if (ret)
> >                         goto write_err;
> > 
> > This is s good first step but not enough, one need to support Erase Suspend(which the vast majority of flash support)
> > to get full responsiveness(see cfi_cmdset_0001/cfi_cmdset_0002)
> > 
> > > 
> > > Most flashes can't do any operations (except poll status register) when
> > 
> > Not quite, see above comment on Erase Suspend
> 
> Yes, I meant in the absence of a suspend operation, which SPI NOR does
> not currently do.
> 
> > 
> > > a program or erase operation is in progress. So the locking here makes
> > > sense to me. We do not want to let other tasks attempt any other
> > > operations until the erase is done. The only exception is some
> > > multi-bank flashes that can do erase on one bank and reads on other
> > > banks. Miquel's series [1] takes care of those (though I do not see any
> > > flashes using that feature yet).
> > > 
> > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/mtd/spi-nor/core.c#n1789
> > > [1] https://lore.kernel.org/linux-mtd/20230328154105.448540-1-miquel.raynal@bootlin.com/
> > > 
> > 
> 

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: spi-nor: excessive locking in spi_nor_erase()
  2024-04-17 12:54       ` Joakim Tjernlund
@ 2024-04-19 12:58         ` Pratyush Yadav
  2024-04-19 13:14           ` Joakim Tjernlund
  0 siblings, 1 reply; 7+ messages in thread
From: Pratyush Yadav @ 2024-04-19 12:58 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: pratyush, linux-mtd

On Wed, Apr 17 2024, Joakim Tjernlund wrote:

[...]
>> (spanning multiple sectors) is running, other tasks do still get to run
>> (since spi_nor_wait_till_ready() calls cond_resched()) but any task that
>> tries to access the rootfs would have to freeze since the lock is held.
>> 
>> Dropping and re-acquiring the lock after erasing each sector or
>> programming each page would let readers make progress in between. This
>> shouldn't be too difficult to implement I reckon. You already mostly do
>> it in your patch below.
>
> Yes, that is simple.
>
>> 
>> Doing erase or program suspend would only make sense for flashes with
>> larger sectors since even one sector erase would take a long time. For
>> those, I suppose we could give reads priority over program or erase
>> operations. When a read comes in, it suspends the erase, does the read,
>> and then resumes it. This would be a little bit more complex to
>> implement.
>
> Right, a bit harder but I think it is needed like cfi_cmdset_0001 do: suspend erase to do read/write
> Don't think cfi_cmdset suspends writes, that seems overkill.
>> 
>> I would suggest you try the former first and see if it already gives you
>> the results you need. From your patch below, it seems it does. So
>> perhaps just cleaning it up and turning it onto a proper patch would do
>> the job for you.
>
> Tests with my patch shows vastly improved responsivity but there is still some
> way to go. Consider that an erase takes some 0.20-0.25 secs for us in best case in
> which apps are blocked. We do get some complaints from the app taking too long to complete
> some tasks during erase. 

Send some patches to fix this then :-)

[...]

-- 
Regards,
Pratyush Yadav

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: spi-nor: excessive locking in spi_nor_erase()
  2024-04-19 12:58         ` Pratyush Yadav
@ 2024-04-19 13:14           ` Joakim Tjernlund
  0 siblings, 0 replies; 7+ messages in thread
From: Joakim Tjernlund @ 2024-04-19 13:14 UTC (permalink / raw)
  To: pratyush; +Cc: linux-mtd

On Fri, 2024-04-19 at 14:58 +0200, Pratyush Yadav wrote:
> On Wed, Apr 17 2024, Joakim Tjernlund wrote:
> 
> [...]
> > > (spanning multiple sectors) is running, other tasks do still get to run
> > > (since spi_nor_wait_till_ready() calls cond_resched()) but any task that
> > > tries to access the rootfs would have to freeze since the lock is held.
> > > 
> > > Dropping and re-acquiring the lock after erasing each sector or
> > > programming each page would let readers make progress in between. This
> > > shouldn't be too difficult to implement I reckon. You already mostly do
> > > it in your patch below.
> > 
> > Yes, that is simple.
> > 
> > > 
> > > Doing erase or program suspend would only make sense for flashes with
> > > larger sectors since even one sector erase would take a long time. For
> > > those, I suppose we could give reads priority over program or erase
> > > operations. When a read comes in, it suspends the erase, does the read,
> > > and then resumes it. This would be a little bit more complex to
> > > implement.
> > 
> > Right, a bit harder but I think it is needed like cfi_cmdset_0001 do: suspend erase to do read/write
> > Don't think cfi_cmdset suspends writes, that seems overkill.
> > > 
> > > I would suggest you try the former first and see if it already gives you
> > > the results you need. From your patch below, it seems it does. So
> > > perhaps just cleaning it up and turning it onto a proper patch would do
> > > the job for you.
> > 
> > Tests with my patch shows vastly improved responsivity but there is still some
> > way to go. Consider that an erase takes some 0.20-0.25 secs for us in best case in
> > which apps are blocked. We do get some complaints from the app taking too long to complete
> > some tasks during erase. 
> 
> Send some patches to fix this then :-)
> 

Could send these for 5.15.x/6.1.x but current release is different, still want those ?
I am heading out for the weekend so next week in that case.

    Jocke
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-04-19 13:15 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-15 15:55 spi-nor: excessive locking in spi_nor_erase() Joakim Tjernlund
2024-04-16 15:44 ` Pratyush Yadav
2024-04-16 18:03   ` Joakim Tjernlund
2024-04-17 12:42     ` Pratyush Yadav
2024-04-17 12:54       ` Joakim Tjernlund
2024-04-19 12:58         ` Pratyush Yadav
2024-04-19 13:14           ` Joakim Tjernlund

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.