All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] lightnvm: pblk: prevent stall due to wb threshold
@ 2019-01-25 10:09 Javier González
  2019-01-25 13:08 ` Matias Bjørling
  0 siblings, 1 reply; 4+ messages in thread
From: Javier González @ 2019-01-25 10:09 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Javier González

In order to respect mw_cuinits, pblk's write buffer maintains a
backpointer to protect data not yet persisted; when writing to the write
buffer, this backpointer defines a threshold that pblk's rate-limiter
enforces.

On small PU configurations, the following scenarios might take place: (i)
the threshold is larger than the write buffer and (ii) the threshold is
smaller than the write buffer, but larger than the maximun allowed
split bio - 256KB at this moment (Note that writes are not always
split - we only do this when we the size of the buffer is smaller
than the buffer). In both cases, pblk's rate-limiter prevents the I/O to
be written to the buffer, thus stalling.

This patch fixes the original backpointer implementation by considering
the threshold both on buffer creation and on the rate-limiters path,
when bio_split is triggered (case (ii) above).

Fixes: 766c8ceb16fc ("lightnvm: pblk: guarantee that backpointer is respected on writer stall")
Signed-off-by: Javier González <javier@javigon.com>
---
 drivers/lightnvm/pblk-rb.c | 25 +++++++++++++++++++------
 drivers/lightnvm/pblk-rl.c |  5 ++---
 drivers/lightnvm/pblk.h    |  2 +-
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
index d4ca8c64ee0f..a6133b50ed9c 100644
--- a/drivers/lightnvm/pblk-rb.c
+++ b/drivers/lightnvm/pblk-rb.c
@@ -45,10 +45,23 @@ void pblk_rb_free(struct pblk_rb *rb)
 /*
  * pblk_rb_calculate_size -- calculate the size of the write buffer
  */
-static unsigned int pblk_rb_calculate_size(unsigned int nr_entries)
+static unsigned int pblk_rb_calculate_size(unsigned int nr_entries,
+					   unsigned int threshold)
 {
-	/* Alloc a write buffer that can at least fit 128 entries */
-	return (1 << max(get_count_order(nr_entries), 7));
+	unsigned int thr_sz = 1 << (get_count_order(threshold + NVM_MAX_VLBA));
+	unsigned int max_sz = max(thr_sz, nr_entries);
+	unsigned int max_io;
+
+	/* Alloc a write buffer that can (i) fit at least two split bios
+	 * (considering max I/O size NVM_MAX_VLBA, and (ii) guarantee that the
+	 * threshold will be respected
+	 */
+	max_io = (1 << max((int)(get_count_order(max_sz)),
+				(int)(get_count_order(NVM_MAX_VLBA << 1))));
+	if ((threshold + NVM_MAX_VLBA) >= max_io)
+		max_io <<= 1;
+
+	return max_io;
 }
 
 /*
@@ -67,12 +80,12 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
 	unsigned int alloc_order, order, iter;
 	unsigned int nr_entries;
 
-	nr_entries = pblk_rb_calculate_size(size);
+	nr_entries = pblk_rb_calculate_size(size, threshold);
 	entries = vzalloc(array_size(nr_entries, sizeof(struct pblk_rb_entry)));
 	if (!entries)
 		return -ENOMEM;
 
-	power_size = get_count_order(size);
+	power_size = get_count_order(nr_entries);
 	power_seg_sz = get_count_order(seg_size);
 
 	down_write(&pblk_rb_lock);
@@ -149,7 +162,7 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
 	 * Initialize rate-limiter, which controls access to the write buffer
 	 * by user and GC I/O
 	 */
-	pblk_rl_init(&pblk->rl, rb->nr_entries);
+	pblk_rl_init(&pblk->rl, rb->nr_entries, threshold);
 
 	return 0;
 }
diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
index 76116d5f78e4..f81d8f0570ef 100644
--- a/drivers/lightnvm/pblk-rl.c
+++ b/drivers/lightnvm/pblk-rl.c
@@ -207,7 +207,7 @@ void pblk_rl_free(struct pblk_rl *rl)
 	del_timer(&rl->u_timer);
 }
 
-void pblk_rl_init(struct pblk_rl *rl, int budget)
+void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold)
 {
 	struct pblk *pblk = container_of(rl, struct pblk, rl);
 	struct nvm_tgt_dev *dev = pblk->dev;
@@ -217,7 +217,6 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
 	int sec_meta, blk_meta;
 	unsigned int rb_windows;
 
-
 	/* Consider sectors used for metadata */
 	sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
 	blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
@@ -234,7 +233,7 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
 	/* To start with, all buffer is available to user I/O writers */
 	rl->rb_budget = budget;
 	rl->rb_user_max = budget;
-	rl->rb_max_io = budget >> 1;
+	rl->rb_max_io = (budget >> 1) - get_count_order(threshold);
 	rl->rb_gc_max = 0;
 	rl->rb_state = PBLK_RL_HIGH;
 
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 85e38ed62f85..752cd40e4ae6 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -924,7 +924,7 @@ int pblk_gc_sysfs_force(struct pblk *pblk, int force);
 /*
  * pblk rate limiter
  */
-void pblk_rl_init(struct pblk_rl *rl, int budget);
+void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold);
 void pblk_rl_free(struct pblk_rl *rl);
 void pblk_rl_update_rates(struct pblk_rl *rl);
 int pblk_rl_high_thrs(struct pblk_rl *rl);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] lightnvm: pblk: prevent stall due to wb threshold
  2019-01-25 10:09 [PATCH] lightnvm: pblk: prevent stall due to wb threshold Javier González
@ 2019-01-25 13:08 ` Matias Bjørling
  2019-01-29 10:15   ` Hans Holmberg
  0 siblings, 1 reply; 4+ messages in thread
From: Matias Bjørling @ 2019-01-25 13:08 UTC (permalink / raw)
  To: Javier González; +Cc: linux-block, linux-kernel, Hans Holmberg

On 1/25/19 11:09 AM, Javier González wrote:
> In order to respect mw_cuinits, pblk's write buffer maintains a
> backpointer to protect data not yet persisted; when writing to the write
> buffer, this backpointer defines a threshold that pblk's rate-limiter
> enforces.
> 
> On small PU configurations, the following scenarios might take place: (i)
> the threshold is larger than the write buffer and (ii) the threshold is
> smaller than the write buffer, but larger than the maximun allowed
> split bio - 256KB at this moment (Note that writes are not always
> split - we only do this when we the size of the buffer is smaller
> than the buffer). In both cases, pblk's rate-limiter prevents the I/O to
> be written to the buffer, thus stalling.
> 
> This patch fixes the original backpointer implementation by considering
> the threshold both on buffer creation and on the rate-limiters path,
> when bio_split is triggered (case (ii) above).
> 
> Fixes: 766c8ceb16fc ("lightnvm: pblk: guarantee that backpointer is respected on writer stall")
> Signed-off-by: Javier González <javier@javigon.com>
> ---
>   drivers/lightnvm/pblk-rb.c | 25 +++++++++++++++++++------
>   drivers/lightnvm/pblk-rl.c |  5 ++---
>   drivers/lightnvm/pblk.h    |  2 +-
>   3 files changed, 22 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
> index d4ca8c64ee0f..a6133b50ed9c 100644
> --- a/drivers/lightnvm/pblk-rb.c
> +++ b/drivers/lightnvm/pblk-rb.c
> @@ -45,10 +45,23 @@ void pblk_rb_free(struct pblk_rb *rb)
>   /*
>    * pblk_rb_calculate_size -- calculate the size of the write buffer
>    */
> -static unsigned int pblk_rb_calculate_size(unsigned int nr_entries)
> +static unsigned int pblk_rb_calculate_size(unsigned int nr_entries,
> +					   unsigned int threshold)
>   {
> -	/* Alloc a write buffer that can at least fit 128 entries */
> -	return (1 << max(get_count_order(nr_entries), 7));
> +	unsigned int thr_sz = 1 << (get_count_order(threshold + NVM_MAX_VLBA));
> +	unsigned int max_sz = max(thr_sz, nr_entries);
> +	unsigned int max_io;
> +
> +	/* Alloc a write buffer that can (i) fit at least two split bios
> +	 * (considering max I/O size NVM_MAX_VLBA, and (ii) guarantee that the
> +	 * threshold will be respected
> +	 */
> +	max_io = (1 << max((int)(get_count_order(max_sz)),
> +				(int)(get_count_order(NVM_MAX_VLBA << 1))));
> +	if ((threshold + NVM_MAX_VLBA) >= max_io)
> +		max_io <<= 1;
> +
> +	return max_io;
>   }
>   
>   /*
> @@ -67,12 +80,12 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
>   	unsigned int alloc_order, order, iter;
>   	unsigned int nr_entries;
>   
> -	nr_entries = pblk_rb_calculate_size(size);
> +	nr_entries = pblk_rb_calculate_size(size, threshold);
>   	entries = vzalloc(array_size(nr_entries, sizeof(struct pblk_rb_entry)));
>   	if (!entries)
>   		return -ENOMEM;
>   
> -	power_size = get_count_order(size);
> +	power_size = get_count_order(nr_entries);
>   	power_seg_sz = get_count_order(seg_size);
>   
>   	down_write(&pblk_rb_lock);
> @@ -149,7 +162,7 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
>   	 * Initialize rate-limiter, which controls access to the write buffer
>   	 * by user and GC I/O
>   	 */
> -	pblk_rl_init(&pblk->rl, rb->nr_entries);
> +	pblk_rl_init(&pblk->rl, rb->nr_entries, threshold);
>   
>   	return 0;
>   }
> diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
> index 76116d5f78e4..f81d8f0570ef 100644
> --- a/drivers/lightnvm/pblk-rl.c
> +++ b/drivers/lightnvm/pblk-rl.c
> @@ -207,7 +207,7 @@ void pblk_rl_free(struct pblk_rl *rl)
>   	del_timer(&rl->u_timer);
>   }
>   
> -void pblk_rl_init(struct pblk_rl *rl, int budget)
> +void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold)
>   {
>   	struct pblk *pblk = container_of(rl, struct pblk, rl);
>   	struct nvm_tgt_dev *dev = pblk->dev;
> @@ -217,7 +217,6 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
>   	int sec_meta, blk_meta;
>   	unsigned int rb_windows;
>   
> -
>   	/* Consider sectors used for metadata */
>   	sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
>   	blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
> @@ -234,7 +233,7 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
>   	/* To start with, all buffer is available to user I/O writers */
>   	rl->rb_budget = budget;
>   	rl->rb_user_max = budget;
> -	rl->rb_max_io = budget >> 1;
> +	rl->rb_max_io = (budget >> 1) - get_count_order(threshold);
>   	rl->rb_gc_max = 0;
>   	rl->rb_state = PBLK_RL_HIGH;
>   
> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
> index 85e38ed62f85..752cd40e4ae6 100644
> --- a/drivers/lightnvm/pblk.h
> +++ b/drivers/lightnvm/pblk.h
> @@ -924,7 +924,7 @@ int pblk_gc_sysfs_force(struct pblk *pblk, int force);
>   /*
>    * pblk rate limiter
>    */
> -void pblk_rl_init(struct pblk_rl *rl, int budget);
> +void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold);
>   void pblk_rl_free(struct pblk_rl *rl);
>   void pblk_rl_update_rates(struct pblk_rl *rl);
>   int pblk_rl_high_thrs(struct pblk_rl *rl);
> 

+ Hans

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] lightnvm: pblk: prevent stall due to wb threshold
  2019-01-25 13:08 ` Matias Bjørling
@ 2019-01-29 10:15   ` Hans Holmberg
  2019-01-29 10:58     ` Javier González
  0 siblings, 1 reply; 4+ messages in thread
From: Hans Holmberg @ 2019-01-29 10:15 UTC (permalink / raw)
  To: Javier González, Matias Bjørling
  Cc: linux-block, Linux Kernel Mailing List, Hans Holmberg

On Fri, Jan 25, 2019 at 2:08 PM Matias Bjørling <mb@lightnvm.io> wrote:
>
> On 1/25/19 11:09 AM, Javier González wrote:
> > In order to respect mw_cuinits, pblk's write buffer maintains a
> > backpointer to protect data not yet persisted; when writing to the write
> > buffer, this backpointer defines a threshold that pblk's rate-limiter
> > enforces.
> >
> > On small PU configurations, the following scenarios might take place: (i)
> > the threshold is larger than the write buffer and (ii) the threshold is
> > smaller than the write buffer, but larger than the maximun allowed
> > split bio - 256KB at this moment (Note that writes are not always
> > split - we only do this when we the size of the buffer is smaller
> > than the buffer). In both cases, pblk's rate-limiter prevents the I/O to
> > be written to the buffer, thus stalling.

I'll start a regression test run.
Do you have a way of reproducing these issues? It would be great to
add regression test(s) covering the two cases you list.

A general comment on the write buffer and rate limiter:  What is the
rationale of rounding up the write buffer size to a power of two?
We can end up allocating almost twice as much memory as we need, and
all this power-of-two-arithmetic makes the code hard to read.

> > This patch fixes the original backpointer implementation by considering
> > the threshold both on buffer creation and on the rate-limiters path,
> > when bio_split is triggered (case (ii) above).
> >
> > Fixes: 766c8ceb16fc ("lightnvm: pblk: guarantee that backpointer is respected on writer stall")
> > Signed-off-by: Javier González <javier@javigon.com>
> > ---
> >   drivers/lightnvm/pblk-rb.c | 25 +++++++++++++++++++------
> >   drivers/lightnvm/pblk-rl.c |  5 ++---
> >   drivers/lightnvm/pblk.h    |  2 +-
> >   3 files changed, 22 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
> > index d4ca8c64ee0f..a6133b50ed9c 100644
> > --- a/drivers/lightnvm/pblk-rb.c
> > +++ b/drivers/lightnvm/pblk-rb.c
> > @@ -45,10 +45,23 @@ void pblk_rb_free(struct pblk_rb *rb)
> >   /*
> >    * pblk_rb_calculate_size -- calculate the size of the write buffer
> >    */
> > -static unsigned int pblk_rb_calculate_size(unsigned int nr_entries)
> > +static unsigned int pblk_rb_calculate_size(unsigned int nr_entries,
> > +                                        unsigned int threshold)
> >   {
> > -     /* Alloc a write buffer that can at least fit 128 entries */
> > -     return (1 << max(get_count_order(nr_entries), 7));
> > +     unsigned int thr_sz = 1 << (get_count_order(threshold + NVM_MAX_VLBA));
> > +     unsigned int max_sz = max(thr_sz, nr_entries);
> > +     unsigned int max_io;
> > +
> > +     /* Alloc a write buffer that can (i) fit at least two split bios
> > +      * (considering max I/O size NVM_MAX_VLBA, and (ii) guarantee that the
> > +      * threshold will be respected
> > +      */
> > +     max_io = (1 << max((int)(get_count_order(max_sz)),
> > +                             (int)(get_count_order(NVM_MAX_VLBA << 1))));
> > +     if ((threshold + NVM_MAX_VLBA) >= max_io)
> > +             max_io <<= 1;
> > +
> > +     return max_io;
> >   }
> >
> >   /*
> > @@ -67,12 +80,12 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
> >       unsigned int alloc_order, order, iter;
> >       unsigned int nr_entries;
> >
> > -     nr_entries = pblk_rb_calculate_size(size);
> > +     nr_entries = pblk_rb_calculate_size(size, threshold);
> >       entries = vzalloc(array_size(nr_entries, sizeof(struct pblk_rb_entry)));
> >       if (!entries)
> >               return -ENOMEM;
> >
> > -     power_size = get_count_order(size);
> > +     power_size = get_count_order(nr_entries);
> >       power_seg_sz = get_count_order(seg_size);
> >
> >       down_write(&pblk_rb_lock);
> > @@ -149,7 +162,7 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
> >        * Initialize rate-limiter, which controls access to the write buffer
> >        * by user and GC I/O
> >        */
> > -     pblk_rl_init(&pblk->rl, rb->nr_entries);
> > +     pblk_rl_init(&pblk->rl, rb->nr_entries, threshold);
> >
> >       return 0;
> >   }
> > diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
> > index 76116d5f78e4..f81d8f0570ef 100644
> > --- a/drivers/lightnvm/pblk-rl.c
> > +++ b/drivers/lightnvm/pblk-rl.c
> > @@ -207,7 +207,7 @@ void pblk_rl_free(struct pblk_rl *rl)
> >       del_timer(&rl->u_timer);
> >   }
> >
> > -void pblk_rl_init(struct pblk_rl *rl, int budget)
> > +void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold)
> >   {
> >       struct pblk *pblk = container_of(rl, struct pblk, rl);
> >       struct nvm_tgt_dev *dev = pblk->dev;
> > @@ -217,7 +217,6 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
> >       int sec_meta, blk_meta;
> >       unsigned int rb_windows;
> >
> > -
> >       /* Consider sectors used for metadata */
> >       sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
> >       blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
> > @@ -234,7 +233,7 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
> >       /* To start with, all buffer is available to user I/O writers */
> >       rl->rb_budget = budget;
> >       rl->rb_user_max = budget;
> > -     rl->rb_max_io = budget >> 1;
> > +     rl->rb_max_io = (budget >> 1) - get_count_order(threshold);

Hmm? How can this work? get_count_order returns a bit index, and
budget is a size.

Also, what happens during garbage collection? Won't we risk stalling
anyway when GC consumes most of the budget?

For user writes to be inserted into the write buffer, they need to
pass the check pblk_rl_user_may_insert, which checks rb_user_max(which
is tuned depending on gc pressure) , not rb_max_io.

> >       rl->rb_gc_max = 0;
> >       rl->rb_state = PBLK_RL_HIGH;
> >
> > diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
> > index 85e38ed62f85..752cd40e4ae6 100644
> > --- a/drivers/lightnvm/pblk.h
> > +++ b/drivers/lightnvm/pblk.h
> > @@ -924,7 +924,7 @@ int pblk_gc_sysfs_force(struct pblk *pblk, int force);
> >   /*
> >    * pblk rate limiter
> >    */
> > -void pblk_rl_init(struct pblk_rl *rl, int budget);
> > +void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold);
> >   void pblk_rl_free(struct pblk_rl *rl);
> >   void pblk_rl_update_rates(struct pblk_rl *rl);
> >   int pblk_rl_high_thrs(struct pblk_rl *rl);
> >
>
> + Hans

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] lightnvm: pblk: prevent stall due to wb threshold
  2019-01-29 10:15   ` Hans Holmberg
@ 2019-01-29 10:58     ` Javier González
  0 siblings, 0 replies; 4+ messages in thread
From: Javier González @ 2019-01-29 10:58 UTC (permalink / raw)
  To: Hans Holmberg
  Cc: Matias Bjørling, linux-block, Linux Kernel Mailing List,
	Hans Holmberg

[-- Attachment #1: Type: text/plain, Size: 8245 bytes --]


> On 29 Jan 2019, at 11.15, Hans Holmberg <hans.ml.holmberg@owltronix.com> wrote:
> 
> On Fri, Jan 25, 2019 at 2:08 PM Matias Bjørling <mb@lightnvm.io> wrote:
>> On 1/25/19 11:09 AM, Javier González wrote:
>>> In order to respect mw_cuinits, pblk's write buffer maintains a
>>> backpointer to protect data not yet persisted; when writing to the write
>>> buffer, this backpointer defines a threshold that pblk's rate-limiter
>>> enforces.
>>> 
>>> On small PU configurations, the following scenarios might take place: (i)
>>> the threshold is larger than the write buffer and (ii) the threshold is
>>> smaller than the write buffer, but larger than the maximun allowed
>>> split bio - 256KB at this moment (Note that writes are not always
>>> split - we only do this when we the size of the buffer is smaller
>>> than the buffer). In both cases, pblk's rate-limiter prevents the I/O to
>>> be written to the buffer, thus stalling.
> 
> I'll start a regression test run.
> Do you have a way of reproducing these issues? It would be great to
> add regression test(s) covering the two cases you list.

It is easy to reproduce in qemu - simulate a drive with 8 PUs (leave the
rest as default) and then create a pblk instance using 1 or 2 LUNs. You
should see a stall without this patch.

We should add these corner cases to the suit - I was actually surprised
this had been out there for so long...

> A general comment on the write buffer and rate limiter:  What is the
> rationale of rounding up the write buffer size to a power of two?

It is much easier to manage as we can use the ring buffer helpers. There
is also dependencies on the map algorithm to this. If you want to give
it a try and relax this, it is OK by me. Anyway, this is orthogonal to
this fix.

> We can end up allocating almost twice as much memory as we need,

In the worse case this is true.

> and all this power-of-two-arithmetic makes the code hard to read.

Ring buffers and many other structures in the kernel are primarily
powers-of-2, this is not pblk specific.

> 
>>> This patch fixes the original backpointer implementation by considering
>>> the threshold both on buffer creation and on the rate-limiters path,
>>> when bio_split is triggered (case (ii) above).
>>> 
>>> Fixes: 766c8ceb16fc ("lightnvm: pblk: guarantee that backpointer is respected on writer stall")
>>> Signed-off-by: Javier González <javier@javigon.com>
>>> ---
>>>  drivers/lightnvm/pblk-rb.c | 25 +++++++++++++++++++------
>>>  drivers/lightnvm/pblk-rl.c |  5 ++---
>>>  drivers/lightnvm/pblk.h    |  2 +-
>>>  3 files changed, 22 insertions(+), 10 deletions(-)
>>> 
>>> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
>>> index d4ca8c64ee0f..a6133b50ed9c 100644
>>> --- a/drivers/lightnvm/pblk-rb.c
>>> +++ b/drivers/lightnvm/pblk-rb.c
>>> @@ -45,10 +45,23 @@ void pblk_rb_free(struct pblk_rb *rb)
>>>  /*
>>>   * pblk_rb_calculate_size -- calculate the size of the write buffer
>>>   */
>>> -static unsigned int pblk_rb_calculate_size(unsigned int nr_entries)
>>> +static unsigned int pblk_rb_calculate_size(unsigned int nr_entries,
>>> +                                        unsigned int threshold)
>>>  {
>>> -     /* Alloc a write buffer that can at least fit 128 entries */
>>> -     return (1 << max(get_count_order(nr_entries), 7));
>>> +     unsigned int thr_sz = 1 << (get_count_order(threshold + NVM_MAX_VLBA));
>>> +     unsigned int max_sz = max(thr_sz, nr_entries);
>>> +     unsigned int max_io;
>>> +
>>> +     /* Alloc a write buffer that can (i) fit at least two split bios
>>> +      * (considering max I/O size NVM_MAX_VLBA, and (ii) guarantee that the
>>> +      * threshold will be respected
>>> +      */
>>> +     max_io = (1 << max((int)(get_count_order(max_sz)),
>>> +                             (int)(get_count_order(NVM_MAX_VLBA << 1))));
>>> +     if ((threshold + NVM_MAX_VLBA) >= max_io)
>>> +             max_io <<= 1;
>>> +
>>> +     return max_io;
>>>  }
>>> 
>>>  /*
>>> @@ -67,12 +80,12 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
>>>      unsigned int alloc_order, order, iter;
>>>      unsigned int nr_entries;
>>> 
>>> -     nr_entries = pblk_rb_calculate_size(size);
>>> +     nr_entries = pblk_rb_calculate_size(size, threshold);
>>>      entries = vzalloc(array_size(nr_entries, sizeof(struct pblk_rb_entry)));
>>>      if (!entries)
>>>              return -ENOMEM;
>>> 
>>> -     power_size = get_count_order(size);
>>> +     power_size = get_count_order(nr_entries);
>>>      power_seg_sz = get_count_order(seg_size);
>>> 
>>>      down_write(&pblk_rb_lock);
>>> @@ -149,7 +162,7 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
>>>       * Initialize rate-limiter, which controls access to the write buffer
>>>       * by user and GC I/O
>>>       */
>>> -     pblk_rl_init(&pblk->rl, rb->nr_entries);
>>> +     pblk_rl_init(&pblk->rl, rb->nr_entries, threshold);
>>> 
>>>      return 0;
>>>  }
>>> diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
>>> index 76116d5f78e4..f81d8f0570ef 100644
>>> --- a/drivers/lightnvm/pblk-rl.c
>>> +++ b/drivers/lightnvm/pblk-rl.c
>>> @@ -207,7 +207,7 @@ void pblk_rl_free(struct pblk_rl *rl)
>>>      del_timer(&rl->u_timer);
>>>  }
>>> 
>>> -void pblk_rl_init(struct pblk_rl *rl, int budget)
>>> +void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold)
>>>  {
>>>      struct pblk *pblk = container_of(rl, struct pblk, rl);
>>>      struct nvm_tgt_dev *dev = pblk->dev;
>>> @@ -217,7 +217,6 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
>>>      int sec_meta, blk_meta;
>>>      unsigned int rb_windows;
>>> 
>>> -
>>>      /* Consider sectors used for metadata */
>>>      sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
>>>      blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
>>> @@ -234,7 +233,7 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
>>>      /* To start with, all buffer is available to user I/O writers */
>>>      rl->rb_budget = budget;
>>>      rl->rb_user_max = budget;
>>> -     rl->rb_max_io = budget >> 1;
>>> +     rl->rb_max_io = (budget >> 1) - get_count_order(threshold);
> 
> Hmm? How can this work? get_count_order returns a bit index, and
> budget is a size.
> 

You are right. I was playing with passing the order and forgot to remove
this. It should be
   rl-rb_max_io = budget - threshold;

Note that in the original code we allow for 2 I/Os to come in
simultaneously by dividing by 2 the max_io, but this was a silly
"optimization" that we can remove as it does not do anything.

I will fix it in V2.

> Also, what happens during garbage collection? Won't we risk stalling
> anyway when GC consumes most of the budget?
> 

This only affects the max_io size, which is shared by user and GC I/O. I
would not expect any effect on the budget by this change.

> For user writes to be inserted into the write buffer, they need to
> pass the check pblk_rl_user_may_insert, which checks rb_user_max(which
> is tuned depending on gc pressure) , not rb_max_io.

There is no tuning, just a check that there is space. Now we limit the
I/O size, which only means that the write needs to wait until the
previous I/Os have completed.

I am happy to be proven wrong if you find a regression on the tests, but
I cannot see any at this point neither on the code nor on my tests.

> 
>>>      rl->rb_gc_max = 0;
>>>      rl->rb_state = PBLK_RL_HIGH;
>>> 
>>> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
>>> index 85e38ed62f85..752cd40e4ae6 100644
>>> --- a/drivers/lightnvm/pblk.h
>>> +++ b/drivers/lightnvm/pblk.h
>>> @@ -924,7 +924,7 @@ int pblk_gc_sysfs_force(struct pblk *pblk, int force);
>>>  /*
>>>   * pblk rate limiter
>>>   */
>>> -void pblk_rl_init(struct pblk_rl *rl, int budget);
>>> +void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold);
>>>  void pblk_rl_free(struct pblk_rl *rl);
>>>  void pblk_rl_update_rates(struct pblk_rl *rl);
>>>  int pblk_rl_high_thrs(struct pblk_rl *rl);
>> 
>> + Hans

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-01-29 10:58 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-25 10:09 [PATCH] lightnvm: pblk: prevent stall due to wb threshold Javier González
2019-01-25 13:08 ` Matias Bjørling
2019-01-29 10:15   ` Hans Holmberg
2019-01-29 10:58     ` Javier González

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.