* [LSF/MM TOPIC][ATTEND] IOPS based ioscheduler
From: Shaohua Li @ 2012-01-31  8:16 UTC
  To: lsf-pc, linux-fsdevel, linux-scsi

Flash-based storage has its own characteristics, and CFQ has some
optimizations for it, but they are not enough. The big problem is that
CFQ doesn't drive deep queue depths, which causes poor performance in
some workloads. CFQ also isn't quite fair on fast storage (or it
sacrifices further performance to achieve fairness) because it uses
time-based accounting, which doesn't work well for the block cgroup
controller. We need something different that delivers both good
performance and good fairness.

A recent attempt is an IOPS-based I/O scheduler for flash-based
storage. It's expected to drive deep queue depths (and therefore
better performance) and to be fairer (IOPS-based accounting instead of
time-based).
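
As a minimal sketch of the accounting difference (this is not CFQ's
actual code; the names and the cost unit are made up for illustration):

	#include <linux/types.h>

	#define IOPS_SERVICE_UNIT	1000ULL	/* arbitrary cost per request */

	struct iosched_queue {
		u64 vtime;		/* virtual time used for fair ordering */
		unsigned int weight;	/* proportional weight of this queue */
	};

	/* Time-based: charge wall-clock service time, scaled by weight. */
	static void charge_time(struct iosched_queue *q, u64 service_ns)
	{
		q->vtime += service_ns / q->weight;
	}

	/*
	 * IOPS-based: charge a fixed cost per completed request.  A
	 * queue's share now depends on how many I/Os it issues, not on
	 * how long the device takes to serve them, which stays
	 * meaningful at deep queue depths where per-request service
	 * time is not observable.
	 */
	static void charge_iops(struct iosched_queue *q, unsigned int nr_completed)
	{
		q->vtime += (u64)nr_completed * IOPS_SERVICE_UNIT / q->weight;
	}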

I'd like to discuss:
 - Do we really need it? In other words, do popular real workloads
actually drive deep I/O queue depths?
 - Should we have a separate I/O scheduler for this, or merge it into
CFQ?
 - Other implementation questions, such as differentiating read/write
requests and request sizes. Unlike rotating storage, flash-based
storage usually has different costs for reads versus writes and for
different request sizes (a hypothetical cost model is sketched below).
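
To make that last point concrete, here is a hypothetical per-request
cost model; the function name and the constants are invented for
illustration and are not from any posted patch:

	#include <linux/blkdev.h>

	#define READ_COST		100ULL
	#define WRITE_COST		150ULL	/* writes are often costlier on flash */
	#define BYTES_PER_COST_UNIT	4096ULL

	static u64 request_cost(struct request *rq)
	{
		/* reads and writes get different base costs */
		u64 cost = (rq_data_dir(rq) == WRITE) ? WRITE_COST : READ_COST;

		/* charge extra in proportion to request size */
		cost += blk_rq_bytes(rq) / BYTES_PER_COST_UNIT;
		return cost;
	}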

Thanks,
Shaohua



* Re: [LSF/MM TOPIC][ATTEND] IOPS based ioscheduler
From: Jeff Moyer @ 2012-01-31 18:12 UTC
  To: Shaohua Li; +Cc: lsf-pc, linux-fsdevel, linux-scsi

Shaohua Li <shaohua.li@intel.com> writes:

> Flash-based storage has its own characteristics, and CFQ has some
> optimizations for it, but they are not enough. The big problem is that
> CFQ doesn't drive deep queue depths, which causes poor performance in
> some workloads. CFQ also isn't quite fair on fast storage (or it
> sacrifices further performance to achieve fairness) because it uses
> time-based accounting, which doesn't work well for the block cgroup
> controller. We need something different that delivers both good
> performance and good fairness.
>
> A recent attempt is an IOPS-based I/O scheduler for flash-based
> storage. It's expected to drive deep queue depths (and therefore
> better performance) and to be fairer (IOPS-based accounting instead of
> time-based).
>
> I'd like to discuss:
>  - Do we really need it? In other words, do popular real workloads
> actually drive deep I/O queue depths?
>  - Should we have a separate I/O scheduler for this, or merge it into
> CFQ?
>  - Other implementation questions, such as differentiating read/write
> requests and request sizes. Unlike rotating storage, flash-based
> storage usually has different costs for reads versus writes and for
> different request sizes.

I think you need to define a couple things to really gain traction.
First, what is the target?  Flash storage comes in many varieties, from
really poor performance to really, really fast.  Are you aiming to
address all of them?  If so, then let's see some numbers that prove that
you're basing your scheduling decisions on the right metrics for the
target storage device types.

Second, demonstrate how one workload can negatively affect another.  In
other words, justify the need for *any* I/O prioritization.  Building on
that, you'd have to show that you can't achieve your goals with existing
solutions, like deadline or noop with bandwidth control.  Proportional
weight I/O scheduling is often sub-optimal when the device is not kept
busy.  How will you address that?
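
For reference, the existing solution mentioned above can be set up
entirely from user space.  A sketch using the sysfs scheduler switch
and the cgroup-v1 blkio throttling interface; the device (sda, 8:0),
the cgroup name "grp", the mount point and the limit are assumptions:

	#include <stdio.h>

	static void write_str(const char *path, const char *val)
	{
		FILE *f = fopen(path, "w");

		if (!f) {
			perror(path);
			return;
		}
		fputs(val, f);
		fclose(f);
	}

	int main(void)
	{
		/* use the deadline elevator instead of cfq on sda */
		write_str("/sys/block/sda/queue/scheduler", "deadline");

		/* cap reads from device 8:0 at 100 MB/s for cgroup "grp" */
		write_str("/sys/fs/cgroup/blkio/grp/blkio.throttle.read_bps_device",
			  "8:0 104857600");
		return 0;
	}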

Cheers,
Jeff


* Re: [LSF/MM TOPIC][ATTEND] IOPS based ioscheduler
From: Shaohua Li @ 2012-02-01  7:03 UTC
  To: Jeff Moyer; +Cc: lsf-pc, linux-fsdevel, linux-scsi

On Tue, 2012-01-31 at 13:12 -0500, Jeff Moyer wrote:
> Shaohua Li <shaohua.li@intel.com> writes:
> 
> > Flash-based storage has its own characteristics, and CFQ has some
> > optimizations for it, but they are not enough. The big problem is that
> > CFQ doesn't drive deep queue depths, which causes poor performance in
> > some workloads. CFQ also isn't quite fair on fast storage (or it
> > sacrifices further performance to achieve fairness) because it uses
> > time-based accounting, which doesn't work well for the block cgroup
> > controller. We need something different that delivers both good
> > performance and good fairness.
> >
> > A recent attempt is an IOPS-based I/O scheduler for flash-based
> > storage. It's expected to drive deep queue depths (and therefore
> > better performance) and to be fairer (IOPS-based accounting instead
> > of time-based).
> >
> > I'd like to discuss:
> >  - Do we really need it? In other words, do popular real workloads
> > actually drive deep I/O queue depths?
> >  - Should we have a separate I/O scheduler for this, or merge it into
> > CFQ?
> >  - Other implementation questions, such as differentiating read/write
> > requests and request sizes. Unlike rotating storage, flash-based
> > storage usually has different costs for reads versus writes and for
> > different request sizes.
> 
> I think you need to define a couple things to really gain traction.
> First, what is the target?  Flash storage comes in many varieties, from
> really poor performance to really, really fast.  Are you aiming to
> address all of them?  If so, then let's see some numbers that prove that
> you're basing your scheduling decisions on the right metrics for the
> target storage device types.
The target is fast storage, like SSDs or PCIe flash cards.

> Second, demonstrate how one workload can negatively affect another.  In
> other words, justify the need for *any* I/O prioritization.  Building on
> that, you'd have to show that you can't achieve your goals with existing
> solutions, like deadline or noop with bandwidth control.
Basically, workloads running under cgroups. Bandwidth control doesn't
cover all the requirements of cgroup users; that's why we have cgroup
support in CFQ in the first place.

>   Proportional
> weight I/O scheduling is often sub-optimal when the device is not kept
> busy.  How will you address that?
That's true. I chose better performance over better fairness when the
device isn't busy. Fast flash storage is expensive, so I think
performance is more important in that case.

Thanks,
Shaohua



* Re: [Lsf-pc] [LSF/MM TOPIC][ATTEND] IOPS based ioscheduler
From: Vivek Goyal @ 2012-02-01 18:54 UTC
  To: Shaohua Li; +Cc: Jeff Moyer, linux-fsdevel, lsf-pc, linux-scsi

On Wed, Feb 01, 2012 at 03:03:11PM +0800, Shaohua Li wrote:
> On Tue, 2012-01-31 at 13:12 -0500, Jeff Moyer wrote:
> > Shaohua Li <shaohua.li@intel.com> writes:
> > 
> > > Flash-based storage has its own characteristics, and CFQ has some
> > > optimizations for it, but they are not enough. The big problem is that
> > > CFQ doesn't drive deep queue depths, which causes poor performance in
> > > some workloads. CFQ also isn't quite fair on fast storage (or it
> > > sacrifices further performance to achieve fairness) because it uses
> > > time-based accounting, which doesn't work well for the block cgroup
> > > controller. We need something different that delivers both good
> > > performance and good fairness.
> > >
> > > A recent attempt is an IOPS-based I/O scheduler for flash-based
> > > storage. It's expected to drive deep queue depths (and therefore
> > > better performance) and to be fairer (IOPS-based accounting instead
> > > of time-based).
> > >
> > > I'd like to discuss:
> > >  - Do we really need it? In other words, do popular real workloads
> > > actually drive deep I/O queue depths?
> > >  - Should we have a separate I/O scheduler for this, or merge it into
> > > CFQ?
> > >  - Other implementation questions, such as differentiating read/write
> > > requests and request sizes. Unlike rotating storage, flash-based
> > > storage usually has different costs for reads versus writes and for
> > > different request sizes.
> > 
> > I think you need to define a couple things to really gain traction.
> > First, what is the target?  Flash storage comes in many varieties, from
> > really poor performance to really, really fast.  Are you aiming to
> > address all of them?  If so, then let's see some numbers that prove that
> > you're basing your scheduling decisions on the right metrics for the
> > target storage device types.
> The target is fast storage, like SSDs or PCIe flash cards.

PCIe flash cards can drive really deep queue depths to achieve optimal
performance. IIRC, we have driven queue depths of 512 or even more. If
that's the case, then there might not be much point in the IO scheduler
trying to provide per-process fairness. Deadline doing batches of reads
and writes might be just enough.

> 
> > Second, demonstrate how one workload can negatively affect another.  In
> > other words, justify the need for *any* I/O prioritization.  Building on
> > that, you'd have to show that you can't achieve your goals with existing
> > solutions, like deadline or noop with bandwidth control.
> Basically, workloads running under cgroups. Bandwidth control doesn't
> cover all the requirements of cgroup users; that's why we have cgroup
> support in CFQ in the first place.

What requirements are not covered? If you are just looking for fairness
among cgroups, CFQ already has an iops mode for groups.

> 
> >   Proportional
> > weight I/O scheduling is often sub-optimal when the device is not kept
> > busy.  How will you address that?
> That's true. I chose better performance over better fairness when the
> device isn't busy. Fast flash storage is expensive, so I think
> performance is more important in that case.

How do you decide whether the drive is being utilized to capacity?
Looking at queue depth by itself is not sufficient. On flash-based PCIe
devices we have noticed that driving deeper queue depths helped
throughput, so just looking at some arbitrary number of requests in
flight to determine whether the drive is fully used is not a very good
idea.
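
As an illustration of the pitfall, the kind of naive busy check being
warned against might look like this (the threshold and the helper name
are made up; the in_flight counters are as in kernels of that era):

	#include <linux/blkdev.h>

	#define BUSY_DEPTH	32	/* arbitrary; far below a PCIe card's optimum */

	static bool queue_looks_busy(struct request_queue *q)
	{
		/*
		 * On a card that only saturates at a queue depth of
		 * several hundred, "some requests in flight" does not
		 * mean the device is near capacity, so trading fairness
		 * for performance on this signal alone fires too early.
		 */
		return q->in_flight[READ] + q->in_flight[WRITE] >= BUSY_DEPTH;
	}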

I agree with Jeff that we probably first need some real workload
examples and numbers to justify the need for an IOPS-based scheduler.
Once we are convinced that we need it, the discussion can move to the
next level, where we try to figure out whether to extend CFQ to handle
that mode or to write a new IO scheduler altogether.

Thanks
Vivek

