* Reasoning of exposing queue/rotational=0
@ 2017-05-04 21:24 Kai Krakow
  2017-05-05 16:11 ` Coly Li
  0 siblings, 1 reply; 10+ messages in thread
From: Kai Krakow @ 2017-05-04 21:24 UTC (permalink / raw)
  To: linux-bcache

Hello!

What's the reasoning for exposing bcache devices as being
non-rotational? Currently, it fools btrfs into using ssd allocation
scheme on the underlying harddisks which isn't really what I expected
to get. So I used a udev rule to change this:

ACTION=="add|change", KERNEL=="bcache*", ATTR{queue/rotational}="1"

Wouldn't it make more sense to set this to the same value as the
underlying backing device by default?

Because in reality, the bcache is still what the backing device is: A
rotational medium. A cache doesn't make this non-rotational.

Thoughts?

-- 
Regards,
Kai

Replies to list-only preferred.


* Re: Reasoning of exposing queue/rotational=0
  2017-05-04 21:24 Reasoning of exposing queue/rotational=0 Kai Krakow
@ 2017-05-05 16:11 ` Coly Li
  2017-05-05 17:44   ` Vojtech Pavlik
  2017-05-05 18:04   ` Kai Krakow
  0 siblings, 2 replies; 10+ messages in thread
From: Coly Li @ 2017-05-05 16:11 UTC (permalink / raw)
  To: Kai Krakow, linux-bcache

On 2017/5/5 5:24 AM, Kai Krakow wrote:
> Hello!
> 
> What's the reasoning for exposing bcache devices as being
> non-rotational? Currently, it fools btrfs into using ssd allocation
> scheme on the underlying harddisks which isn't really what I expected
> to get. So I used a udev rule to change this:
> 
> ACTION=="add|change", KERNEL=="bcache*", ATTR{queue/rotational}="1"
> 
> Wouldn't it make more sense to set this to the same value as the
> underlying backing device by default?
> 
> Because in reality, the bcache is still what the backing device is: A
> rotational medium. A cache doesn't make this non-rotational.
> 
> Thoughts?

It depends on the hit ratio. If a non-rotational device is used as the
cache and the hit ratio is high enough, the cached device simply responds
like a non-rotational device.

But yes, I feel your opinion makes sense, in the btrfs case. How about a
policy like this:


cache-device-rotational   backing-device-rotational   export-rotational
         Y                            Y                      Y
         Y                            N                      N
         N                            Y                      N
         N                            N                      N

That is, a bcache device is exposed as non-rotational device only when
all devices of cache devices and backing devices are all rotational.
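
The table above amounts to a logical AND of the two rotational flags. A
minimal sketch in shell, assuming flags are passed as 0 or 1; the function
name export_rotational is hypothetical and not part of bcache:

```sh
#!/bin/sh
# Hypothetical helper: compute the exported rotational flag from the
# cache device flag ($1) and the backing device flag ($2), each 0 or 1.
# Following the table, the result is 1 only if both inputs are 1.
export_rotational() {
    if [ "$1" = "1" ] && [ "$2" = "1" ]; then
        echo 1
    else
        echo 0
    fi
}
```

Under this policy, an SSD cache in front of an HDD (flags 0 and 1) would
still be exported as non-rotational.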

Thanks.

-- 
Coly Li


* Re: Reasoning of exposing queue/rotational=0
  2017-05-05 16:11 ` Coly Li
@ 2017-05-05 17:44   ` Vojtech Pavlik
  2017-05-05 18:23     ` Kai Krakow
  2017-05-05 19:01     ` Kai Krakow
  2017-05-05 18:04   ` Kai Krakow
  1 sibling, 2 replies; 10+ messages in thread
From: Vojtech Pavlik @ 2017-05-05 17:44 UTC (permalink / raw)
  To: Coly Li; +Cc: Kai Krakow, linux-bcache

On Sat, May 06, 2017 at 12:11:13AM +0800, Coly Li wrote:
> On 2017/5/5 5:24 AM, Kai Krakow wrote:
> > Hello!
> > 
> > What's the reasoning for exposing bcache devices as being
> > non-rotational? Currently, it fools btrfs into using ssd allocation
> > scheme on the underlying harddisks which isn't really what I expected
> > to get. So I used a udev rule to change this:
> > 
> > ACTION=="add|change", KERNEL=="bcache*", ATTR{queue/rotational}="1"
> > 
> > Wouldn't it make more sense to set this to the same value as the
> > underlying backing device by default?
> > 
> > Because in reality, the bcache is still what the backing device is: A
> > rotational medium. A cache doesn't make this non-rotational.
> > 
> > Thoughts?
> 
> It depends on the hit ratio. If a non-rotational device is used as the
> cache and the hit ratio is high enough, the cached device simply responds
> like a non-rotational device.
> 
> But yes, I feel your opinion makes sense, in the btrfs case. How about a
> policy like this:
> 
> 
> cache-device-rotational   backing-device-rotational   export-rotational
>          Y                            Y                      Y
>          Y                            N                      N
>          N                            Y                      N
>          N                            N                      N
> 
> That is, a bcache device is exposed as non-rotational device only when
> all devices of cache devices and backing devices are all rotational.

I don't think that makes much sense either - the cache device will not
be used in the pattern that the exposed bcache device is, so any choice
of access patterns by a higher level based on rotational/non-rotational
will be messed up anyway.

I think the current behavior (rotational=0) is correct in most cases.

-- 
Vojtech Pavlik
Director SuSE Labs


* Re: Reasoning of exposing queue/rotational=0
  2017-05-05 16:11 ` Coly Li
  2017-05-05 17:44   ` Vojtech Pavlik
@ 2017-05-05 18:04   ` Kai Krakow
  1 sibling, 0 replies; 10+ messages in thread
From: Kai Krakow @ 2017-05-05 18:04 UTC (permalink / raw)
  To: linux-bcache

On Sat, 6 May 2017 00:11:13 +0800,
Coly Li <i@coly.li> wrote:

> On 2017/5/5 5:24 AM, Kai Krakow wrote:
> > Hello!
> > 
> > What's the reasoning for exposing bcache devices as being
> > non-rotational? Currently, it fools btrfs into using ssd allocation
> > scheme on the underlying harddisks which isn't really what I
> > expected to get. So I used a udev rule to change this:
> > 
> > ACTION=="add|change", KERNEL=="bcache*", ATTR{queue/rotational}="1"
> > 
> > Wouldn't it make more sense to set this to the same value as the
> > underlying backing device by default?
> > 
> > Because in reality, the bcache is still what the backing device is:
> > A rotational medium. A cache doesn't make this non-rotational.
> > 
> > Thoughts?  
> 
> It depends on the hit ratio. If a non-rotational device is used as the
> cache and the hit ratio is high enough, the cached device simply responds
> like a non-rotational device.
> 
> But yes, I feel your opinion makes sense, in the btrfs case. How
> about a policy like this:
> 
> 
> cache-device-rotational   backing-device-rotational   export-rotational
>          Y                            Y                      Y
>          Y                            N                      N
>          N                            Y                      N
>          N                            N                      N

This probably makes most sense, although it won't fix my particular
situation... Because I have:

 cdev    bdev    bcache
  N   &&  Y   ==  N
 
But I'd like to have bcache == Y

Hit rate is around 70-85% for me (500GB cache on 2TB data). So your
particular reasoning makes sense, too: 80% of accesses hit the cache
which makes it behave like non-rotational in 80% of all accesses.
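
As a rough illustration of the hit-rate argument, the mean access time of
a cached device can be estimated as
hit_rate * cache_latency + (1 - hit_rate) * backing_latency. The sketch
below computes that estimate. The function name effective_latency_ms and
the latencies in the example call (0.1 ms for the SSD, 10 ms for the HDD)
are illustrative assumptions, not measurements from this thread:

```sh
#!/bin/sh
# Hypothetical helper: estimate the mean access latency in milliseconds.
# $1: cache hit rate (0..1), $2: cache latency (ms), $3: backing latency (ms)
effective_latency_ms() {
    awk -v h="$1" -v c="$2" -v b="$3" \
        'BEGIN { printf "%.2f\n", h * c + (1 - h) * b }'
}

# Example with assumed numbers: 80% hits, 0.1 ms SSD, 10 ms HDD.
effective_latency_ms 0.8 0.1 10
```

With these assumed numbers the estimate is about 2.08 ms, of which 2.0 ms
comes from the 20% of accesses that miss the cache.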

But the bcache device itself is only a transition layer, especially we
cannot set any IO scheduler for it, this is left to the lower layers.
And these correctly expose the rotational flag, and that is where I set
deadline for SSD, and cfq for HDD. I also experimented with slice_idle
= 0 on SSD with cfq but deadline gave better results.

Given that, what else could the rotational flag be used for? Currently
it's used by btrfs to select an allocation scheme. I can imagine that
other filesystems do that, too. Does the kernel base any decision on
this flag? Or anything other than allocation decisions?

Given the case of allocation decision: It makes no sense to pretend SSD
allocation through bcache as bcache block allocation is translated to
the real device and has nothing to do with the actual physical layout
of the backing device. So why pretend it is non-rotational?

Also think of discarding the cache: Now it would be clearly rotational
until cache hit rate builds up again.

Also I don't think applications should mis-interpret the bcache as
non-rotational to optimize workloads for it, because bcache is a
caching layer. It operates exactly for the purpose of optimizing those
workloads itself. Doing otherwise could work against what bcache tries
to achieve, e.g. doing lots of random IO because we pretend to be
non-rotational would push my precious cache data out of the cache for
no reason. Bcache is there to turn random IO into sequential IO - but
not for the sake of "because it can". Applications should still
optimize for rotational media even when running through bcache.

Without further clues it makes most sense to me to set
bcache.rotational = bdev.rotational.
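
From userspace, that assignment could be sketched roughly as follows. This
is only an approximation of the idea, not a kernel change: the function
name inherit_rotational is hypothetical, the backing device is assumed to
be a whole disk with its own queue/ directory, and the third argument
exists only so the sysfs location can be overridden for testing:

```sh
#!/bin/sh
# Copy the rotational flag of a backing device to a bcache device.
# $1: bcache device name (e.g. bcache0)
# $2: backing device name (e.g. sdb), assumed to be a whole disk
# $3: sysfs block directory, defaults to /sys/block
inherit_rotational() {
    sys="${3:-/sys/block}"
    src="$sys/$2/queue/rotational"
    dst="$sys/$1/queue/rotational"
    # Give up if either attribute is missing or not accessible.
    [ -r "$src" ] && [ -w "$dst" ] || return 1
    cat "$src" > "$dst"
}
```

Run as root, a call such as "inherit_rotational bcache0 sdb" would then
have the same effect as the udev rule whenever sdb reports rotational=1.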

I almost think that bcache does not explicitly set this flag, so it
stays 0. I think the same applies to iscsi and other network block
devices which pretend to be also non-rotational although in reality
they probably aren't. Only, they should probably explicitly not use an
IO scheduler as that is best left to the host system - as in virtual
guests, and as with enterprise RAID controllers, which do their own IO
scheduling. But "rotational" is totally not a decision we would
automatically select a default IO scheduler by. This should be left to
layers that more exactly know what a device is, e.g. udev or the
administrator.

> That is, a bcache device is exposed as non-rotational device only when
> all devices of cache devices and backing devices are all rotational.

I didn't really get that sentence... Either appearance of rotational
seems to be wrong in your sentence. ;-)


-- 
Regards,
Kai

Replies to list-only preferred.


* Re: Reasoning of exposing queue/rotational=0
  2017-05-05 17:44   ` Vojtech Pavlik
@ 2017-05-05 18:23     ` Kai Krakow
  2017-05-05 19:02       ` Vojtech Pavlik
  2017-05-05 19:01     ` Kai Krakow
  1 sibling, 1 reply; 10+ messages in thread
From: Kai Krakow @ 2017-05-05 18:23 UTC (permalink / raw)
  To: linux-bcache

On Fri, 5 May 2017 19:44:39 +0200,
Vojtech Pavlik <vojtech@suse.com> wrote:

> On Sat, May 06, 2017 at 12:11:13AM +0800, Coly Li wrote:
> > On 2017/5/5 5:24 AM, Kai Krakow wrote:
> > > Hello!
> > > 
> > > What's the reasoning for exposing bcache devices as being
> > > non-rotational? Currently, it fools btrfs into using ssd
> > > allocation scheme on the underlying harddisks which isn't really
> > > what I expected to get. So I used a udev rule to change this:
> > > 
> > > ACTION=="add|change", KERNEL=="bcache*",
> > > ATTR{queue/rotational}="1"
> > > 
> > > Wouldn't it make more sense to set this to the same value as the
> > > underlying backing device by default?
> > > 
> > > Because in reality, the bcache is still what the backing device
> > > is: A rotational medium. A cache doesn't make this non-rotational.
> > > 
> > > Thoughts?  
> > 
> > It depends on the hit ratio. If a non-rotational device is used as
> > the cache and the hit ratio is high enough, the cached device simply
> > responds like a non-rotational device.
> > 
> > But yes, I feel your opinion makes sense, in the btrfs case. How
> > about a policy like this:
> > 
> > 
> > cache-device-rotational   backing-device-rotational   export-rotational
> >          Y                            Y                      Y
> >          Y                            N                      N
> >          N                            Y                      N
> >          N                            N                      N
> > 
> > That is, a bcache device is exposed as non-rotational device only
> > when all devices of cache devices and backing devices are all
> > rotational.  
> 
> I don't think that makes much sense either - the cache device will not
> be used in the pattern that the exposed bcache device is, so any
> choice of access patterns by a higher level based on
> rotational/non-rotational will be messed up anyway.
> 
> I think the current behavior (rotational=0) is correct in most cases.

Well, I don't want to do bikeshedding... But both didn't answer my
original question of what's the reasoning. Did anyone put thoughts into
this? Was it arbitrarily chosen? Is rotational=0 just a default that
bcache didn't bother to explicitly set?

Answering the last two questions with "yes" would suggest that it should
be rethought...

Answering the first with "yes" means I'd like to know more. ;-)


-- 
Regards,
Kai

Replies to list-only preferred.


* Re: Reasoning of exposing queue/rotational=0
  2017-05-05 17:44   ` Vojtech Pavlik
  2017-05-05 18:23     ` Kai Krakow
@ 2017-05-05 19:01     ` Kai Krakow
  1 sibling, 0 replies; 10+ messages in thread
From: Kai Krakow @ 2017-05-05 19:01 UTC (permalink / raw)
  To: linux-bcache

On Fri, 5 May 2017 19:44:39 +0200,
Vojtech Pavlik <vojtech@suse.com> wrote:

> On Sat, May 06, 2017 at 12:11:13AM +0800, Coly Li wrote:
> > On 2017/5/5 5:24 AM, Kai Krakow wrote:
> > > Hello!
> > > 
> > > What's the reasoning for exposing bcache devices as being
> > > non-rotational? Currently, it fools btrfs into using ssd
> > > allocation scheme on the underlying harddisks which isn't really
> > > what I expected to get. So I used a udev rule to change this:
> > > 
> > > ACTION=="add|change", KERNEL=="bcache*",
> > > ATTR{queue/rotational}="1"
> > > 
> > > Wouldn't it make more sense to set this to the same value as the
> > > underlying backing device by default?
> > > 
> > > Because in reality, the bcache is still what the backing device
> > > is: A rotational medium. A cache doesn't make this non-rotational.
> > > 
> > > Thoughts?  
> > 
> > It depends on the hit ratio. If a non-rotational device is used as
> > the cache and the hit ratio is high enough, the cached device simply
> > responds like a non-rotational device.
> > 
> > But yes, I feel your opinion makes sense, in the btrfs case. How
> > about a policy like this:
> > 
> > 
> > cache-device-rotational   backing-device-rotational   export-rotational
> >          Y                            Y                      Y
> >          Y                            N                      N
> >          N                            Y                      N
> >          N                            N                      N
> > 
> > That is, a bcache device is exposed as non-rotational device only
> > when all devices of cache devices and backing devices are all
> > rotational.  
> 
> I don't think that makes much sense either - the cache device will not
> be used in the pattern that the exposed bcache device is, so any
> choice of access patterns by a higher level based on
> rotational/non-rotational will be messed up anyway.

BTW: Exactly that would be the reasoning for me to not set it
statically to 0, but instead to the value of the backing device. For
example, turning it from 1 into 0 up the layers already messes with
the decisions btrfs makes.

In the end, bcache doesn't magically turn my storage into
non-rotational. It is more about turning random IO into sequential IO.
That you get higher throughput also and almost 0 seek time for cache
hits, is just a by-product (tho, a very welcome one).

Given the case that write-caching is set to write-around, or
write-through, your application would still see rotational behavior but
the flag tells it "non-rotational". That seems wrong. Only write-back
caching gives you non-rotational write behavior as seen from the
application. And when bcache passes the sequential cutoff, it doesn't
matter anyway. But now a wrong assumption about rotation can come
into play: An application could try to do sequential IO if it saw
rotational media - but now it doesn't care: It will waste wear-leveling
and push out data that really belongs in the cache.

And when reading, it behaves more like a big, permanent block cache: If
the cache is hit, that's more comparable to a hit in the page/block
cache of the kernel. If it's a miss, it still looks like rotational
access to the application.

So what's the deal?

> I think the current behavior (rotational=0) is correct in most cases.

Currently I don't see why. What defines "most cases"?


-- 
Regards,
Kai

Replies to list-only preferred.


* Re: Reasoning of exposing queue/rotational=0
  2017-05-05 18:23     ` Kai Krakow
@ 2017-05-05 19:02       ` Vojtech Pavlik
  2017-05-05 19:14         ` Kai Krakow
  0 siblings, 1 reply; 10+ messages in thread
From: Vojtech Pavlik @ 2017-05-05 19:02 UTC (permalink / raw)
  To: Kai Krakow; +Cc: linux-bcache

On Fri, May 05, 2017 at 08:23:17PM +0200, Kai Krakow wrote:
> > I don't think that makes much sense either - the cache device will not
> > be used in the pattern that the exposed bcache device is, so any
> > choice of access patterns by a higher level based on
> > rotational/non-rotational will be messed up anyway.
> > 
> > I think the current behavior (rotational=0) is correct in most cases.
> 
> Well, I don't want to do bikeshedding... But both didn't answer my
> original question of what's the reasoning. Did anyone put thoughts into
> this? 

Originally, rotational=1 is just a flag coming from the
IDE/SCSI/SATA/etc. layers to the OS telling it whether the device is
spinning or not. Without any specific implications as to the behavior of
the device.

It is writable for a reason - not even all flash based devices report
the flag correctly at the hardware level.

Linux uses the flag on the block device (queue) to tell whether seeks
are very expensive compared to linear reads and whether it makes sense
to spend large amounts of CPU cycles and memory on reordering.

Btrfs is one user that tries to change the allocation policy and thus
the likelihood of fragmentation and/or long seeks based on whether the
device reports 'rotational'.

However, it actually has three modes at the fs level: 'nossd',
'ssd' and 'ssd_spread', with the last being faster on cheaper SSDs.
There are large differences even between individual SSD profiles. Again,
for a good reason, btrfs has these as mount options that override any
'rotational' hint.
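
To see which of these modes a mounted btrfs filesystem ended up with, one
can look at its mount options, for example the fourth field of
/proc/mounts. A minimal sketch; the function name btrfs_ssd_mode is
hypothetical and it only inspects an option string passed to it:

```sh
#!/bin/sh
# Report which btrfs SSD mode appears in a comma-separated mount option
# string. Prints ssd_spread, nossd, ssd, or "unset" if none is listed.
# ssd_spread is checked first so it wins if listed together with ssd.
btrfs_ssd_mode() {
    case ",$1," in
        *,ssd_spread,*) echo ssd_spread ;;
        *,nossd,*)      echo nossd ;;
        *,ssd,*)        echo ssd ;;
        *)              echo unset ;;
    esac
}
```

An explicit override at mount time would look like
"mount -o nossd /dev/bcache0 /mnt", where /mnt is a placeholder mount
point.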

All in all, if you want all the performance available, you need to see
what works best for your workload.

The same applies to i/o schedulers. They're much less dependent on the
underlying device than the workload put on them.

This is not the first time the question has come up.

> Was it arbitrarily chosen? Is rotational=0 just a default that
> bcache didn't bother to explicitly set?

A bcache device's performance profile is neither that of a rotational
device nor that of an SSD.

Sequential reads may be bypassed or not. If not, some parts of it may
be cached, in which case there will be seeks on the backing device even
when there should be none on a real rotational device.

Random reads may be fast if they're hitting cached locations.

Random and sequential writes will be always cached if writeback is
enabled and so there is no point in spending CPU cycles on optimizing
writes.

How much the bcache device will behave like the backing device and how
much like the caching device does depend mainly on the workload and the
size of its working set compared to the size of the cache.

I do not believe that the choice of rotational=0 was arbitrary or a
default. It's simply that bcache changes the access pattern to both the
caching and backing device so much that it no longer resembles a
rotational device's performance profile in any case.

> Answering the last two questions with "yes" would suggest that it should
> be rethought...
> 
> Answering the first with "yes" means I'd like to know more. ;-)

-- 
Vojtech Pavlik
Director SuSE Labs


* Re: Reasoning of exposing queue/rotational=0
  2017-05-05 19:02       ` Vojtech Pavlik
@ 2017-05-05 19:14         ` Kai Krakow
  2017-05-09 18:11           ` Eric Wheeler
  0 siblings, 1 reply; 10+ messages in thread
From: Kai Krakow @ 2017-05-05 19:14 UTC (permalink / raw)
  To: linux-bcache

On Fri, 5 May 2017 21:02:31 +0200,
Vojtech Pavlik <vojtech@suse.com> wrote:

> On Fri, May 05, 2017 at 08:23:17PM +0200, Kai Krakow wrote:
> > > I don't think that makes much sense either - the cache device
> > > will not be used in the pattern that the exposed bcache device
> > > is, so any choice of access patterns by a higher level based on
> > > rotational/non-rotational will be messed up anyway.
> > > 
> > > I think the current behavior (rotational=0) is correct in most
> > > cases.  
> > 
> > Well, I don't want to do bikeshedding... But both didn't answer my
> > original question of what's the reasoning. Did anyone put thoughts
> > into this?   
> 
> Originally, rotational=1 is just a flag coming from the
> IDE/SCSI/SATA/etc. layers to the OS telling it whether the device is
> spinning or not. Without any specific implications as to the behavior
> of the device.
> 
> It is writable for a reason - not even all flash based devices report
> the flag correctly at the hardware level.
> 
> Linux uses the flag on the block device (queue) to tell whether seeks
> are very expensive compared to linear reads and whether it makes sense
> to spend large amounts of CPU cycles and memory on reordering.
> 
> Btrfs is one user that tries to change the allocation policy and thus
> the likelihood of fragmentation and/or long seeks based on whether the
> device reports 'rotational'.
> 
> However, it actually has three modes at the fs level: 'nossd',
> 'ssd' and 'ssd_spread', with the last being faster on cheaper SSDs.
> There are large differences even between individual SSD profiles.
> Again, for a good reason, btrfs has these as mount options that
> override any 'rotational' hint.
> 
> All in all, if you want all the performance available, you need to see
> what works best for your workload.
> 
> The same applies to i/o schedulers. They're much less dependent on the
> underlying device than the workload put on them.
> 
> This is not the first time the question comes up.

I tried to look up information about it previously but didn't come up
with useful results.

> > Was it arbitrarily chosen? Is rotational=0 just a default that
> > bcache didn't bother to explicitly set?  
> 
> A bcache device performance profile is neither one of a rotational
> device, nor one of a SSD.
> 
> Sequential reads may be bypassed or not. If not, some parts of it may
> be cached, in which case there will be seeks on the backing device
> even when there should be none on a real rotational device.
> 
> Random reads may be fast if they're hitting cached locations.
> 
> Random and sequential writes will be always cached if writeback is
> enabled and so there is no point in spending CPU cycles on optimizing
> writes.
> 
> How much the bcache device will behave like the backing device and how
> much like the caching device does depend mainly on the workload and
> the size of its working set compared to the size of the cache.
> 
> I do not believe that the choice of rotational=0 was arbitrary or a
> default. It's simply that bcache changes the access pattern to both
> the caching and backing device so much that it no longer resembles a
> rotational device's performance profile in any case.
> 
> > Answering the last two questions with "yes" would suggest that it
> > should be rethought...
> > 
> > Answering the first with "yes" means I'd like to know more. ;-)  

Okay, that answers my questions. Thanks. :-)

But that only tells me that a "default" cannot be really chosen. Both
make sense.

I wonder if Linux chose to call the flag "non_rotational", would it
also default to 0 in bcache? I think nobody would know. ;-)

For me it looks like sticking that to rotational=1 gives overall better
long-term performance and btrfs filesystem layout.

Anyone who stumbles across this should judge for themselves based on
Vojtech's good answer.


-- 
Regards,
Kai

Replies to list-only preferred.


* Re: Reasoning of exposing queue/rotational=0
  2017-05-05 19:14         ` Kai Krakow
@ 2017-05-09 18:11           ` Eric Wheeler
  2017-05-10 20:18             ` Kai Krakow
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Wheeler @ 2017-05-09 18:11 UTC (permalink / raw)
  To: Kai Krakow; +Cc: linux-bcache

On Fri, 5 May 2017, Kai Krakow wrote:

> On Fri, 5 May 2017 21:02:31 +0200,
> Vojtech Pavlik <vojtech@suse.com> wrote:
> 
> > On Fri, May 05, 2017 at 08:23:17PM +0200, Kai Krakow wrote:
> > > > I don't think that makes much sense either - the cache device
> > > > will not be used in the pattern that the exposed bcache device
> > > > is, so any choice of access patterns by a higher level based on
> > > > rotational/non-rotational will be messed up anyway.
> > > > 
> > > > I think the current behavior (rotational=0) is correct in most
> > > > cases.  
> > > 
> > > Well, I don't want to do bikeshedding... But both didn't answer my
> > > original question of what's the reasoning. Did anyone put thoughts
> > > into this?   
> > 
> > Originally, rotational=1 is just a flag coming from the
> > IDE/SCSI/SATA/etc. layers to the OS telling it whether the device is
> > spinning or not. Without any specific implications as to the behavior
> > of the device.
> > 
> > It is writable for a reason - not even all flash based devices report
> > the flag correctly at the hardware level.
> > 
> > Linux uses the flag on the block device (queue) to tell whether seeks
> > are very expensive compared to linear reads and whether it makes sense
> > to spend large amounts of CPU cycles and memory on reordering.
> > 
> > Btrfs is one user that tries to change the allocation policy and thus
> > the likelihood of fragmentation and/or long seeks based on whether the
> > device reports 'rotational'.
> > 
> > However, it actually has three modes at the fs level: 'nossd',
> > 'ssd' and 'ssd_spread', with the last being faster on cheaper SSDs.
> > There are large differences even between individual SSD profiles.
> > Again, for a good reason, btrfs has these as mount options that
> > override any 'rotational' hint.
> > 
> > All in all, if you want all the performance available, you need to see
> > what works best for your workload.
> > 
> > The same applies to i/o schedulers. They're much less dependent on the
> > underlying device than the workload put on them.
> > 
> > This is not the first time the question comes up.
> 
> I tried to look up information about it previously but didn't came up
> with useful results.
> 
> > > Was it arbitrarily chosen? Is rotational=0 just a default that
> > > bcache didn't bother to explicitly set?  
> > 
> > A bcache device performance profile is neither one of a rotational
> > device, nor one of a SSD.
> > 
> > Sequential reads may be bypassed or not. If not, some parts of it may
> > be cached, in which case there will be seeks on the backing device
> > even when there should be none on a real rotational device.
> > 
> > Random reads may be fast if they're hitting cached locations.
> > 
> > Random and sequential writes will be always cached if writeback is
> > enabled and so there is no point in spending CPU cycles on optimizing
> > writes.
> > 
> > How much the bcache device will behave like the backing device and how
> > much like the caching device does depend mainly on the workload and
> > the size of its working set compared to the size of the cache.
> > 
> > I do not believe that the choice of rotational=0 was arbitrary or a
> > default. It's simply that bcache changes the access pattern to both
> > the caching and backing device so much that it no longer resembles a
> > rotational device's performance profile in any case.
> > 
> > > Answering the last two questions with "yes" would suggest that it
> > > should be rethought...
> > > 
> > > Answering the first with "yes" means I'd like to know more. ;-)  
> 
> Okay, that answers my questions. Thanks. :-)
> 
> But that only tells me that a "default" cannot be really chosen. Both
> make sense.
> 
> I wonder if Linux chose to call the flag "non_rotational", would it
> also default to 0 in bcache? I think nobody would know. ;-)
> 
> For me it looks like sticking that to rotational=1 gives overall better
> long-time performance and btrfs filesystem layout.
> 
> Anyone who stumbles across this should judge on his own based on
> Vojtech's good answer.

Indeed!

Also note:

# cat /sys/block/bcache0/queue/scheduler 
none

There is no scheduler for bcache, so the bios pass through whatever queue
scheduler your backing (or cache) device uses, which could differ between
cache and backing. If you use hardware RAID, your 'rotational' flag is
probably wrong for SSDs, so set it on boot somehow (udev, etc.)
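
To inspect what the lower devices actually use, something like the
following could work. It is a sketch under the assumption that the bcache
device lists its cache and backing devices under a slaves/ directory in
sysfs and that they are whole disks with a queue/ directory; the function
name show_lower_queues is hypothetical:

```sh
#!/bin/sh
# Print the scheduler line and rotational flag for every device listed
# under the slaves/ directory of a bcache device.
# $1: bcache device name (e.g. bcache0)
# $2: sysfs block directory, defaults to /sys/block
show_lower_queues() {
    sys="${2:-/sys/block}"
    for dev in "$sys/$1"/slaves/*; do
        # Skip entries without a request queue (e.g. partitions).
        [ -e "$dev/queue/scheduler" ] || continue
        printf '%s scheduler=%s rotational=%s\n' \
            "$(basename "$dev")" \
            "$(cat "$dev/queue/scheduler")" \
            "$(cat "$dev/queue/rotational")"
    done
}
```

Calling "show_lower_queues bcache0" prints one line per lower device; the
active scheduler is the one shown in square brackets.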

--
Eric Wheeler





* Re: Reasoning of exposing queue/rotational=0
  2017-05-09 18:11           ` Eric Wheeler
@ 2017-05-10 20:18             ` Kai Krakow
  0 siblings, 0 replies; 10+ messages in thread
From: Kai Krakow @ 2017-05-10 20:18 UTC (permalink / raw)
  To: linux-bcache

On Tue, 9 May 2017 18:11:06 +0000 (UTC),
Eric Wheeler <bcache@lists.ewheeler.net> wrote:

> On Fri, 5 May 2017, Kai Krakow wrote:
> 
> > On Fri, 5 May 2017 21:02:31 +0200,
> > Vojtech Pavlik <vojtech@suse.com> wrote:
> >   
> > > On Fri, May 05, 2017 at 08:23:17PM +0200, Kai Krakow wrote:  
>  [...]  
>  [...]  
> > > 
> > > Originally, rotational=1 is just a flag coming from the
> > > IDE/SCSI/SATA/etc. layers to the OS telling it whether the device
> > > is spinning or not. Without any specific implications as to the
> > > behavior of the device.
> > > 
> > > It is writable for a reason - not even all flash based devices
> > > report the flag correctly at the hardware level.
> > > 
> > > Linux uses the flag on the block device (queue) to tell whether
> > > seeks are very expensive compared to linear reads and whether it
> > > > makes sense to spend large amounts of CPU cycles and memory on
> > > reordering.
> > > 
> > > Btrfs is one user that tries to change the allocation policy and
> > > thus the likelihood of fragmentation and/or long seeks based on
> > > whether the device reports 'rotational'.
> > > 
> > > However, it actually has three modes at the fs level: 'nossd',
> > > 'ssd' and 'ssd_spread', with the last being faster on cheaper
> > > SSDs. There are large differences even between individual SSD
> > > profiles. Again, for a good reason, btrfs has these as mount
> > > options that override any 'rotational' hint.
> > > 
> > > All in all, if you want all the performance available, you need
> > > to see what works best for your workload.
> > > 
> > > The same applies to i/o schedulers. They're much less dependent
> > > on the underlying device than the workload put on them.
> > > 
> > > This is not the first time the question comes up.  
> > 
> > I tried to look up information about it previously but didn't came
> > up with useful results.
> >   
>  [...]  
> > > 
> > > A bcache device performance profile is neither one of a rotational
> > > device, nor one of a SSD.
> > > 
> > > Sequential reads may be bypassed or not. If not, some parts of it
> > > may be cached, in which case there will be seeks on the backing
> > > device even when there should be none on a real rotational device.
> > > 
> > > Random reads may be fast if they're hitting cached locations.
> > > 
> > > Random and sequential writes will be always cached if writeback is
> > > enabled and so there is no point in spending CPU cycles on
> > > optimizing writes.
> > > 
> > > How much the bcache device will behave like the backing device
> > > and how much like the caching device does depend mainly on the
> > > workload and the size of its working set compared to the size of
> > > the cache.
> > > 
> > > I do not believe that the choice of rotational=0 was arbitrary or
> > > a default. It's simply that bcache changes the access pattern to
> > > both the caching and backing device so much that it no longer
> > > resembles a rotational device's performance profile in any case.
> > >   
>  [...]  
> > 
> > Okay, that answers my questions. Thanks. :-)
> > 
> > But that only tells me that a "default" cannot be really chosen.
> > Both make sense.
> > 
> > I wonder if Linux chose to call the flag "non_rotational", would it
> > also default to 0 in bcache? I think nobody would know. ;-)
> > 
> > For me it looks like sticking that to rotational=1 gives overall
> > better long-time performance and btrfs filesystem layout.
> > 
> > Anyone who stumbles across this should judge on his own based on
> > Vojtech's good answer.  
> 
> Indeed!
> 
> Also note:
> 
> # cat /sys/block/bcache0/queue/scheduler 
> none

Yes, I know that.

> There is no scheduler for bcache, so the bio's pass through whatever
> your backing (cache) device uses as a queue scheduler, which could
> differ between cache/backing.

What exactly does this mean? I understand that depending on where the
bio ends up, I'm using two different IO schedulers.

At least this is how I currently set things up: I use different
schedulers (or different scheduler settings) to exploit exactly that
behavior.
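
A per-device setup along the lines described earlier in this thread
(deadline for the SSD, cfq for the HDDs) could be sketched like this. The
function names are hypothetical, the scheduler names are the ones
mentioned in this thread and depend on the kernel version, and the second
argument of set_scheduler only exists so the sysfs location can be
overridden for testing:

```sh
#!/bin/sh
# Choose a scheduler name from a rotational flag, following the setup
# described in this thread: deadline for SSDs (0), cfq for HDDs (1).
scheduler_for() {
    if [ "$1" = "0" ]; then echo deadline; else echo cfq; fi
}

# Apply that choice to one lower-level device.
# $1: device name (e.g. sda), $2: sysfs block directory (default /sys/block)
set_scheduler() {
    sys="${2:-/sys/block}"
    q="$sys/$1/queue"
    [ -r "$q/rotational" ] || return 1
    scheduler_for "$(cat "$q/rotational")" > "$q/scheduler"
}
```

This only touches the cache and backing devices; the bcache device itself
has no scheduler to set.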

> If you use hardware RAID, your
> 'rotational' flag is probably wrong for SSDs so set it on boot
> somehow (udev, etc.)

No hardware RAID involved here... Just three plain disks and one SSD.

I'm currently using udev to force it "1" for the bcache compound device
(which is what I guess the filesystem is seeing). The underlying bdev
and cdev still have their original rotational flag set, I didn't touch
it.


-- 
Regards,
Kai

Replies to list-only preferred.
