* [patch] block: make struct block_device cacheline_aligned
@ 2012-09-19 13:50 Jeff Moyer
2012-09-19 14:59 ` Jeff Moyer
0 siblings, 1 reply; 8+ messages in thread
From: Jeff Moyer @ 2012-09-19 13:50 UTC (permalink / raw)
To: jens.axboe; +Cc: Mikulas Patocka, LKML
Hi,
When testing against a pcie ssd or a ramdisk, making the block device
structure cacheline_aligned provided a significant increase in
performance:
vanilla
READ WRITE CPU
Job Name BW IOPS msec BW IOPS msec usr sys csw
write1 0 0 0 748522 187130 44864 16.34 60.65 3799440.00
read1 690615 172653 48602 0 0 0 13.45 61.42 4044720.00
randwrite1 0 0 0 716406 179101 46839 29.03 52.79 3151140.00
randread1 683466 170866 49108 0 0 0 25.92 54.67 3081610.00
readwrite1 377518 94379 44450 377645 94410 44450 15.49 64.32 3139240.00
randrw1 355815 88953 47178 355733 88933 47178 27.96 54.24 2944570.00
patched
READ WRITE CPU
Job Name BW IOPS msec BW IOPS msec usr sys csw
write1 0 0 0 871355 217838 38508 17.49 42.46 1642870.00
read1 1418560 354639 23675 0 0 0 14.96 54.75 337489.00
randwrite1 0 0 0 736970 184242 45633 30.62 35.25 1409440.00
randread1 1065440 266359 31544 0 0 0 32.67 43.74 255394.00
readwrite1 657940 164484 25867 657848 164461 25867 18.54 50.55 619474.00
randrw1 491940 122985 34245 492014 123003 34245 34.44 41.05 418999.00
%diff
READ WRITE CPU
Job Name BW IOPS msec BW IOPS msec usr sys csw
write1 0 0 0 16 16 -14 7.04 -29.99 -56.76
read1 105 105 -51 0 0 0 11.23 -10.86 -91.66
randwrite1 0 0 0 0 0 0 5.48 -33.23 -55.27
randread1 55 55 -35 0 0 0 26.04 -19.99 -91.71
readwrite1 74 74 -41 74 74 -41 19.69 -21.41 -80.27
randrw1 38 38 -27 38 38 -27 23.18 -24.32 -85.77
BW=bandwidth in KB/s
IOPS = I/Os per second
msec = # of miliseconds the run took (lower is better)
usr = % user time
sys = % system time
csw = # of context switches
The test is doing asynchronous direct I/O to the block device using 4
processes each driving a queue depth of 1024 to a different part of the
disk. The rows, in order, are sequential write, sequential read,
random write, random read, 50% mix of sequential reads and sequential
writes, 50% mix of random reads and random writes. The block size in
all cases is 4k.
I'd appreciate it if others could verify an increase in performance with
this patch. Thanks to Mikulas for initially suggesting that the cache
size/alignment was relevant to performance.
Cheers,
Jeff
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
diff --git a/include/linux/fs.h b/include/linux/fs.h
index aa11047..87ce6ca 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -724,7 +724,7 @@ struct block_device {
int bd_fsfreeze_count;
/* Mutex for freeze */
struct mutex bd_fsfreeze_mutex;
-};
+} __cacheline_aligned;
/*
* Radix-tree tags, for tagging dirty and writeback pages within the pagecache
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [patch] block: make struct block_device cacheline_aligned
2012-09-19 13:50 [patch] block: make struct block_device cacheline_aligned Jeff Moyer
@ 2012-09-19 14:59 ` Jeff Moyer
2012-09-19 15:16 ` Mikulas Patocka
2012-09-19 15:32 ` Richard Kennedy
0 siblings, 2 replies; 8+ messages in thread
From: Jeff Moyer @ 2012-09-19 14:59 UTC (permalink / raw)
To: jaxboe; +Cc: Mikulas Patocka, LKML
Jeff Moyer <jmoyer@redhat.com> writes:
> Hi,
>
> When testing against a pcie ssd or a ramdisk, making the block device
> structure cacheline_aligned provided a significant increase in
> performance:
Self-NACK on this one. This results in a ton of warnings:
include/linux/fs.h:727: warning: ???__section__??? attribute does not
apply to types
In file included from include/linux/debugfs.h:18,
from kernel/trace/trace_probe.h:28,
from kernel/trace/trace_kprobe.c:23:
include/linux/fs.h:727: warning: ???__section__??? attribute does not
apply to types
And that leaves me with the task of figuring out if/why this actually
helps.
Cheers,
Jeff
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] block: make struct block_device cacheline_aligned
2012-09-19 14:59 ` Jeff Moyer
@ 2012-09-19 15:16 ` Mikulas Patocka
2012-09-19 15:24 ` Jeff Moyer
2012-09-19 15:32 ` Richard Kennedy
1 sibling, 1 reply; 8+ messages in thread
From: Mikulas Patocka @ 2012-09-19 15:16 UTC (permalink / raw)
To: Jeff Moyer; +Cc: jaxboe, LKML
On Wed, 19 Sep 2012, Jeff Moyer wrote:
> Jeff Moyer <jmoyer@redhat.com> writes:
>
> > Hi,
> >
> > When testing against a pcie ssd or a ramdisk, making the block device
> > structure cacheline_aligned provided a significant increase in
> > performance:
>
> Self-NACK on this one. This results in a ton of warnings:
>
> include/linux/fs.h:727: warning: ???__section__??? attribute does not
> apply to types
> In file included from include/linux/debugfs.h:18,
> from kernel/trace/trace_probe.h:28,
> from kernel/trace/trace_kprobe.c:23:
> include/linux/fs.h:727: warning: ???__section__??? attribute does not
> apply to types
>
> And that leaves me with the task of figuring out if/why this actually
> helps.
>
> Cheers,
> Jeff
Hi
Use ____cacheline_aligned instead of __cacheline_aligned
Mikulas
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] block: make struct block_device cacheline_aligned
2012-09-19 15:16 ` Mikulas Patocka
@ 2012-09-19 15:24 ` Jeff Moyer
2012-09-19 15:50 ` Mikulas Patocka
0 siblings, 1 reply; 8+ messages in thread
From: Jeff Moyer @ 2012-09-19 15:24 UTC (permalink / raw)
To: Mikulas Patocka; +Cc: jaxboe, LKML
Mikulas Patocka <mpatocka@redhat.com> writes:
> On Wed, 19 Sep 2012, Jeff Moyer wrote:
>
>> Jeff Moyer <jmoyer@redhat.com> writes:
>>
>> > Hi,
>> >
>> > When testing against a pcie ssd or a ramdisk, making the block device
>> > structure cacheline_aligned provided a significant increase in
>> > performance:
>>
>> Self-NACK on this one. This results in a ton of warnings:
>>
>> include/linux/fs.h:727: warning: ???__section__??? attribute does not
>> apply to types
>> In file included from include/linux/debugfs.h:18,
>> from kernel/trace/trace_probe.h:28,
>> from kernel/trace/trace_kprobe.c:23:
>> include/linux/fs.h:727: warning: ???__section__??? attribute does not
>> apply to types
>>
>> And that leaves me with the task of figuring out if/why this actually
>> helps.
>>
>> Cheers,
>> Jeff
>
> Hi
>
> Use ____cacheline_aligned instead of __cacheline_aligned
struct block_device is allocated as part of the bdev_inode:
struct bdev_inode {
struct block_device bdev;
struct inode vfs_inode;
};
The bdev_inode is allocated from the bdev_cachep, which uses
SLAB_HWCACHE_ALIGN. So, in theory, this should already be aligned.
-Jeff
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] block: make struct block_device cacheline_aligned
2012-09-19 14:59 ` Jeff Moyer
2012-09-19 15:16 ` Mikulas Patocka
@ 2012-09-19 15:32 ` Richard Kennedy
1 sibling, 0 replies; 8+ messages in thread
From: Richard Kennedy @ 2012-09-19 15:32 UTC (permalink / raw)
To: Jeff Moyer; +Cc: jaxboe, Mikulas Patocka, LKML
On 19/09/12 15:59, Jeff Moyer wrote:
> Jeff Moyer <jmoyer@redhat.com> writes:
>
>> Hi,
>>
>> When testing against a pcie ssd or a ramdisk, making the block device
>> structure cacheline_aligned provided a significant increase in
>> performance:
>
> Self-NACK on this one. This results in a ton of warnings:
>
> include/linux/fs.h:727: warning: ???__section__??? attribute does not
> apply to types
> In file included from include/linux/debugfs.h:18,
> from kernel/trace/trace_probe.h:28,
> from kernel/trace/trace_kprobe.c:23:
> include/linux/fs.h:727: warning: ???__section__??? attribute does not
> apply to types
>
> And that leaves me with the task of figuring out if/why this actually
> helps.
>
> Cheers,
> Jeff
>>
Hi,
block_device sits in bdev_inode which is already cache line aligned as
they're allocated in a kmem_cache with HWALIGN set.
So I wonder if swapping the order of bdev & inode in bdev_inode will help?
So it becomes this :-
struct bdev_inode {
struct inode vfs_inode;
struct block_device bdev;
};
regards
Richard
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] block: make struct block_device cacheline_aligned
2012-09-19 15:24 ` Jeff Moyer
@ 2012-09-19 15:50 ` Mikulas Patocka
2012-09-19 16:44 ` Jeff Moyer
0 siblings, 1 reply; 8+ messages in thread
From: Mikulas Patocka @ 2012-09-19 15:50 UTC (permalink / raw)
To: Jeff Moyer; +Cc: jaxboe, LKML
On Wed, 19 Sep 2012, Jeff Moyer wrote:
> Mikulas Patocka <mpatocka@redhat.com> writes:
>
> > On Wed, 19 Sep 2012, Jeff Moyer wrote:
> >
> >> Jeff Moyer <jmoyer@redhat.com> writes:
> >>
> >> > Hi,
> >> >
> >> > When testing against a pcie ssd or a ramdisk, making the block device
> >> > structure cacheline_aligned provided a significant increase in
> >> > performance:
> >>
> >> Self-NACK on this one. This results in a ton of warnings:
> >>
> >> include/linux/fs.h:727: warning: ???__section__??? attribute does not
> >> apply to types
> >> In file included from include/linux/debugfs.h:18,
> >> from kernel/trace/trace_probe.h:28,
> >> from kernel/trace/trace_kprobe.c:23:
> >> include/linux/fs.h:727: warning: ???__section__??? attribute does not
> >> apply to types
> >>
> >> And that leaves me with the task of figuring out if/why this actually
> >> helps.
> >>
> >> Cheers,
> >> Jeff
> >
> > Hi
> >
> > Use ____cacheline_aligned instead of __cacheline_aligned
>
> struct block_device is allocated as part of the bdev_inode:
>
> struct bdev_inode {
> struct block_device bdev;
> struct inode vfs_inode;
> };
>
> The bdev_inode is allocated from the bdev_cachep, which uses
> SLAB_HWCACHE_ALIGN. So, in theory, this should already be aligned.
>
> -Jeff
The purpose here is to align vfs_inode. If you add alignment to bdev,
vfs_inode would be aligned (because bdev size would be aligned to
cacheline boundary). Or you can add the alignment to vfs_inode, it would
have the same effect.
Mikulas
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] block: make struct block_device cacheline_aligned
2012-09-19 15:50 ` Mikulas Patocka
@ 2012-09-19 16:44 ` Jeff Moyer
2012-09-20 0:11 ` Mikulas Patocka
0 siblings, 1 reply; 8+ messages in thread
From: Jeff Moyer @ 2012-09-19 16:44 UTC (permalink / raw)
To: Mikulas Patocka, Richard Kennedy; +Cc: jaxboe, LKML
Mikulas Patocka <mpatocka@redhat.com> writes:
> On Wed, 19 Sep 2012, Jeff Moyer wrote:
>
>> Mikulas Patocka <mpatocka@redhat.com> writes:
>>
>> > On Wed, 19 Sep 2012, Jeff Moyer wrote:
>> >
>> >> Jeff Moyer <jmoyer@redhat.com> writes:
>> >>
>> >> > Hi,
>> >> >
>> >> > When testing against a pcie ssd or a ramdisk, making the block device
>> >> > structure cacheline_aligned provided a significant increase in
>> >> > performance:
>> >>
>> >> Self-NACK on this one. This results in a ton of warnings:
>> >>
>> >> include/linux/fs.h:727: warning: ???__section__??? attribute does not
>> >> apply to types
>> >> In file included from include/linux/debugfs.h:18,
>> >> from kernel/trace/trace_probe.h:28,
>> >> from kernel/trace/trace_kprobe.c:23:
>> >> include/linux/fs.h:727: warning: ???__section__??? attribute does not
>> >> apply to types
>> >>
>> >> And that leaves me with the task of figuring out if/why this actually
>> >> helps.
>> >>
>> >> Cheers,
>> >> Jeff
>> >
>> > Hi
>> >
>> > Use ____cacheline_aligned instead of __cacheline_aligned
>>
>> struct block_device is allocated as part of the bdev_inode:
>>
>> struct bdev_inode {
>> struct block_device bdev;
>> struct inode vfs_inode;
>> };
>>
>> The bdev_inode is allocated from the bdev_cachep, which uses
>> SLAB_HWCACHE_ALIGN. So, in theory, this should already be aligned.
>>
>> -Jeff
>
> The purpose here is to align vfs_inode.
I'm not sure I understand what you're saying. When you say, "The
purpose here," do you mean the purpose in the existing code or the
purpose of our changes? The existing code seems to want to align the
struct block_device, so I assume you mean we should instead align the
vfs_inode.
> If you add alignment to bdev, vfs_inode would be aligned (because bdev
> size would be aligned to cacheline boundary).
ITYM because the bdev size *is* aligned to a cacheline boundary (the
size is 256 in my kernel, and the cache line alignment for this cpu is
64). But, since the entire structure is already aligned by the slab
allocator, I don't see how adding ____cacheline_aligned would change
anything.
> Or you can add the alignment to vfs_inode, it would have the same
> effect.
Well, I tried the suggestion by Richard to swap the fields in the
bdev_inode, and it did not result in a huge performance gain:
%diff
READ WRITE CPU
Job Name BW IOPS msec BW IOPS msec usr sys csw
write1 0 0 0 9 9 -8 0.00 0.00 -17.18
read1 6 6 -6 0 0 0 5.87 0.00 -19.20
randwrite1 0 0 0 0 0 0 0.00 0.00 -15.46
randread1 5 5 -5 0 0 0 0.00 0.00 -13.69
readwrite1 0 0 0 0 0 0 0.00 0.00 7.83
randrw1 5 5 -5 5 5 -5 0.00 0.00 -12.29
I can try adding the ____cacheline_aligned to the vfs_inode inside of
the bdev_inode if you like. Any other ideas?
Cheers,
Jeff
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] block: make struct block_device cacheline_aligned
2012-09-19 16:44 ` Jeff Moyer
@ 2012-09-20 0:11 ` Mikulas Patocka
0 siblings, 0 replies; 8+ messages in thread
From: Mikulas Patocka @ 2012-09-20 0:11 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Richard Kennedy, jaxboe, LKML
On Wed, 19 Sep 2012, Jeff Moyer wrote:
> Mikulas Patocka <mpatocka@redhat.com> writes:
>
> > On Wed, 19 Sep 2012, Jeff Moyer wrote:
> >
> >> Mikulas Patocka <mpatocka@redhat.com> writes:
> >>
> >> > On Wed, 19 Sep 2012, Jeff Moyer wrote:
> >> >
> >> >> Jeff Moyer <jmoyer@redhat.com> writes:
> >> >>
> >> >> > Hi,
> >> >> >
> >> >> > When testing against a pcie ssd or a ramdisk, making the block device
> >> >> > structure cacheline_aligned provided a significant increase in
> >> >> > performance:
> >> >>
> >> >> Self-NACK on this one. This results in a ton of warnings:
> >> >>
> >> >> include/linux/fs.h:727: warning: ???__section__??? attribute does not
> >> >> apply to types
> >> >> In file included from include/linux/debugfs.h:18,
> >> >> from kernel/trace/trace_probe.h:28,
> >> >> from kernel/trace/trace_kprobe.c:23:
> >> >> include/linux/fs.h:727: warning: ???__section__??? attribute does not
> >> >> apply to types
> >> >>
> >> >> And that leaves me with the task of figuring out if/why this actually
> >> >> helps.
> >> >>
> >> >> Cheers,
> >> >> Jeff
> >> >
> >> > Hi
> >> >
> >> > Use ____cacheline_aligned instead of __cacheline_aligned
> >>
> >> struct block_device is allocated as part of the bdev_inode:
> >>
> >> struct bdev_inode {
> >> struct block_device bdev;
> >> struct inode vfs_inode;
> >> };
> >>
> >> The bdev_inode is allocated from the bdev_cachep, which uses
> >> SLAB_HWCACHE_ALIGN. So, in theory, this should already be aligned.
> >>
> >> -Jeff
> >
> > The purpose here is to align vfs_inode.
>
> I'm not sure I understand what you're saying. When you say, "The
> purpose here," do you mean the purpose in the existing code or the
> purpose of our changes? The existing code seems to want to align the
> struct block_device, so I assume you mean we should instead align the
> vfs_inode.
>
> > If you add alignment to bdev, vfs_inode would be aligned (because bdev
> > size would be aligned to cacheline boundary).
>
> ITYM because the bdev size *is* aligned to a cacheline boundary (the
> size is 256 in my kernel, and the cache line alignment for this cpu is
> 64). But, since the entire structure is already aligned by the slab
> allocator, I don't see how adding ____cacheline_aligned would change
> anything.
>
> > Or you can add the alignment to vfs_inode, it would have the same
> > effect.
>
> Well, I tried the suggestion by Richard to swap the fields in the
> bdev_inode, and it did not result in a huge performance gain:
>
> %diff
> READ WRITE CPU
> Job Name BW IOPS msec BW IOPS msec usr sys csw
> write1 0 0 0 9 9 -8 0.00 0.00 -17.18
> read1 6 6 -6 0 0 0 5.87 0.00 -19.20
> randwrite1 0 0 0 0 0 0 0.00 0.00 -15.46
> randread1 5 5 -5 0 0 0 0.00 0.00 -13.69
> readwrite1 0 0 0 0 0 0 0.00 0.00 7.83
> randrw1 5 5 -5 5 5 -5 0.00 0.00 -12.29
>
> I can try adding the ____cacheline_aligned to the vfs_inode inside of
> the bdev_inode if you like. Any other ideas?
>
> Cheers,
> Jeff
I added ____cacheline_aligned - the results are this:
3.5.4 (without ____cacheline_aligned): 60.0630s, stdev 0.9087
3.5.4, ____cacheline_aligned: 43.1266s, stdev 0.8916
3.5.4, ____cacheline_aligned, patch 1: 42.7746s, stdev 0.3491
3.5.4, ____cacheline_aligned, patch 2: 45.1152s, stdev 0.8554
3.5.4, ____cacheline_aligned, patch 3: 43.2462s, stdev 0.1946
3.5.4, ____cacheline_aligned, patch 4: 42.8494s, stdev 0.2387
--- so, cacheline_aligned makes the results consistent. Patch 4 looks
slighly better than patch 3. What are your results with cacheline_aligned?
Mikulas
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2012-09-20 0:11 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-19 13:50 [patch] block: make struct block_device cacheline_aligned Jeff Moyer
2012-09-19 14:59 ` Jeff Moyer
2012-09-19 15:16 ` Mikulas Patocka
2012-09-19 15:24 ` Jeff Moyer
2012-09-19 15:50 ` Mikulas Patocka
2012-09-19 16:44 ` Jeff Moyer
2012-09-20 0:11 ` Mikulas Patocka
2012-09-19 15:32 ` Richard Kennedy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).