From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sage Weil
Subject: Re: read performance not perfect
Date: Mon, 18 Jul 2011 10:14:02 -0700 (PDT)
Message-ID: 
References: 
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Return-path: 
Received: from cobra.newdream.net ([66.33.216.30]:39881 "EHLO cobra.newdream.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754056Ab1GRRK3 (ORCPT ); Mon, 18 Jul 2011 13:10:29 -0400
In-Reply-To: 
Sender: ceph-devel-owner@vger.kernel.org
List-ID: 
To: huang jun
Cc: ceph-devel

On Mon, 18 Jul 2011, huang jun wrote:
> hi, all
> We tested ceph's read performance last week and found something weird.
> We are using ceph v0.30 on Linux 2.6.37, mounting ceph from a back end
> consisting of 2 OSDs, 1 mon, and 1 MDS:
> $ mount -t ceph 192.168.1.103:/ /mnt -vv
> $ dd if=/dev/zero of=/mnt/test bs=4M count=200
> $ cd .. && umount /mnt
> $ mount -t ceph 192.168.1.103:/ /mnt -vv
> $ dd if=test of=/dev/zero bs=4M
> 200+0 records in
> 200+0 records out
> 838860800 bytes (839 MB) copied, 16.2327 s, 51.7 MB/s
> But if we use rados to test it:
> $ rados -m 192.168.1.103:6789 -p data bench 60 write
> $ rados -m 192.168.1.103:6789 -p data bench 60 seq
> the result is:
> Total time run:        24.733935
> Total reads made:      438
> Read size:             4194304
> Bandwidth (MB/sec):    70.834
>
> Average Latency:       0.899429
> Max latency:           1.85106
> Min latency:           0.128017
>
> This phenomenon caught our attention, so we began to analyze the OSD
> debug log. We found that:
> 1) the kernel client sends READ requests of 1MB at first, and 512KB
>    after that
> 2) from the rados bench log, the OSD receives READ ops with 4MB of
>    data to handle
> We know the ceph developers pay attention to read and write
> performance, so I just want to confirm: does the communication between
> the client and the OSD take more time than it should? Can we request a
> bigger size, like the default object size of 4MB, for READ operations?
> Or is this related to OS management? If so, what can we do to improve
> the performance?

I think it's related to the way the Linux VFS is doing readahead, and
how the ceph fs code is handling it.  It's issue #1122 in the tracker
and I plan to look at it today or tomorrow!

Thanks-
sage
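The numbers in the thread are consistent with the reads being split into small requests, each paying a per-request round trip to the OSD. A rough back-of-envelope model shows the effect; the latency and bandwidth figures below are illustrative assumptions, not measurements from this thread:

```python
def effective_throughput(total_mb, request_kb, per_req_latency_s, wire_mb_s):
    """Estimate client-visible read bandwidth when a sequential read is
    split into fixed-size requests, each paying a per-request latency,
    on top of the raw transfer time."""
    n_requests = (total_mb * 1024) / request_kb
    transfer_time_s = total_mb / wire_mb_s
    total_time_s = n_requests * per_req_latency_s + transfer_time_s
    return total_mb / total_time_s

# Assumed figures for illustration only: 2 ms of per-request overhead,
# 100 MB/s of raw wire/disk bandwidth, an 800 MB sequential read.
for req_kb in (512, 1024, 4096):
    mbps = effective_throughput(800, req_kb, 0.002, 100.0)
    print(f"{req_kb:5d} KB requests -> {mbps:.1f} MB/s")
```

With these assumed numbers, 512KB requests land around 71 MB/s while 4MB requests reach around 95 MB/s, which is the same shape as the dd-vs-rados-bench gap reported above: fewer, larger requests amortize the per-request overhead.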