On Wed, 14 Mar 2012, madhusudhana wrote:
> João Eduardo Luís gmail.com> writes:
> > On 03/13/2012 10:39 AM, madhusudhana wrote:
> > > Hi all,
> > > Can someone PLEASE help me understand why data copying (a simple cp
> > > or rsync) is SO slow in Ceph, and why deleting (\rm -rf) a small
> > > (175M) directory takes so much time? Because of this, I am not able
> > > to complete my evaluation of Ceph using my actual workload.
> > >
> > > I am asking everyone to help me find the reason for the slow
> > > writes/deletes in my Ceph cluster. I am really trying hard to
> > > complete the evaluation with my actual workload, but I am not able
> > > to copy the data required for the evaluation to the Ceph cluster
> > > because of the slow copies.
> > >
> > > I really appreciate any help on this.
> > >
> > > Thanks in advance
> > > Madhusudhan
> >
> > Hi Madhusudhan,
> >
> > Are you using Btrfs? And are you taking snapshots while you're
> > performing those operations?
> >
> Joao,
> Yes, all my OSDs are running on btrfs. I am not doing anything special
> except running the 'cp' or 'rsync' command. Can you please let me know
> how I can find out whether I am taking snapshots while running 'cp' or
> 'rsync'?
>
> One more concern: the 'rm' command is also TOO slow. I am a newbie here
> and I don't know much about how Ceph works. I configured it in a simple
> way: install Ceph, run mkcephfs, and start Ceph on all the servers in
> the cluster.
>
> I want to understand why basic operations like 'cp' and 'rm' are slow,
> because my actual workload does a lot of copy and delete operations. If
> I find the reason for the slowness and get a fix, then I think I can
> better compare and complete my Ceph evaluation.

There are a range of reasons why this could be slow, and unfortunately we
don't have enough information to tell what the problem is.

First, see what the OSD performance looks like.
Output from, say,

  rados -p rbd bench 60 write
  rados -p rbd bench 60 write -b 4096

will give us a sense of what the object store performance looks like.

The other big question is which client you're using. Since you have
ceph-fuse problems, I assume you're using the kernel client? Which kernel
version, etc.?

If it's MDS performance, the way to diagnose it would probably be to start
by gathering a message log ('debug ms = 1' in the [mds] section of
ceph.conf) to see what the client/MDS interaction looks like. MDS
performance isn't something we have very much time to spend on right now,
though, since our focus is currently primarily on RADOS and RBD stability
(unless you want to talk to the professional services side of things,
info@ceph.com).

The first step, though, is to figure out whether this is an OSD or an MDS
performance issue...

sage
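[For reference, the message-log setting mentioned above would look like
this in ceph.conf -- a minimal sketch; the MDS daemon needs to be
restarted afterwards for the setting to take effect:

  [mds]
      debug ms = 1

With this enabled, every message sent and received by the MDS is logged,
which is what lets you see the client/MDS interaction pattern during the
slow cp or rm runs.]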