* Problem with slow operation on xattr
@ 2014-07-08  8:17 nyao
  2014-07-08 15:54 ` Sage Weil
From: nyao @ 2014-07-08  8:17 UTC (permalink / raw)
  To: ceph-devel


Dear all developers,

I am using the rbd kernel module on the client side, and when we test
random write performance, the throughput is quite poor and frequently
drops to zero.
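
For reference, a typical random-write test of this kind, assuming the
image is mapped at /dev/rbd0, looks something like:

  fio --name=randwrite --rw=randwrite --bs=4k --iodepth=32 \
      --ioengine=libaio --direct=1 --filename=/dev/rbd0 \
      --runtime=60 --time_based

The exact parameters here are only illustrative.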

Tracing the OSD logs on the server side, I find that requests are
always blocked in get_object_context, getattr() and _setattrs. The
average time is in the hundreds of milliseconds; worse, the maximum
latency reaches 4-6 seconds, so the throughput observed on the client
side stalls for several seconds at a time. This is really ruining the
performance of the cluster.

Therefore, I carefully analyzed the functions mentioned above
(get_object_context, getattr() and _setattrs). I cannot find anything
that blocks except the xattr system calls (fgetxattr, fsetxattr,
flistxattr).
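
One way to confirm where the time goes, assuming the ceph-osd pid is
known, is to time the xattr system calls directly, e.g.:

  strace -f -T -e trace=fgetxattr,fsetxattr,flistxattr -p <osd-pid>

Here <osd-pid> is a placeholder for the OSD process id; -T prints the
time spent in each system call.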

On the OSD nodes I use XFS as the underlying OSD file system. By
default, the FileStore uses the extended attribute feature of XFS to
store the ceph user xattrs ("_" and "snapset"). Since those system
calls are synchronous, I set the I/O scheduler of the disk to deadline
so that metadata reads are not blocked for a long time before being
served. However, even so, the performance is still quite poor and the
functions mentioned above are still blocked, sometimes for up to
several seconds.
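
Concretely, assuming the OSD data disk is /dev/sdb (the device name is
just an example), the scheduler was switched with something like:

  echo deadline > /sys/block/sdb/queue/scheduler
  cat /sys/block/sdb/queue/scheduler    # e.g. noop [deadline] cfq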

Therefore, I would like to know how to solve this problem. Does Ceph
provide any user-space cache for xattrs?

Is this problem caused by the XFS file system, i.e. by its xattr
system calls?

Furthermore, I tried to bypass XFS xattrs by setting
"filestore_max_inline_xattrs_xfs = 0" and
"filestore_max_inline_xattr_size_xfs = 0", so that the xattr key/value
pairs are stored in the omap implemented by LevelDB. This helps
somewhat: the maximum blocked interval drops to about 1-2 seconds.
But when an xattr is read from the physical disk rather than the page
cache, it is still quite slow.
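
For reference, the two options go in the [osd] section of ceph.conf,
roughly as follows:

  [osd]
      filestore_max_inline_xattrs_xfs = 0
      filestore_max_inline_xattr_size_xfs = 0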
So I wonder: would it be a good idea to cache all xattr data in a
user-space cache? For the "_" xattr, the length is just 242 bytes on
XFS, so even for hundreds of thousands of objects it would cost less
than 100 MB.
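
As a rough illustration, assuming 400,000 objects: 242 bytes x 400,000
is about 97 MB, before any per-entry overhead in such a cache.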


Best Regards,
Neal Yao


* Re: Problem with slow operation on xattr
  2014-07-08  8:17 Problem with slow operation on xattr nyao
@ 2014-07-08 15:54 ` Sage Weil
From: Sage Weil @ 2014-07-08 15:54 UTC (permalink / raw)
  To: nyao; +Cc: ceph-devel

Hi Neal,

On Tue, 8 Jul 2014, nyao@cs.hku.hk wrote:
> Dear all developers,
> 
> I am using the rbd kernel module on the client side, and when we test
> random write performance, the throughput is quite poor and frequently
> drops to zero.
> 
> Tracing the OSD logs on the server side, I find that requests are
> always blocked in get_object_context, getattr() and _setattrs. The
> average time is in the hundreds of milliseconds; worse, the maximum
> latency reaches 4-6 seconds, so the throughput observed on the client
> side stalls for several seconds at a time. This is really ruining the
> performance of the cluster.
> 
> Therefore, I carefully analyzed the functions mentioned above
> (get_object_context, getattr() and _setattrs). I cannot find anything
> that blocks except the xattr system calls (fgetxattr, fsetxattr,
> flistxattr).
> 
> On the OSD nodes I use XFS as the underlying OSD file system. By
> default, the FileStore uses the extended attribute feature of XFS to
> store the ceph user xattrs ("_" and "snapset"). Since those system
> calls are synchronous, I set the I/O scheduler of the disk to deadline
> so that metadata reads are not blocked for a long time before being
> served. However, even so, the performance is still quite poor and the
> functions mentioned above are still blocked, sometimes for up to
> several seconds.
> 
> Therefore, I would like to know how to solve this problem. Does Ceph
> provide any user-space cache for xattrs?
> 
> Is this problem caused by the XFS file system, i.e. by its xattr
> system calls?
> 
> Furthermore, I tried to bypass XFS xattrs by setting
> "filestore_max_inline_xattrs_xfs = 0" and
> "filestore_max_inline_xattr_size_xfs = 0", so that the xattr key/value
> pairs are stored in the omap implemented by LevelDB. This helps
> somewhat: the maximum blocked interval drops to about 1-2 seconds.
> But when an xattr is read from the physical disk rather than the page
> cache, it is still quite slow.
> So I wonder: would it be a good idea to cache all xattr data in a
> user-space cache? For the "_" xattr, the length is just 242 bytes on
> XFS, so even for hundreds of thousands of objects it would cost less
> than 100 MB.

I would have guessed that it is not actually the XFS xattrs that are 
slow, but leveldb, which may be used when an object's xattrs are too 
big to fit inline in the file system.  Have you adjusted any of the 
filestore_max_inline_xattr* options from their defaults?  I don't think 
XFS's getxattr should be that slow.
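
One way to check the values an OSD is actually running with, assuming
the default admin socket path, is:

  ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show \
    | grep filestore_max_inline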

Ideally the XFS inode size is 1 KB or more so that the xattrs are 
embedded there; this normally means only a single read is needed to 
load them up (if they are not already in the cache).  Did your file 
systems get created by the ceph-disk or ceph-deploy tools, or did you 
create them manually when the cluster was set up?  By default, those 
tools create 2 KB inodes.  Try running xfs_info <mountpoint> to see 
what the current file systems are using.
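
For example, assuming the OSD is mounted at the default location:

  xfs_info /var/lib/ceph/osd/ceph-0 | grep isize

should report isize=2048 if the file system was created with the 2 KB
inodes those tools use by default; a manually created XFS file system
typically defaults to 256-byte inodes unless -i size=... was passed to
mkfs.xfs.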

sage

