hey bernd, long time no chat. it turns out you don't have to know what swift is, because I've been able to demonstrate this behavior with a very simple python script that simply creates files in a 3-tier hierarchy. the third-level directories each contain a single file, which for my testing are all 1K.

I have played with vfs_cache_pressure and it doesn't seem to make a difference, though that was a while ago and perhaps it is worth revisiting.

one thing you may get a hoot out of, being a collectl user, is that I have an xfs plugin that lets you look at a ton of xfs stats, either in realtime or after the fact, just like any other collectl stat. I just haven't added it to the kit yet.

-mark

On Mon, Jan 25, 2016 at 1:24 PM, Bernd Schubert wrote:

> Hi Mark!
>
> On 01/06/2016 04:15 PM, Mark Seger wrote:
> > I've recently found that the performance of our development swift system
> > is degrading over time as the number of objects/files increases. This is
> > a relatively small system; each server has 3 400GB disks. The system I'm
> > currently looking at has about 70GB tied up in slabs alone, close to 55GB
> > of that in xfs inode and ili slabs, and about 2GB free. The kernel is
> > 3.14.57-1-amd64-hlinux.
> >
> > Here's the way the filesystems are mounted:
> >
> > /dev/sdb1 on /srv/node/disk0 type xfs
> > (rw,noatime,nodiratime,attr2,nobarrier,inode64,logbufs=8,logbsize=256k,sunit=512,swidth=1536,noquota)
> >
> > I can do about 2000 1K file creates/sec when running 2-minute PUT tests
> > at 100 threads. If I repeat that test for multiple hours, I see the
> > number of IOPS steadily decreasing to about 770, and on the very next run
> > it drops to 260 and continues to fall from there. This happens at about
> > 12M files.
> >
> > The directory structure is 2-tiered, with 1000 directories per tier, so
> > we can have about 1M of them, though they don't currently all exist.
>
> This sounds pretty much like the hash directories used by some parallel
> file systems (Lustre, and in the past BeeGFS). For us the file-create
> slowdown was due to lookups in directories to check whether a file with
> the same name already exists. At least for ext4 it was rather easy to
> demonstrate that simply caching directory blocks would eliminate that
> issue. We then considered working on a better kernel cache, but in the
> end we simply found a way to get rid of such a simple directory structure
> in BeeGFS and changed it to a more complex layout with less random
> access, which eliminated the main reason for the slowdown.
>
> Now I have no idea what a "swift system" is, in which order it creates
> and accesses those files, or whether it would be possible to change the
> access pattern. One thing you might try, and which should work much
> better since kernel 3.11, is the vfs_cache_pressure setting. The lower it
> is, the fewer dentries/inodes are dropped from the cache when pages are
> needed for file data.
>
> Cheers,
> Bernd
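
A minimal sketch of the kind of reproducer Mark describes -- three tiers of
directories with a single 1K file in each leaf directory -- might look like
the following. The base path, tier width, and file name are assumptions for
illustration, not his actual script:

    #!/usr/bin/env python
    # sketch: build a 3-tier directory hierarchy, one 1K file per leaf
    # directory, and report the create rate as it goes.
    import os
    import time

    BASE = "/srv/node/disk0/test"   # assumed target filesystem
    WIDTH = 100                     # dirs per tier (placeholder; 100^3 = 1M leaves)
    DATA = b"x" * 1024              # 1K payload per file

    count = 0
    start = time.time()
    for a in range(WIDTH):
        for b in range(WIDTH):
            for c in range(WIDTH):
                d = os.path.join(BASE, "%03d" % a, "%03d" % b, "%03d" % c)
                os.makedirs(d)      # leaf directory, created fresh each pass
                with open(os.path.join(d, "obj"), "wb") as f:
                    f.write(DATA)
                count += 1
                if count % 10000 == 0:
                    rate = count / (time.time() - start)
                    print("%d files, %.0f creates/sec" % (count, rate))

Watching whether the creates/sec figure decays as the file count grows is the
point; the interesting signal in the thread is the drop from ~2000 down
through 770 to 260.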
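
To track the slab usage called out above (the xfs inode and ili slabs), one
way is to sum those entries from /proc/slabinfo, which is typically readable
only by root. A quick sketch; the byte counts are approximate since slab
overhead is ignored:

    # sketch: report object counts and rough memory use of the xfs_inode
    # and xfs_ili slabs from /proc/slabinfo
    with open("/proc/slabinfo") as f:
        next(f); next(f)  # skip the version line and the column header
        for line in f:
            fields = line.split()
            if fields[0] in ("xfs_inode", "xfs_ili"):
                num_objs, objsize = int(fields[2]), int(fields[3])
                print("%-10s %10d objs, ~%.1f MB" %
                      (fields[0], num_objs, num_objs * objsize / 1e6))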
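
And for experimenting with Bernd's suggestion, the knob lives at
/proc/sys/vm/vfs_cache_pressure (the kernel default is 100; lower values make
the kernel more reluctant to reclaim dentries and inodes). A sketch, with 50
as an arbitrary example value rather than a recommendation:

    # needs root; equivalent to: sysctl vm.vfs_cache_pressure=50
    with open("/proc/sys/vm/vfs_cache_pressure", "w") as f:
        f.write("50\n")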