linux-nfs.vger.kernel.org archive mirror
From: Chuck Lever <chuck.lever@oracle.com>
To: Sven Breuner <sven@excelero.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: Remove single NFS client performance bottleneck: Only 4 nfsd active
Date: Mon, 27 Jan 2020 09:12:10 -0500
Message-ID: <2D125E5E-F97D-4F52-89A9-C499CC7E7A5D@oracle.com>
In-Reply-To: <F82F3FA8-4E2A-4014-A052-A0367562A956@oracle.com>



> On Jan 27, 2020, at 9:06 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> Hi Sven-
> 
>> On Jan 26, 2020, at 6:41 PM, Sven Breuner <sven@excelero.com> wrote:
>> 
>> Hi,
>> 
>> I'm using the kernel NFS client/server and am trying to read as many small files per second as possible from a single NFS client, but seem to run into a bottleneck.
>> 
>> Maybe this is just a tunable that I am missing, because nothing looks saturated: the CPUs on both client and server are mostly idle, the 100Gbit (RoCE) network links between client and server are mostly idle, and the NVMe drives in the server are mostly idle. (The server also has enough RAM to easily fit my test data set in the ext4/xfs page cache, but a second read of the data set from the RAM cache doesn't change the result much.)
>> 
>> This is my test case:
>> # Create 1.6M 10KB files through 128 mdtest processes in different directories...
>> $ mpirun -hosts localhost -np 128 /path/to/mdtest -F -d /mnt/nfs/mdtest -i 1 -I 100 -z 1 -b 128 -L -u -w 10240 -e 10240 -C
>> 
>> # Read all the files through 128 mdtest processes (the case that matters primarily for my test)...
>> $ mpirun -hosts localhost -np 128 /path/to/mdtest -F -d /mnt/nfs/mdtest -i 1 -I 100 -z 1 -b 128 -L -u -w 10240 -e 10240 -E
>> 
>> The result is about 20,000 file reads per sec, so only ~200MB/s network throughput.
> 
> What is the typical size of the NFS READ I/Os on the wire?
> 
> Are you sure your mpirun workload is generating enough parallelism?
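
If you want to check those two things directly on the client, the
nfsiostat and mountstats tools from nfs-utils are probably the
quickest way (the mount point below is just the one from your
example, adjust as needed):

  # average READ size on the wire (kB/op), ops/s, and RTT per mount
  $ nfsiostat 5 /mnt/nfs

  # per-operation counters, including READ bytes and retransmissions
  $ mountstats /mnt/nfs

If kB/op sits near 10KB and ops/s tracks your ~20,000 reads/sec, the
wire I/O size is fine and this looks like a concurrency/latency
problem rather than a bandwidth one.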

A couple of other thoughts:

What's the client hardware like? NUMA? Fast memory? CPU count?
Have you configured device interrupt affinity and used tuned
to disable CPU sleep states, etc?
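
For instance, a rough sketch of what I mean on the client side (the
interface name ens2f0 is just a placeholder for your CX-5 port; the
affinity script ships with MLNX_OFED / mlnx-tools):

  # where does the NIC live, and what does the NUMA layout look like?
  $ numactl --hardware
  $ cat /sys/class/net/ens2f0/device/numa_node

  # pin the NIC's MSI-X interrupts to its local NUMA node
  # (use the node number reported by the command above)
  $ sudo set_irq_affinity_bynode.sh 0 ens2f0

  # low-latency tuned profile; this also keeps CPUs out of deep C-states
  $ sudo tuned-adm profile network-latency
  $ tuned-adm active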

Have you properly configured your 100GbE switch and cards?

I have a Mellanox SN2100 here and two hosts with CX-5 Ethernet.
The configuration of the cards and switch is critical to good
performance.
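
As a rough example of the kind of settings I mean (interface name is
a placeholder again, and whether you run global pause or PFC depends
on how the SN2100 ports are configured -- both ends of each link have
to match):

  # jumbo frames end to end, on the hosts and on the switch ports
  $ sudo ip link set dev ens2f0 mtu 9000

  # RoCE generally wants a lossless link: check pause/PFC on the card
  $ ethtool -a ens2f0
  $ mlnx_qos -i ens2f0     # prints the current PFC / trust configuration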


>> I noticed in "top" that only 4 nfsd processes are active, so I'm wondering why the load is not spread across more of my 64 nfsd threads (/proc/fs/nfsd/threads). Even the few nfsd processes that are active use less than 50% of a core each, and the CPUs are shown as >90% idle in "top" on both client and server during the read phase.
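
Two server-side things that might be worth checking on that point:
/proc/fs/nfsd/pool_stats shows whether requests are arriving faster
than threads are being woken, and by default all nfsd threads sit in
a single global pool, which can only be changed while nfsd has zero
threads running:

  # on the server, while the read phase is running
  $ cat /proc/fs/nfsd/pool_stats

  # optionally, one thread pool per NUMA node -- pool_mode can only be
  # changed before the threads are started (e.g. before "rpc.nfsd 64")
  $ sudo sh -c 'echo pernode > /proc/fs/nfsd/pool_mode'
  $ sudo rpc.nfsd 64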
>> 
>> I've tried:
>> * CentOS 7.5 and 7.6 kernels (3.10.0-...) on client and server; and Ubuntu 18 with 4.18 kernel on server side
>> * TCP & RDMA
>> * Mounted as NFSv3/v4.1/v4.2
>> * Increased tcp_slot_table_entries to 1024
>> 
>> ...but none of that changed the fact that only 4 nfsd processes are active on the server; I get the same result even when /proc/fs/nfsd/threads is set to just 4 instead of 64.
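
One note on the slot table setting above, in case it matters here: as
far as I know the sysctl only applies to transports created after it
is set, so it needs to be in place before the mount -- for example via
a modprobe option rather than a sysctl issued after mounting. A
minimal sketch on the client:

  $ echo "options sunrpc tcp_slot_table_entries=1024" | \
        sudo tee /etc/modprobe.d/sunrpc.conf
  $ sudo sysctl -w sunrpc.tcp_slot_table_entries=1024  # if sunrpc is already loaded
  $ sysctl sunrpc.tcp_slot_table_entries               # verify, then (re)mount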
>> 
>> Any pointer to how I can overcome this limit will be greatly appreciated.
>> 
>> Thanks in advance
>> 
>> Sven
>> 
> 
> --
> Chuck Lever

--
Chuck Lever



