From: Ashish Samant <ashish.samant@oracle.com>
To: Miklos Szeredi <miklos@szeredi.hu>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	fuse-devel@lists.sourceforge.net,
	Srinivas Eeda <srinivas.eeda@oracle.com>
Subject: Re: fuse scalability part 1
Date: Thu, 24 Sep 2015 12:17:58 -0700	[thread overview]
Message-ID: <56044C66.1090207@oracle.com> (raw)
In-Reply-To: <20150518151336.GA9960@tucsk>

[-- Attachment #1: Type: text/plain, Size: 1854 bytes --]


On 05/18/2015 08:13 AM, Miklos Szeredi wrote:
> This part splits out an "input queue" and a "processing queue" from the
> monolithic "fuse connection", each of those having their own spinlock.
>
> The end of the patchset adds the ability to "clone" a fuse connection.  This
> means, that instead of having to read/write requests/answers on a single fuse
> device fd, the fuse daemon can have multiple distinct file descriptors open.
> Each of those can be used to receive requests and send answers, currently the
> only constraint is that a request must be answered on the same fd as it was read
> from.
>
> This can be extended further to allow binding a device clone to a specific CPU
> or NUMA node.
>
> Patchset is available here:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git for-next
>
> Libfuse patches adding support for "clone_fd" option:
>
>    git://git.code.sf.net/p/fuse/fuse clone_fd
>
> Thanks,
> Miklos
>
>
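
(A side note on the cloning mechanism quoted above, for anyone experimenting
with it outside of libfuse: as I understand the patchset, a worker thread
attaches an extra device fd to an existing connection by opening /dev/fuse
and issuing the FUSE_DEV_IOC_CLONE ioctl with the original session fd. A
rough, untested sketch; the function name and error handling are mine, not
from the patches:

    #include <fcntl.h>
    #include <stdint.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/fuse.h>              /* FUSE_DEV_IOC_CLONE */

    /* Clone an extra /dev/fuse fd bound to the connection behind session_fd. */
    static int clone_fuse_fd(int session_fd)
    {
            uint32_t src = session_fd;
            int fd = open("/dev/fuse", O_RDWR | O_CLOEXEC);

            if (fd == -1)
                    return -1;

            /* Attach the new fd to the existing fuse connection. */
            if (ioctl(fd, FUSE_DEV_IOC_CLONE, &src) == -1) {
                    close(fd);
                    return -1;
            }

            /* A request read on this fd must be answered on this fd. */
            return fd;
    }

With libfuse, mounting with -o clone_fd does the equivalent for each worker
thread.)
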
Resending the numbers as attachments because my email client messes up the
formatting of the message. Sorry for the noise.

We did some performance testing without these patches and with these
patches (with the -o clone_fd option specified). We ran two types of tests:

1. Throughput test: We ran parallel dd tests reading from and writing to a
FUSE-based database filesystem on a system with 8 NUMA nodes and 288 CPUs.
The performance here is almost equal to that of the per-NUMA patches we
submitted a while back. Please find the results attached.

2. Spinlock access time test: We also ran tests within the kernel to measure
the time spent per request in accessing the spinlocks in both cases. The time
taken per request to access the spinlocks over the lifetime of the request is
30x to 100x lower in the second case (with the patchset). Please find the
results attached.
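
(For readers wondering what "time spent accessing the spinlocks" means
concretely: one way to collect such numbers, not necessarily exactly what was
done here, is to wrap the lock acquisitions with ktime deltas and accumulate
them per request, along the lines of this simplified, hypothetical sketch:

    #include <linux/ktime.h>
    #include <linux/spinlock.h>
    #include <linux/types.h>

    /*
     * Hypothetical helper: take the lock and add the time spent
     * waiting for it to a per-request accumulator (acc_ns would
     * live in the request being timed).
     */
    static inline void timed_spin_lock(spinlock_t *lock, u64 *acc_ns)
    {
            ktime_t start = ktime_get();

            spin_lock(lock);
            *acc_ns += ktime_to_ns(ktime_sub(ktime_get(), start));
    }

The per-request totals can then be averaged over all requests issued during a
run to get a time-per-request figure like the one in the attachment.)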

Thanks,
Ashish



[-- Attachment #2: dd_test_results.txt --]
[-- Type: text/plain, Size: 1274 bytes --]

1) Writes to single mount

dd processes                 throughput(without patchset)            throughput(with patchset)
in parallel

4                                633 Mb/s                               606 Mb/s
8                                583.2 Mb/s                             561.6 Mb/s
16                               436 Mb/s                               640.6 Mb/s
32                               500.5 Mb/s                             718.1 Mb/s
64                               440.7 Mb/s                             1276.8 Mb/s
128                              526.2 Mb/s                             2343.4 Mb/s

2) Reading from single mount
 
dd processes                 throughput(without patchset)            throughput(with patchset)
in parallel

4                               1171 Mb/s                               1059 Mb/s
8                               1626 Mb/s                               1677 Mb/s
16                              1014 Mb/s                               2240.6 Mb/s
32                              807.6 Mb/s                              2512.9 Mb/s
64                              985.8 Mb/s                              2870.3 Mb/s
128                             1355 Mb/s                               2996.5 Mb/s 

[-- Attachment #3: spinlock_access_time_test.txt --]
[-- Type: text/plain, Size: 580 bytes --]

dd processes                  Time/req(without patchset)            Time/req(with patchset)
in parallel

4                                0.025 ms                            0.00685 ms
8                                0.174 ms                            0.0071 ms
16                               0.9825 ms                           0.0115 ms
32                               2.4965 ms                           0.0315 ms
64                               4.8335 ms                           0.071 ms
128                              5.972 ms                            0.1812 ms 
