From: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	nirni-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
Subject: Re: [RFC PATCH 0/4] Sockperf: Initial multi-threaded throughput client
Date: Wed, 20 Dec 2017 10:52:39 +0200
Message-ID: <20171220085239.GP2942@mtr-leonro.local>
In-Reply-To: <cover.1513609601.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Mon, Dec 18, 2017 at 10:26:42AM -0500, Doug Ledford wrote:
> During testing, it has become painfully clear that a single-threaded UDP
> test client cannot exercise a 100Gig link due to issues related to
> single-core maximum throughput.  This patchset implements a
> multi-threaded throughput test in sockperf.  This is just an initial
> implementation; there is still more work to be done.  In particular:
>
> 1) Although the speed improved with this change, it did not improve
> drastically.  As soon as the client send bottleneck was removed, it
> became clear there is another bottleneck on the server.  When sending to
> a server from one client, all data is received on a single queue pair,
> and due to how interrupts are spread in the RDMA stack (namely, each
> queue pair goes to a single interrupt, and we rely on multiple queue
> pairs being in use to balance interrupts across different cores), we
> take all interrupts from a specific host on a single core.  The
> receiving side then becomes the bottleneck, with single-core IPoIB
> receive processing being the limiting factor.  On a slower machine, I
> clocked 30 Gbit/s throughput.  With a faster machine as the server, I
> was able to get up to 70 Gbit/s throughput.
>
> 2) I thought I might try an experiment to get around the issue of the
> queue pair being on one CPU.  We use P_Keys in our internal lab setup,
> so on the specific link in question I actually have a total of three
> different IP interfaces on different P_Keys.  I tried to open tests on
> several of these interfaces to see how that would impact performance
> (so a multithreaded server listening on ports on three different P_Key
> interfaces all on the same physical link, which should use three
> different queue pairs, and a multithreaded client sending to those three
> different P_Key interfaces from three different P_Key interfaces of its
> own).  It tanked performance, to less than gigabit Ethernet speeds.
> This warrants some investigation moving forward, I think.
>
> 3) I thought I might try sending from two clients to the server at once
> and summing their throughput.  That was fun.  With UDP, the clients are
> able to send enough data that flow control on the link kicks in, at
> which point each client starts dropping packets on the floor (they're
> UDP, after all), and so the net result is that one client claimed
> 200 Gbit/s and the other about 175 Gbit/s.  Meanwhile, the server
> thought we were just kidding and didn't actually run a test at all.
>
> 4) I reran the test using TCP instead of UDP.  That's a non-starter.
> Whether due to my changes or just because it is the way it is, the TCP
> tests all failed.  For larger message sizes, they failed instantly.  For
> smaller message sizes the test might run for a few seconds, but it would
> eventually fail too.  In every case, the failure was that the server
> received a message it deemed too large and forcibly closed all of the
> TCP connections, at which point the client just bailed.
>
> I should point out that I don't program in C++.  Any places where these
> patches are not written in typical C++ style are due to that.
>
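
For readers who have not looked at the patches, the shape of the change
described above is roughly this: instead of one sender loop, the client
spawns several threads, each with its own UDP socket, and the per-thread
byte counts are summed at the end.  The sketch below is not code from the
patchset and does not use any of sockperf's internals; it is a minimal,
stand-alone illustration with made-up names (send_loop, g_bytes) built on
plain POSIX sockets and std::thread.

// Illustrative sketch of a multi-threaded UDP throughput sender.
// Not sockperf code; names, address, and port are examples only.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

static std::atomic<unsigned long long> g_bytes{0};  // bytes sent by all threads

static void send_loop(const char *ip, int port, int msg_size, int seconds)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return;

    sockaddr_in dst{};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    inet_pton(AF_INET, ip, &dst.sin_addr);

    std::vector<char> buf(msg_size, 'x');
    auto end = std::chrono::steady_clock::now() + std::chrono::seconds(seconds);

    unsigned long long sent = 0;
    while (std::chrono::steady_clock::now() < end) {
        ssize_t n = sendto(fd, buf.data(), buf.size(), 0,
                           reinterpret_cast<sockaddr *>(&dst), sizeof(dst));
        if (n > 0)
            sent += n;
    }
    g_bytes += sent;
    close(fd);
}

int main()
{
    const int num_threads = 4;   // analogous to a --num-threads option
    const int msg_size = 1472;   // one datagram per Ethernet MTU
    const int duration = 10;     // seconds

    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i)
        workers.emplace_back(send_loop, "192.0.2.1", 11111, msg_size, duration);
    for (auto &t : workers)
        t.join();

    double gbits = g_bytes * 8.0 / duration / 1e9;
    std::printf("aggregate: %.2f Gbit/s\n", gbits);
    return 0;
}

Compile with g++ -pthread.  Each additional thread removes more of the
single-core send bottleneck until, as noted above, the single receive
queue pair on the server becomes the limiting factor.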

Doug,

I contacted the group responsible for the development of sockperf:
https://github.com/Mellanox/sockperf

Their maintainer is on vacation until the third week of January and,
unfortunately, there is no one else available right now who can take a
look at your proposal.

In the meantime, it would be better to push the code to GitHub, because
their development flow is based on GitHub rather than the mailing list.

Thanks

Thread overview: 6+ messages
2017-12-18 15:26 [RFC PATCH 0/4] Sockperf: Initial multi-threaded throughput client Doug Ledford
2017-12-18 15:26 ` [PATCH 1/4] Rename a few variables Doug Ledford
2017-12-18 15:26 ` [PATCH 2/4] Move num-threads and cpu-affinity to common opts Doug Ledford
2017-12-18 15:26 ` [PATCH 3/4] Move server thread handler to SockPerf.cpp Doug Ledford
2017-12-18 15:26 ` [PATCH 4/4] Initial implementation of threaded throughput client Doug Ledford
2017-12-20  8:52 ` Leon Romanovsky [this message]
