From: "Wunderlich, Mark" <firstname.lastname@example.org>
To: "email@example.com" <firstname.lastname@example.org>
Subject: [PATCH 0/2] allow for busy poll improvements
Date: Wed, 30 Oct 2019 22:27:02 +0000 [thread overview]
Message-ID: <B33B37937B7F3D4CB878107E305D4916D4AFB9@ORSMSX104.amr.corp.intel.com> (raw)
Proposing a small series of two patches that provide improved packet processing for a fabric network interface operating with poll mode enabled.
Patch 1: Modifies the do/while termination condition in nvmet_tcp_io_work() to be time based rather than operation-count based, matching more closely how the host-side io_work() behaves. If poll mode is not active, the original behavior is preserved.
A time-based loop allows the loop period to be set from the value advertised by the network socket when CONFIG_NET_RX_BUSY_POLL is enabled (poll mode). Operating in poll mode provides increased opportunity to reap send or recv completions without exiting prematurely when a single iteration of the loop is idle ('pending' being false).
After exiting the do/while loop, the work item is re-queued if there was prior 'pending' activity. In poll mode this is determined by the completions accumulated over the complete do/while period, rather than by whether the last iteration through the loop showed successful activity.
Patch 2: This patch builds upon the earlier kernel networking patches listed below, which enabled enhanced symmetric queuing; this is leveraged while in poll mode:
- a4fd1f4 Documentation: Add explanation for XPS using Rx-queue(s) map
- 8af2c06 net-sysfs: Add interface for Rx queue(s) map per Tx queue
- fc9bab2 net: Enable Tx queue selection based on Rx queues
- c6345ce net: Record receive queue number for a connection
Setting the socket priority to a non-zero value, via the proposed module parameter, signals to the NIC that optimized network processing and queue selection can or should be considered. The default priority remains zero, preserving default NIC behavior.
When applied, and running with a NIC optimized for busy polling, there is a measurable improvement in I/O performance and a reduction in latency. The sample data below shows FIO results for 4K random read operations to a single remote NVMe device, with a queue depth of 32 and a batch size of 8. One data set is a baseline for the standard Linux kernel (5.2.1 stable) running on both host and target. In the second, the two patches were applied on the target but NIC poll mode was not enabled, to show that default target performance is not impacted by the changes. Finally, the two proposed patches were applied on the target with poll mode enabled.
For comparison, the number of fio job threads is scaled from 1 to 8 until NVMe device I/O saturation is reached. The data shows no performance impact from adding the patches when running in the default non-polled mode, while in polled mode device saturation is reached sooner, with lower average and 99.99th-percentile latency.
Sample FIO invocation line:
fio --filename=/dev/nvme0n1 --time_based --thread --runtime=60 --ramp_time=10 --rw=randrw --rwmixread=100 --refill_buffers --direct=1 --ioengine=libaio --bs=4k --iodepth=32 --iodepth_batch_complete_min=1 --iodepth_batch_complete_max=32 --iodepth_batch=8 --numjobs=1 --group_reporting --gtod_reduce=0 --disable_lat=0 --name=cpu3 --cpus_allowed=3
Baseline 5.2.1 stable kernel:
Threads | IOPS (K) | Avg Lat (usec) | 99.99 (usec)
1 | 195 | 157.69 | 553
2 | 215 | 284.31 | 758
3 | 404 | 229.04 | 742
4 | 515 | 239.98 | 742
5 | 549 | 282.22 | 750
6 | 581 | 321.64 | 750
7 | 587 | 376.68 | 750
8 | 587 | 432.58 | 971
With patches applied on the target kernel,
but poll mode off (so_priority set to 0):
Threads | IOPS (K) | Avg Lat (usec) | 99.99 (usec)
1 | 197 | 156.40 | 545
2 | 286 | 215.14 | 758
3 | 422 | 218.75 | 734
4 | 491 | 251.34 | 742
5 | 504 | 306.68 | 750
6 | 583 | 319.91 | 766
7 | 587 | 378.32 | 742
8 | 587 | 434.57 | 660
With patches applied on the target kernel,
poll mode enabled:
Threads | IOPS (K) | Avg Lat (usec) | 99.99 (usec)
1 | 227 | 129.43 | 537
2 | 404 | 144.33 | 537
3 | 523 | 170.25 | 529
4 | 563 | 215.94 | 537
5 | 587 | 263.78 | 510
6 | 587 | 321.41 | 506
7 | 587 | 378.67 | 502
8 | 587 | 435.27 | 502
Note: Data was gathered using the 5.2.1 stable kernel. The patches posted to this mailing list apply to the infradead tree as pulled on 10/30/19.
Mark Wunderlich (2):
- nvmet-tcp: enable polling option in io_work
- nvmet-tcp: set SO_PRIORITY for accepted sockets
Linux-nvme mailing list
Thread overview: 3+ messages
2019-10-30 22:27 Wunderlich, Mark [this message]
2019-11-06 16:44 ` [PATCH 0/2] allow for busy poll improvements Sagi Grimberg
2019-11-07 16:09 ` Wunderlich, Mark