From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sowmini Varadhan
Subject: [PATCH V2 net-next 0/7] RDS: zerocopy support
Date: Wed, 14 Feb 2018 02:28:29 -0800
Message-ID:
Cc: davem@davemloft.net, rds-devel@oss.oracle.com, sowmini.varadhan@oracle.com, santosh.shilimkar@oracle.com
To: netdev@vger.kernel.org, willemdebruijn.kernel@gmail.com
Return-path:
Received: from aserp2120.oracle.com ([141.146.126.78]:50920 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S967174AbeBNKrl (ORCPT ); Wed, 14 Feb 2018 05:47:41 -0500
Sender: netdev-owner@vger.kernel.org
List-ID:

This is version 2 of the series at
https://www.mail-archive.com/netdev@vger.kernel.org/msg213829.html

Review comments addressed
Patch 4:
 - make sure to always sock_put m_rs even if there is no znotifier.
 - major rewrite of notification, resulting in much simplification.
Patch 5:
 - remove unused data_len argument to rds_rm_size;
 - unmap as necessary if we fail in the middle of zerocopy setup
Patch 7:
 - restructured do_recv_completion to avoid excessive code re-indent
 - on-stack allocation of cmsghdr for cookie in do_sendmsg
 - Additional verification: Verify ncookies <= MAX_.., verify
   ret == ncookies * sizeof(uint32_t)

A brief overview of this feature follows.

This patch series provides support for MSG_ZEROCOPY on a PF_RDS
socket, based on the APIs and infrastructure added by
f214f915e7db ("tcp: enable MSG_ZEROCOPY").

For single-threaded rds-stress testing using rds-tcp with the ixgbe
driver and 1M message sizes (-a 1M -q 1M), preliminary results show
a significant reduction in latency: about 90 usec with zerocopy,
compared with 200 usec without zerocopy.

This patchset modifies the above for zerocopy in the following manner.
 - if the MSG_ZEROCOPY flag is specified with rds_sendmsg(), and,
 - if the SO_ZEROCOPY socket option has been set on the PF_RDS socket,
application pages sent down with rds_sendmsg are pinned.
The pinning uses the accounting infrastructure added by
a91dbff551a6 ("sock: ulimit on MSG_ZEROCOPY pages"). The message is
unpinned when all references to the message go down to 0, and the
message is freed by rds_message_purge.

A multithreaded application using this infrastructure must send down
a unique 32-bit cookie as ancillary data with each sendmsg invocation.
The format of this ancillary data is described in Patch 5 of the
series. The cookie is passed up to the application on the
sk_error_queue when the message is unpinned, indicating to the
application that it is now safe to free/reuse the message buffer.
The details of the completion notification are provided in Patch 4
of this series.

Sowmini Varadhan (7):
  skbuff: export mm_[un]account_pinned_pages for other modules
  rds: hold a sock ref from rds_message to the rds_sock
  sock: permit SO_ZEROCOPY on PF_RDS socket
  rds: support for zcopy completion notification
  rds: zerocopy Tx support.
  selftests/net: add support for PF_RDS sockets
  selftests/net: add zerocopy support for PF_RDS test case

 include/linux/skbuff.h                     |   3 +
 include/uapi/linux/errqueue.h              |   2 +
 include/uapi/linux/rds.h                   |   1 +
 net/core/skbuff.c                          |   6 +-
 net/core/sock.c                            |  25 +++---
 net/rds/af_rds.c                           |   2 +
 net/rds/message.c                          | 132 ++++++++++++++++++++++++++-
 net/rds/rds.h                              |  17 ++++-
 net/rds/recv.c                             |   2 +
 net/rds/send.c                             |  51 ++++++++---
 tools/testing/selftests/net/msg_zerocopy.c | 133 ++++++++++++++++++++++++++-
 11 files changed, 339 insertions(+), 35 deletions(-)
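
For illustration only, here is a rough userspace sketch of the
application side described in the overview: opting in with
SO_ZEROCOPY, attaching the 32-bit cookie as ancillary data, and the
size check on a completion. The helper names (zc_enable,
zc_attach_cookie, zc_completion_ok) are hypothetical, and the
fallback values for SOL_RDS, RDS_CMSG_ZCOPY_COOKIE, SO_ZEROCOPY and
MSG_ZEROCOPY are assumptions; the authoritative definitions and the
cmsg format are in Patch 5 and the uapi headers. The upper bound on
ncookies is passed in as a parameter rather than assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed fallback values; prefer the definitions from the uapi headers. */
#ifndef SOL_RDS
#define SOL_RDS 276
#endif
#ifndef RDS_CMSG_ZCOPY_COOKIE
#define RDS_CMSG_ZCOPY_COOKIE 12	/* assumption, see Patch 5 */
#endif
#ifndef SO_ZEROCOPY
#define SO_ZEROCOPY 60
#endif
#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000
#endif

/* Hypothetical helper: opt the socket in to zerocopy sends. */
int zc_enable(int fd)
{
	int one = 1;

	return setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one));
}

/*
 * Hypothetical helper: attach a 32-bit cookie to msg as ancillary data.
 * buf must be suitably aligned for struct cmsghdr.
 * Returns 0 on success, -1 if buf is too small.
 */
int zc_attach_cookie(struct msghdr *msg, void *buf, size_t buflen,
		     uint32_t cookie)
{
	struct cmsghdr *cm;

	assert(msg != NULL && buf != NULL);
	if (buflen < CMSG_SPACE(sizeof(cookie)))
		return -1;

	memset(buf, 0, CMSG_SPACE(sizeof(cookie)));
	msg->msg_control = buf;
	msg->msg_controllen = CMSG_SPACE(sizeof(cookie));

	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_level = SOL_RDS;
	cm->cmsg_type = RDS_CMSG_ZCOPY_COOKIE;
	cm->cmsg_len = CMSG_LEN(sizeof(cookie));
	memcpy(CMSG_DATA(cm), &cookie, sizeof(cookie));
	return 0;
}

/*
 * Hypothetical helper: the two checks listed under Patch 7. ret is the
 * number of payload bytes read from the error queue, max_cookies is
 * supplied by the caller.
 */
int zc_completion_ok(uint32_t ncookies, uint32_t max_cookies, long ret)
{
	return ncookies <= max_cookies && ret >= 0 &&
	       (size_t)ret == (size_t)ncookies * sizeof(uint32_t);
}
```

An application would call zc_enable() once on the PF_RDS socket, then
call zc_attach_cookie() on each msghdr before sendmsg(fd, &msg,
MSG_ZEROCOPY), and keep the buffer untouched until its cookie comes
back on the error queue.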