From: Stefano Stabellini <sstabellini@kernel.org>
To: xen-devel@lists.xen.org
Cc: linux-kernel@vger.kernel.org, sstabellini@kernel.org, jgross@suse.com,
	boris.ostrovsky@oracle.com, Stefano Stabellini <stefano@aporeto.com>
Subject: [PATCH v8 09/13] xen/pvcalls: implement sendmsg
Date: Mon, 30 Oct 2017 15:40:59 -0700
Message-ID: <1509403263-15414-9-git-send-email-sstabellini@kernel.org>
In-Reply-To: <1509403263-15414-1-git-send-email-sstabellini@kernel.org>

Send data to an active socket by copying data to the "out" ring. Take
the active socket out_mutex so that only one function can access the
ring at any given time.

If not enough room is available on the ring, rather than returning
immediately or sleep-waiting, spin for up to 5000 cycles. This small
optimization turns out to improve performance significantly.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
---
 drivers/xen/pvcalls-front.c | 121 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/xen/pvcalls-front.h |   3 ++
 2 files changed, 124 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index d781ac4..7672578 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -29,6 +29,7 @@
 #define PVCALLS_INVALID_ID UINT_MAX
 #define PVCALLS_RING_ORDER XENBUS_MAX_RING_GRANT_ORDER
 #define PVCALLS_NR_RSP_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE)
+#define PVCALLS_FRONT_MAX_SPIN 5000

 struct pvcalls_bedata {
 	struct xen_pvcalls_front_ring ring;
@@ -99,6 +100,23 @@ static inline int get_request(struct pvcalls_bedata *bedata, int *req_id)
 	return 0;
 }

+static bool pvcalls_front_write_todo(struct sock_mapping *map)
+{
+	struct pvcalls_data_intf *intf = map->active.ring;
+	RING_IDX cons, prod, size = XEN_FLEX_RING_SIZE(PVCALLS_RING_ORDER);
+	int32_t error;
+
+	error = intf->out_error;
+	if (error == -ENOTCONN)
+		return false;
+	if (error != 0)
+		return true;
+
+	cons = intf->out_cons;
+	prod = intf->out_prod;
+	return !!(size - pvcalls_queued(prod, cons, size));
+}
+
 static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
 {
 	struct xenbus_device *dev = dev_id;
@@ -363,6 +381,109 @@ int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr,
 	return ret;
 }

+static int __write_ring(struct pvcalls_data_intf *intf,
+			struct pvcalls_data *data,
+			struct iov_iter *msg_iter,
+			int len)
+{
+	RING_IDX cons, prod, size, masked_prod, masked_cons;
+	RING_IDX array_size = XEN_FLEX_RING_SIZE(PVCALLS_RING_ORDER);
+	int32_t error;
+
+	error = intf->out_error;
+	if (error < 0)
+		return error;
+	cons = intf->out_cons;
+	prod = intf->out_prod;
+	/* read indexes before continuing */
+	virt_mb();
+
+	size = pvcalls_queued(prod, cons, array_size);
+	if (size >= array_size)
+		return -EINVAL;
+	if (len > array_size - size)
+		len = array_size - size;
+
+	masked_prod = pvcalls_mask(prod, array_size);
+	masked_cons = pvcalls_mask(cons, array_size);
+
+	if (masked_prod < masked_cons) {
+		len = copy_from_iter(data->out + masked_prod, len, msg_iter);
+	} else {
+		if (len > array_size - masked_prod) {
+			int ret = copy_from_iter(data->out + masked_prod,
+				       array_size - masked_prod, msg_iter);
+			if (ret != array_size - masked_prod) {
+				len = ret;
+				goto out;
+			}
+			len = ret + copy_from_iter(data->out, len - ret,
+						   msg_iter);
+		} else {
+			len = copy_from_iter(data->out + masked_prod, len,
+					     msg_iter);
+		}
+	}
+out:
+	/* write to ring before updating pointer */
+	virt_wmb();
+	intf->out_prod += len;
+
+	return len;
+}
+
+int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
+			  size_t len)
+{
+	struct pvcalls_bedata *bedata;
+	struct sock_mapping *map;
+	int sent, tot_sent = 0;
+	int count = 0, flags;
+
+	flags = msg->msg_flags;
+	if (flags & (MSG_CONFIRM|MSG_DONTROUTE|MSG_EOR|MSG_OOB))
+		return -EOPNOTSUPP;
+
+	pvcalls_enter();
+	if (!pvcalls_front_dev) {
+		pvcalls_exit();
+		return -ENOTCONN;
+	}
+	bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+
+	map = (struct sock_mapping *) sock->sk->sk_send_head;
+	if (!map) {
+		pvcalls_exit();
+		return -ENOTSOCK;
+	}
+
+	mutex_lock(&map->active.out_mutex);
+	if ((flags & MSG_DONTWAIT) && !pvcalls_front_write_todo(map)) {
+		mutex_unlock(&map->active.out_mutex);
+		pvcalls_exit();
+		return -EAGAIN;
+	}
+	if (len > INT_MAX)
+		len = INT_MAX;
+
+again:
+	count++;
+	sent = __write_ring(map->active.ring,
+			    &map->active.data, &msg->msg_iter,
+			    len);
+	if (sent > 0) {
+		len -= sent;
+		tot_sent += sent;
+		notify_remote_via_irq(map->active.irq);
+	}
+	if (sent >= 0 && len > 0 && count < PVCALLS_FRONT_MAX_SPIN)
+		goto again;
+	if (sent < 0)
+		tot_sent = sent;
+
+	mutex_unlock(&map->active.out_mutex);
+	pvcalls_exit();
+	return tot_sent;
+}
+
 int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr,
 		       int addr_len)
 {
 	struct pvcalls_bedata *bedata;
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index ab4f1da..d937c24 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -13,5 +13,8 @@ int pvcalls_front_bind(struct socket *sock,
 int pvcalls_front_accept(struct socket *sock,
 			 struct socket *newsock,
 			 int flags);
+int pvcalls_front_sendmsg(struct socket *sock,
+			  struct msghdr *msg,
+			  size_t len);
 #endif
--
1.9.1