All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] sendmmsg fixes
@ 2011-08-05  0:07 Anton Blanchard
  2011-08-05  0:07 ` [PATCH 1/3] net: sendmmsg should only return an error if no messages were sent Anton Blanchard
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Anton Blanchard @ 2011-08-05  0:07 UTC (permalink / raw)
  To: penguin-kernel, davem, eparis, casey, mjt; +Cc: netdev, linux-security-module

Here are the current set of sendmmsg fixes that pass my test
cases. Any review would be much appreciated.

Anton


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] net: sendmmsg should only return an error if no messages were sent
  2011-08-05  0:07 [PATCH 0/3] sendmmsg fixes Anton Blanchard
@ 2011-08-05  0:07 ` Anton Blanchard
  2011-08-05  3:57   ` Tetsuo Handa
  2011-08-05  0:07 ` [PATCH 2/3] net: Cap number of elements for sendmmsg Anton Blanchard
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Anton Blanchard @ 2011-08-05  0:07 UTC (permalink / raw)
  To: penguin-kernel, davem, eparis, casey, mjt; +Cc: netdev, linux-security-module

[-- Attachment #1: sendmmsg_errno --]
[-- Type: text/plain, Size: 2002 bytes --]

sendmmsg uses a similar error return strategy as recvmmsg but it
turns out to be a confusing way to communicate errors.

The current code stores the error code away and returns it on the next
sendmmsg call. This means a call with completely valid arguments could
get an error from a previous call.

Change things so we only return an error if no datagrams could be sent.
If less than the requested number of messages were sent, the application
must retry starting at the first failed one and if the problem is
persistent the error will be returned.

This matches the behaviour of other syscalls like read/write - it
is not an error if less than the requested number of elements are sent.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable <stable@kernel.org> [3.0+]
---

Index: linux-net/net/socket.c
===================================================================
--- linux-net.orig/net/socket.c	2011-08-04 20:37:53.959996025 +1000
+++ linux-net/net/socket.c	2011-08-05 09:31:11.724704078 +1000
@@ -2005,12 +2005,9 @@ int __sys_sendmmsg(int fd, struct mmsghd
 	if (!sock)
 		return err;
 
-	err = sock_error(sock->sk);
-	if (err)
-		goto out_put;
-
 	entry = mmsg;
 	compat_entry = (struct compat_mmsghdr __user *)mmsg;
+	err = 0;
 
 	while (datagrams < vlen) {
 		/*
@@ -2037,29 +2034,11 @@ int __sys_sendmmsg(int fd, struct mmsghd
 		++datagrams;
 	}
 
-out_put:
 	fput_light(sock->file, fput_needed);
 
-	if (err == 0)
-		return datagrams;
-
-	if (datagrams != 0) {
-		/*
-		 * We may send less entries than requested (vlen) if the
-		 * sock is non blocking...
-		 */
-		if (err != -EAGAIN) {
-			/*
-			 * ... or if sendmsg returns an error after we
-			 * send some datagrams, where we record the
-			 * error to return on the next call or if the
-			 * app asks about it using getsockopt(SO_ERROR).
-			 */
-			sock->sk->sk_err = -err;
-		}
-
+	/* We only return an error if no datagrams were able to be sent */
+	if (datagrams != 0)
 		return datagrams;
-	}
 
 	return err;
 }



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/3] net: Cap number of elements for sendmmsg
  2011-08-05  0:07 [PATCH 0/3] sendmmsg fixes Anton Blanchard
  2011-08-05  0:07 ` [PATCH 1/3] net: sendmmsg should only return an error if no messages were sent Anton Blanchard
@ 2011-08-05  0:07 ` Anton Blanchard
  2011-08-05  4:29   ` Tetsuo Handa
  2011-08-05  0:07 ` [PATCH 3/3] net: Fix security_socket_sendmsg() bypass problem Anton Blanchard
  2011-08-05 10:31 ` [PATCH 0/3] sendmmsg fixes David Miller
  3 siblings, 1 reply; 12+ messages in thread
From: Anton Blanchard @ 2011-08-05  0:07 UTC (permalink / raw)
  To: penguin-kernel, davem, eparis, casey, mjt; +Cc: netdev, linux-security-module

[-- Attachment #1: cap_sendmmsg --]
[-- Type: text/plain, Size: 871 bytes --]

To limit the amount of time we can spend in sendmmsg, cap the
number of elements to UIO_MAXIOV (currently 1024). 

For error handling an application using sendmmsg needs to retry at
the first unsent message, so capping is simpler and requires less
application logic than returning EINVAL.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable <stable@kernel.org> [3.0+]
---

Index: linux-net/net/socket.c
===================================================================
--- linux-net.orig/net/socket.c	2011-08-04 20:16:58.667081839 +1000
+++ linux-net/net/socket.c	2011-08-04 20:37:32.399602006 +1000
@@ -1999,6 +1999,9 @@ int __sys_sendmmsg(int fd, struct mmsghd
 	struct compat_mmsghdr __user *compat_entry;
 	struct msghdr msg_sys;
 
+	if (vlen > UIO_MAXIOV)
+		vlen = UIO_MAXIOV;
+
 	datagrams = 0;
 
 	sock = sockfd_lookup_light(fd, &err, &fput_needed);



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 3/3] net: Fix security_socket_sendmsg() bypass problem.
  2011-08-05  0:07 [PATCH 0/3] sendmmsg fixes Anton Blanchard
  2011-08-05  0:07 ` [PATCH 1/3] net: sendmmsg should only return an error if no messages were sent Anton Blanchard
  2011-08-05  0:07 ` [PATCH 2/3] net: Cap number of elements for sendmmsg Anton Blanchard
@ 2011-08-05  0:07 ` Anton Blanchard
  2011-08-05  0:38   ` Casey Schaufler
  2011-08-05 10:31 ` [PATCH 0/3] sendmmsg fixes David Miller
  3 siblings, 1 reply; 12+ messages in thread
From: Anton Blanchard @ 2011-08-05  0:07 UTC (permalink / raw)
  To: penguin-kernel, davem, eparis, casey, mjt; +Cc: netdev, linux-security-module

[-- Attachment #1: sendmmsg_v2 --]
[-- Type: text/plain, Size: 4180 bytes --]

From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

The sendmmsg() introduced by commit 228e548e "net: Add sendmmsg socket system
call" is capable of sending to multiple different destination addresses.

SMACK is using destination's address for checking sendmsg() permission.
However, security_socket_sendmsg() is called for only once even if multiple
different destination addresses are passed to sendmmsg().

Therefore, we need to call security_socket_sendmsg() for each destination
address rather than only the first destination address.

Since calling security_socket_sendmsg() every time when only single destination
address was passed to sendmmsg() is a waste of time, omit calling
security_socket_sendmsg() unless destination address of previous datagram and
that of current datagram differs.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-off: Anton Blanchard <anton@samba.org>
Cc: stable <stable@kernel.org> [3.0+]
---

Index: linux-net/net/socket.c
===================================================================
--- linux-net.orig/net/socket.c	2011-08-05 09:31:27.000000000 +1000
+++ linux-net/net/socket.c	2011-08-05 09:32:46.146436405 +1000
@@ -1871,8 +1871,14 @@ SYSCALL_DEFINE2(shutdown, int, fd, int,
 #define COMPAT_NAMELEN(msg)	COMPAT_MSG(msg, msg_namelen)
 #define COMPAT_FLAGS(msg)	COMPAT_MSG(msg, msg_flags)
 
+struct used_address {
+	struct sockaddr_storage name;
+	unsigned int name_len;
+};
+
 static int __sys_sendmsg(struct socket *sock, struct msghdr __user *msg,
-			 struct msghdr *msg_sys, unsigned flags, int nosec)
+			 struct msghdr *msg_sys, unsigned flags,
+			 struct used_address *used_address)
 {
 	struct compat_msghdr __user *msg_compat =
 	    (struct compat_msghdr __user *)msg;
@@ -1953,8 +1959,28 @@ static int __sys_sendmsg(struct socket *
 
 	if (sock->file->f_flags & O_NONBLOCK)
 		msg_sys->msg_flags |= MSG_DONTWAIT;
-	err = (nosec ? sock_sendmsg_nosec : sock_sendmsg)(sock, msg_sys,
-							  total_len);
+	/*
+	 * If this is sendmmsg() and current destination address is same as
+	 * previously succeeded address, omit asking LSM's decision.
+	 * used_address->name_len is initialized to UINT_MAX so that the first
+	 * destination address never matches.
+	 */
+	if (used_address && used_address->name_len == msg_sys->msg_namelen &&
+	    !memcmp(&used_address->name, msg->msg_name,
+		    used_address->name_len)) {
+		err = sock_sendmsg_nosec(sock, msg_sys, total_len);
+		goto out_freectl;
+	}
+	err = sock_sendmsg(sock, msg_sys, total_len);
+	/*
+	 * If this is sendmmsg() and sending to current destination address was
+	 * successful, remember it.
+	 */
+	if (used_address && err >= 0) {
+		used_address->name_len = msg_sys->msg_namelen;
+		memcpy(&used_address->name, msg->msg_name,
+		       used_address->name_len);
+	}
 
 out_freectl:
 	if (ctl_buf != ctl)
@@ -1979,7 +2005,7 @@ SYSCALL_DEFINE3(sendmsg, int, fd, struct
 	if (!sock)
 		goto out;
 
-	err = __sys_sendmsg(sock, msg, &msg_sys, flags, 0);
+	err = __sys_sendmsg(sock, msg, &msg_sys, flags, NULL);
 
 	fput_light(sock->file, fput_needed);
 out:
@@ -1998,6 +2024,7 @@ int __sys_sendmmsg(int fd, struct mmsghd
 	struct mmsghdr __user *entry;
 	struct compat_mmsghdr __user *compat_entry;
 	struct msghdr msg_sys;
+	struct used_address used_address;
 
 	if (vlen > UIO_MAXIOV)
 		vlen = UIO_MAXIOV;
@@ -2008,24 +2035,22 @@ int __sys_sendmmsg(int fd, struct mmsghd
 	if (!sock)
 		return err;
 
+	used_address.name_len = UINT_MAX;
 	entry = mmsg;
 	compat_entry = (struct compat_mmsghdr __user *)mmsg;
 	err = 0;
 
 	while (datagrams < vlen) {
-		/*
-		 * No need to ask LSM for more than the first datagram.
-		 */
 		if (MSG_CMSG_COMPAT & flags) {
 			err = __sys_sendmsg(sock, (struct msghdr __user *)compat_entry,
-					    &msg_sys, flags, datagrams);
+					    &msg_sys, flags, &used_address);
 			if (err < 0)
 				break;
 			err = __put_user(err, &compat_entry->msg_len);
 			++compat_entry;
 		} else {
 			err = __sys_sendmsg(sock, (struct msghdr __user *)entry,
-					    &msg_sys, flags, datagrams);
+					    &msg_sys, flags, &used_address);
 			if (err < 0)
 				break;
 			err = put_user(err, &entry->msg_len);



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] net: Fix security_socket_sendmsg() bypass problem.
  2011-08-05  0:07 ` [PATCH 3/3] net: Fix security_socket_sendmsg() bypass problem Anton Blanchard
@ 2011-08-05  0:38   ` Casey Schaufler
  0 siblings, 0 replies; 12+ messages in thread
From: Casey Schaufler @ 2011-08-05  0:38 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: penguin-kernel, davem, eparis, mjt, netdev,
	linux-security-module, Casey Schaufler

On 8/4/2011 5:07 PM, Anton Blanchard wrote:
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>
> The sendmmsg() introduced by commit 228e548e "net: Add sendmmsg socket system
> call" is capable of sending to multiple different destination addresses.
>
> SMACK is using destination's address for checking sendmsg() permission.
> However, security_socket_sendmsg() is called for only once even if multiple
> different destination addresses are passed to sendmmsg().
>
> Therefore, we need to call security_socket_sendmsg() for each destination
> address rather than only the first destination address.
>
> Since calling security_socket_sendmsg() every time when only single destination
> address was passed to sendmmsg() is a waste of time, omit calling
> security_socket_sendmsg() unless destination address of previous datagram and
> that of current datagram differs.

Thank you.

>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Acked-off: Anton Blanchard <anton@samba.org>
> Cc: stable <stable@kernel.org> [3.0+]
> ---
>
> Index: linux-net/net/socket.c
> ===================================================================
> --- linux-net.orig/net/socket.c	2011-08-05 09:31:27.000000000 +1000
> +++ linux-net/net/socket.c	2011-08-05 09:32:46.146436405 +1000
> @@ -1871,8 +1871,14 @@ SYSCALL_DEFINE2(shutdown, int, fd, int,
>  #define COMPAT_NAMELEN(msg)	COMPAT_MSG(msg, msg_namelen)
>  #define COMPAT_FLAGS(msg)	COMPAT_MSG(msg, msg_flags)
>  
> +struct used_address {
> +	struct sockaddr_storage name;
> +	unsigned int name_len;
> +};
> +
>  static int __sys_sendmsg(struct socket *sock, struct msghdr __user *msg,
> -			 struct msghdr *msg_sys, unsigned flags, int nosec)
> +			 struct msghdr *msg_sys, unsigned flags,
> +			 struct used_address *used_address)
>  {
>  	struct compat_msghdr __user *msg_compat =
>  	    (struct compat_msghdr __user *)msg;
> @@ -1953,8 +1959,28 @@ static int __sys_sendmsg(struct socket *
>  
>  	if (sock->file->f_flags & O_NONBLOCK)
>  		msg_sys->msg_flags |= MSG_DONTWAIT;
> -	err = (nosec ? sock_sendmsg_nosec : sock_sendmsg)(sock, msg_sys,
> -							  total_len);
> +	/*
> +	 * If this is sendmmsg() and current destination address is same as
> +	 * previously succeeded address, omit asking LSM's decision.
> +	 * used_address->name_len is initialized to UINT_MAX so that the first
> +	 * destination address never matches.
> +	 */
> +	if (used_address && used_address->name_len == msg_sys->msg_namelen &&
> +	    !memcmp(&used_address->name, msg->msg_name,
> +		    used_address->name_len)) {
> +		err = sock_sendmsg_nosec(sock, msg_sys, total_len);
> +		goto out_freectl;
> +	}
> +	err = sock_sendmsg(sock, msg_sys, total_len);
> +	/*
> +	 * If this is sendmmsg() and sending to current destination address was
> +	 * successful, remember it.
> +	 */
> +	if (used_address && err >= 0) {
> +		used_address->name_len = msg_sys->msg_namelen;
> +		memcpy(&used_address->name, msg->msg_name,
> +		       used_address->name_len);
> +	}
>  
>  out_freectl:
>  	if (ctl_buf != ctl)
> @@ -1979,7 +2005,7 @@ SYSCALL_DEFINE3(sendmsg, int, fd, struct
>  	if (!sock)
>  		goto out;
>  
> -	err = __sys_sendmsg(sock, msg, &msg_sys, flags, 0);
> +	err = __sys_sendmsg(sock, msg, &msg_sys, flags, NULL);
>  
>  	fput_light(sock->file, fput_needed);
>  out:
> @@ -1998,6 +2024,7 @@ int __sys_sendmmsg(int fd, struct mmsghd
>  	struct mmsghdr __user *entry;
>  	struct compat_mmsghdr __user *compat_entry;
>  	struct msghdr msg_sys;
> +	struct used_address used_address;
>  
>  	if (vlen > UIO_MAXIOV)
>  		vlen = UIO_MAXIOV;
> @@ -2008,24 +2035,22 @@ int __sys_sendmmsg(int fd, struct mmsghd
>  	if (!sock)
>  		return err;
>  
> +	used_address.name_len = UINT_MAX;
>  	entry = mmsg;
>  	compat_entry = (struct compat_mmsghdr __user *)mmsg;
>  	err = 0;
>  
>  	while (datagrams < vlen) {
> -		/*
> -		 * No need to ask LSM for more than the first datagram.
> -		 */
>  		if (MSG_CMSG_COMPAT & flags) {
>  			err = __sys_sendmsg(sock, (struct msghdr __user *)compat_entry,
> -					    &msg_sys, flags, datagrams);
> +					    &msg_sys, flags, &used_address);
>  			if (err < 0)
>  				break;
>  			err = __put_user(err, &compat_entry->msg_len);
>  			++compat_entry;
>  		} else {
>  			err = __sys_sendmsg(sock, (struct msghdr __user *)entry,
> -					    &msg_sys, flags, datagrams);
> +					    &msg_sys, flags, &used_address);
>  			if (err < 0)
>  				break;
>  			err = put_user(err, &entry->msg_len);
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] net: sendmmsg should only return an error if no messages were sent
  2011-08-05  0:07 ` [PATCH 1/3] net: sendmmsg should only return an error if no messages were sent Anton Blanchard
@ 2011-08-05  3:57   ` Tetsuo Handa
  2011-08-05  8:20     ` Steven Whitehouse
  0 siblings, 1 reply; 12+ messages in thread
From: Tetsuo Handa @ 2011-08-05  3:57 UTC (permalink / raw)
  To: acme, rdenis, swhiteho; +Cc: netdev

Anton Blanchard wrote:
> sendmmsg uses a similar error return strategy as recvmmsg but it
> turns out to be a confusing way to communicate errors.
> 
> The current code stores the error code away and returns it on the next
> sendmmsg call. This means a call with completely valid arguments could
> get an error from a previous call.
> 
> Change things so we only return an error if no datagrams could be sent.
> If less than the requested number of messages were sent, the application
> must retry starting at the first failed one and if the problem is
> persistent the error will be returned.
> 
> This matches the behaviour of other syscalls like read/write - it
> is not an error if less than the requested number of elements are sent.

OK. David S. Miller suggested this behavior and Anton Blanchard agreed with
this behavior.

Quoting from commit a2e27255 "net: Introduce recvmmsg socket syscall":
| . R?mi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
|   datagrams and then recvmsg returns an error, recvmmsg will return
|   the successfully received datagrams, store the error and return it
|   in the next call.

R?mi Denis-Courmont, Steven Whitehouse and Arnaldo Carvalho de Melo, do you
want to change recvmmsg()'s behaviour as well?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] net: Cap number of elements for sendmmsg
  2011-08-05  0:07 ` [PATCH 2/3] net: Cap number of elements for sendmmsg Anton Blanchard
@ 2011-08-05  4:29   ` Tetsuo Handa
  2011-08-05  4:46     ` Anton Blanchard
  0 siblings, 1 reply; 12+ messages in thread
From: Tetsuo Handa @ 2011-08-05  4:29 UTC (permalink / raw)
  To: anton, acme; +Cc: netdev, linux-security-module, davem, eparis, casey, mjt

Anton Blanchard wrote:
> To limit the amount of time we can spend in sendmmsg, cap the
> number of elements to UIO_MAXIOV (currently 1024). 
> 
> For error handling an application using sendmmsg needs to retry at
> the first unsent message, so capping is simpler and requires less
> application logic than returning EINVAL.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>
> Cc: stable <stable@kernel.org> [3.0+]

I think 1024 is a reasonable value.
But I also worry that programmers may wish to send more.


Apart from the upper limit for vlen argument of sendmmsg()/recvmmsg(),
we need to deal with stall problem (described below).



It may be possible to abuse sendmmsg() which was introduced in Linux 3.0 and
recvmmsg() which was introduced in Linux 2.6.33 for triggering CPU stall
warning.

I ran below program

---------- Test program start ----------
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <asm/unistd.h>
#include <errno.h>

#ifndef __NR_sendmmsg
#if defined( __PPC__)
#define __NR_sendmmsg	349
#elif defined(__x86_64__)
#define __NR_sendmmsg	307
#elif defined(__i386__)
#define __NR_sendmmsg	345
#else
#error __NR_sendmmsg not defined
#endif
#endif

struct mmsghdr {
	struct msghdr msg_hdr;
	unsigned int msg_len;
};

static inline int sendmmsg(int fd, struct mmsghdr *mmsg, unsigned vlen,
			   unsigned flags)
{
	return syscall(__NR_sendmmsg, fd, mmsg, vlen, flags, NULL);
}

#define NUMBATCH (1048576 * 64)

int main(int argc, char *argv[])
{
	const int fd = socket(AF_INET, SOCK_DGRAM, 0);
	struct mmsghdr *datagrams;
	unsigned int i;
	struct iovec iovec = { };
	struct sockaddr_in addr = { };
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(10000);
	datagrams = calloc(sizeof(*datagrams), NUMBATCH);
	for (i = 0; i < NUMBATCH; ++i) {
		datagrams[i].msg_hdr.msg_iov = &iovec;
		datagrams[i].msg_hdr.msg_iovlen = 1;
		datagrams[i].msg_hdr.msg_name = &addr;
		datagrams[i].msg_hdr.msg_namelen = sizeof(addr);
	}
	printf("Calling sendmmsg()\n");
	printf("%d\n", sendmmsg(fd, datagrams, NUMBATCH, 0));
	printf("Done\n");
	return 0;
}
---------- Test program end ----------

and got below output.

# time ./a.out
Calling sendmmsg()
INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
67108864
Done

real    2m48.736s
user    0m0.128s
sys     0m16.489s



If this application created threads that matches number of CPUs available, and
entered into sendmmsg(), the machine will hang for many seconds.

Also, signals are ignored when this application is in sendmmsg(). That is,
if this application is holding much RAM (like above program) and is selected by
OOM-killer, this application cannot be killed until returns from sendmmsg().

Applying below patch solves the stall message and signal delaying problem.
----------------------------------------
[PATCH] net: Fix sendmmsg() stall problem.

If the caller passed a huge value (e.g. 64M) to vlen argument of sendmmsg(),
the caller triggers "INFO: rcu_sched_state detected stall on CPU X" message
because there is no chance to call scheduler. Thus give a chance to call
scheduler and also check for pending signal.

Also, if the caller passed a value where IS_ERR() returns true (e.g. UINT_MAX),
the caller will get EOF and errno will be set to (e.g.) EPERM when all
datagrams are successfully sent. Thus, limit the max value of vlen to INT_MAX.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [3.0+]
---
 net/socket.c |    5 +++++
 1 file changed, 5 insertions(+)

--- linux-3.0.orig/net/socket.c
+++ linux-3.0/net/socket.c
@@ -1999,6 +1999,8 @@ int __sys_sendmmsg(int fd, struct mmsghd
 	struct compat_mmsghdr __user *compat_entry;
 	struct msghdr msg_sys;
 
+	if ((int) vlen < 0)
+		return -EINVAL;
 	datagrams = 0;
 
 	sock = sockfd_lookup_light(fd, &err, &fput_needed);
@@ -2035,6 +2037,9 @@ int __sys_sendmmsg(int fd, struct mmsghd
 		if (err)
 			break;
 		++datagrams;
+		cond_resched();
+		if (signal_pending(current))
+			break;
 	}
 
 out_put:
----------------------------------------

The situation is similar regaring recvmmsg(). Although recvmmsg() less likely
stalls than sendmmsg() does, it could happen if a huge number of datagrams are
in the socket's receive queue and the caller attempted to fetch all of them at
once. Thus, we may want below patch as well.

----------------------------------------
[PATCH] net: Fix recvmmsg() stall problem.

If the caller passed a huge value to vlen argument of recvmmsg() and there are
enough datagrams in the socket's receive queue, trying to pick up all at once
may trigger "INFO: rcu_sched_state detected stall on CPU X" message because
there is no chance to call scheduler. Thus give a chance to call scheduler and
also check for pending signal.

Also, if the caller passed a value where IS_ERR() returns true (e.g. UINT_MAX),
the caller will get EOF and errno will be set to (e.g.) EPERM when all
datagrams are successfully received. Thus, limit the max value of vlen to
INT_MAX.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [2.6.33+]
---
 net/socket.c |    5 +++++
 1 file changed, 5 insertions(+)

--- linux-3.0.orig/net/socket.c
+++ linux-3.0/net/socket.c
@@ -2204,6 +2204,8 @@ int __sys_recvmmsg(int fd, struct mmsghd
 	struct msghdr msg_sys;
 	struct timespec end_time;
 
+	if ((int) vlen < 0)
+		return -EINVAL;
 	if (timeout &&
 	    poll_select_set_timeout(&end_time, timeout->tv_sec,
 				    timeout->tv_nsec))
@@ -2247,6 +2249,9 @@ int __sys_recvmmsg(int fd, struct mmsghd
 		if (err)
 			break;
 		++datagrams;
+		cond_resched();
+		if (signal_pending(current))
+			break;
 
 		/* MSG_WAITFORONE turns on MSG_DONTWAIT after one packet */
 		if (flags & MSG_WAITFORONE)
----------------------------------------

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] net: Cap number of elements for sendmmsg
  2011-08-05  4:29   ` Tetsuo Handa
@ 2011-08-05  4:46     ` Anton Blanchard
  2011-08-05  5:50       ` Tetsuo Handa
  0 siblings, 1 reply; 12+ messages in thread
From: Anton Blanchard @ 2011-08-05  4:46 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: acme, netdev, linux-security-module, davem, eparis, casey, mjt


Hi,

> I think 1024 is a reasonable value.
> But I also worry that programmers may wish to send more.
> 
> 
> Apart from the upper limit for vlen argument of sendmmsg()/recvmmsg(),
> we need to deal with stall problem (described below).

Capping vlen at 1024 should prevent that. Your patch does a signed
comparison which just reduces the maximum value by 1 bit doesn't it?

Keep in mind each element could have up to 1024 iovec entries at worst
case, so I think 1024 is a sane upper max.

Anton

> It may be possible to abuse sendmmsg() which was introduced in Linux
> 3.0 and recvmmsg() which was introduced in Linux 2.6.33 for
> triggering CPU stall warning.
> 
> I ran below program
> 
> ---------- Test program start ----------
> #include <string.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <netdb.h>
> #include <sys/types.h>
> #include <sys/ioctl.h>
> #include <sys/socket.h>
> #include <asm/unistd.h>
> #include <errno.h>
> 
> #ifndef __NR_sendmmsg
> #if defined( __PPC__)
> #define __NR_sendmmsg	349
> #elif defined(__x86_64__)
> #define __NR_sendmmsg	307
> #elif defined(__i386__)
> #define __NR_sendmmsg	345
> #else
> #error __NR_sendmmsg not defined
> #endif
> #endif
> 
> struct mmsghdr {
> 	struct msghdr msg_hdr;
> 	unsigned int msg_len;
> };
> 
> static inline int sendmmsg(int fd, struct mmsghdr *mmsg, unsigned
> vlen, unsigned flags)
> {
> 	return syscall(__NR_sendmmsg, fd, mmsg, vlen, flags, NULL);
> }
> 
> #define NUMBATCH (1048576 * 64)
> 
> int main(int argc, char *argv[])
> {
> 	const int fd = socket(AF_INET, SOCK_DGRAM, 0);
> 	struct mmsghdr *datagrams;
> 	unsigned int i;
> 	struct iovec iovec = { };
> 	struct sockaddr_in addr = { };
> 	addr.sin_family = AF_INET;
> 	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
> 	addr.sin_port = htons(10000);
> 	datagrams = calloc(sizeof(*datagrams), NUMBATCH);
> 	for (i = 0; i < NUMBATCH; ++i) {
> 		datagrams[i].msg_hdr.msg_iov = &iovec;
> 		datagrams[i].msg_hdr.msg_iovlen = 1;
> 		datagrams[i].msg_hdr.msg_name = &addr;
> 		datagrams[i].msg_hdr.msg_namelen = sizeof(addr);
> 	}
> 	printf("Calling sendmmsg()\n");
> 	printf("%d\n", sendmmsg(fd, datagrams, NUMBATCH, 0));
> 	printf("Done\n");
> 	return 0;
> }
> ---------- Test program end ----------
> 
> and got below output.
> 
> # time ./a.out
> Calling sendmmsg()
> INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
> 67108864
> Done
> 
> real    2m48.736s
> user    0m0.128s
> sys     0m16.489s
> 
> 
> 
> If this application created threads that matches number of CPUs
> available, and entered into sendmmsg(), the machine will hang for
> many seconds.
> 
> Also, signals are ignored when this application is in sendmmsg().
> That is, if this application is holding much RAM (like above program)
> and is selected by OOM-killer, this application cannot be killed
> until returns from sendmmsg().
> 
> Applying below patch solves the stall message and signal delaying
> problem. ----------------------------------------
> [PATCH] net: Fix sendmmsg() stall problem.
> 
> If the caller passed a huge value (e.g. 64M) to vlen argument of
> sendmmsg(), the caller triggers "INFO: rcu_sched_state detected stall
> on CPU X" message because there is no chance to call scheduler. Thus
> give a chance to call scheduler and also check for pending signal.
> 
> Also, if the caller passed a value where IS_ERR() returns true (e.g.
> UINT_MAX), the caller will get EOF and errno will be set to (e.g.)
> EPERM when all datagrams are successfully sent. Thus, limit the max
> value of vlen to INT_MAX.
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: stable <stable@kernel.org> [3.0+]
> ---
>  net/socket.c |    5 +++++
>  1 file changed, 5 insertions(+)
> 
> --- linux-3.0.orig/net/socket.c
> +++ linux-3.0/net/socket.c
> @@ -1999,6 +1999,8 @@ int __sys_sendmmsg(int fd, struct mmsghd
>  	struct compat_mmsghdr __user *compat_entry;
>  	struct msghdr msg_sys;
>  
> +	if ((int) vlen < 0)
> +		return -EINVAL;
>  	datagrams = 0;
>  
>  	sock = sockfd_lookup_light(fd, &err, &fput_needed);
> @@ -2035,6 +2037,9 @@ int __sys_sendmmsg(int fd, struct mmsghd
>  		if (err)
>  			break;
>  		++datagrams;
> +		cond_resched();
> +		if (signal_pending(current))
> +			break;
>  	}
>  
>  out_put:
> ----------------------------------------
> 
> The situation is similar regaring recvmmsg(). Although recvmmsg()
> less likely stalls than sendmmsg() does, it could happen if a huge
> number of datagrams are in the socket's receive queue and the caller
> attempted to fetch all of them at once. Thus, we may want below patch
> as well.
> 
> ----------------------------------------
> [PATCH] net: Fix recvmmsg() stall problem.
> 
> If the caller passed a huge value to vlen argument of recvmmsg() and
> there are enough datagrams in the socket's receive queue, trying to
> pick up all at once may trigger "INFO: rcu_sched_state detected stall
> on CPU X" message because there is no chance to call scheduler. Thus
> give a chance to call scheduler and also check for pending signal.
> 
> Also, if the caller passed a value where IS_ERR() returns true (e.g.
> UINT_MAX), the caller will get EOF and errno will be set to (e.g.)
> EPERM when all datagrams are successfully received. Thus, limit the
> max value of vlen to INT_MAX.
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: stable <stable@kernel.org> [2.6.33+]
> ---
>  net/socket.c |    5 +++++
>  1 file changed, 5 insertions(+)
> 
> --- linux-3.0.orig/net/socket.c
> +++ linux-3.0/net/socket.c
> @@ -2204,6 +2204,8 @@ int __sys_recvmmsg(int fd, struct mmsghd
>  	struct msghdr msg_sys;
>  	struct timespec end_time;
>  
> +	if ((int) vlen < 0)
> +		return -EINVAL;
>  	if (timeout &&
>  	    poll_select_set_timeout(&end_time, timeout->tv_sec,
>  				    timeout->tv_nsec))
> @@ -2247,6 +2249,9 @@ int __sys_recvmmsg(int fd, struct mmsghd
>  		if (err)
>  			break;
>  		++datagrams;
> +		cond_resched();
> +		if (signal_pending(current))
> +			break;
>  
>  		/* MSG_WAITFORONE turns on MSG_DONTWAIT after one
> packet */ if (flags & MSG_WAITFORONE)
> ----------------------------------------
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] net: Cap number of elements for sendmmsg
  2011-08-05  4:46     ` Anton Blanchard
@ 2011-08-05  5:50       ` Tetsuo Handa
  0 siblings, 0 replies; 12+ messages in thread
From: Tetsuo Handa @ 2011-08-05  5:50 UTC (permalink / raw)
  To: anton, acme, davem; +Cc: netdev, linux-security-module, eparis, casey, mjt

Anton Blanchard wrote:
> > Apart from the upper limit for vlen argument of sendmmsg()/recvmmsg(),
> > we need to deal with stall problem (described below).
> 
> Capping vlen at 1024 should prevent that. Your patch does a signed
> comparison which just reduces the maximum value by 1 bit doesn't it?

Just for avoiding returning IS_ERR_VALUE() value upon success.

> Keep in mind each element could have up to 1024 iovec entries at worst
> case, so I think 1024 is a sane upper max.

OK. Please take Anton's version.

Arnaldo, please consider copying this change to recvmmsg() too.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] net: sendmmsg should only return an error if no messages were sent
  2011-08-05  3:57   ` Tetsuo Handa
@ 2011-08-05  8:20     ` Steven Whitehouse
  2011-08-05 14:40       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Whitehouse @ 2011-08-05  8:20 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: acme, rdenis, netdev

Hi,

On Fri, 2011-08-05 at 12:57 +0900, Tetsuo Handa wrote:
> Anton Blanchard wrote:
> > sendmmsg uses a similar error return strategy as recvmmsg but it
> > turns out to be a confusing way to communicate errors.
> > 
> > The current code stores the error code away and returns it on the next
> > sendmmsg call. This means a call with completely valid arguments could
> > get an error from a previous call.
> > 
> > Change things so we only return an error if no datagrams could be sent.
> > If less than the requested number of messages were sent, the application
> > must retry starting at the first failed one and if the problem is
> > persistent the error will be returned.
> > 
> > This matches the behaviour of other syscalls like read/write - it
> > is not an error if less than the requested number of elements are sent.
> 
> OK. David S. Miller suggested this behavior and Anton Blanchard agreed with
> this behavior.
> 
> Quoting from commit a2e27255 "net: Introduce recvmmsg socket syscall":
> | . R?mi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
> |   datagrams and then recvmsg returns an error, recvmmsg will return
> |   the successfully received datagrams, store the error and return it
> |   in the next call.
> 
> R?mi Denis-Courmont, Steven Whitehouse and Arnaldo Carvalho de Melo, do you
> want to change recvmmsg()'s behaviour as well?

Since I've joined this part way through it seems, I'm assuming that if
something was sent/received then that will be returned and the error
stored until the next call. If nothing was sent/received then the error
can be returned immediately.

That is what I'd expect to be the case, since otherwise it is impossible
to know how much has been successfully sent/received in the partial
failure case, I think. Also it means that sendmmesg/recvmmsg matches
sendmsg/recvmsg in terms of expected return values and thus the
principle of least surprise.

So if thats what is being proposed, then it sounds good to me,

Steve.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] sendmmsg fixes
  2011-08-05  0:07 [PATCH 0/3] sendmmsg fixes Anton Blanchard
                   ` (2 preceding siblings ...)
  2011-08-05  0:07 ` [PATCH 3/3] net: Fix security_socket_sendmsg() bypass problem Anton Blanchard
@ 2011-08-05 10:31 ` David Miller
  3 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2011-08-05 10:31 UTC (permalink / raw)
  To: anton; +Cc: penguin-kernel, eparis, casey, mjt, netdev, linux-security-module

From: Anton Blanchard <anton@samba.org>
Date: Fri, 05 Aug 2011 10:07:37 +1000

> Here are the current set of sendmmsg fixes that pass my test
> cases. Any review would be much appreciated.

All applied, thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] net: sendmmsg should only return an error if no messages were sent
  2011-08-05  8:20     ` Steven Whitehouse
@ 2011-08-05 14:40       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2011-08-05 14:40 UTC (permalink / raw)
  To: Steven Whitehouse; +Cc: Tetsuo Handa, rdenis, netdev

Em Fri, Aug 05, 2011 at 09:20:08AM +0100, Steven Whitehouse escreveu:
> On Fri, 2011-08-05 at 12:57 +0900, Tetsuo Handa wrote:
> > Anton Blanchard wrote:
> > > sendmmsg uses a similar error return strategy as recvmmsg but it
> > > turns out to be a confusing way to communicate errors.
> > > 
> > > The current code stores the error code away and returns it on the next
> > > sendmmsg call. This means a call with completely valid arguments could
> > > get an error from a previous call.
> > > 
> > > Change things so we only return an error if no datagrams could be sent.
> > > If less than the requested number of messages were sent, the application
> > > must retry starting at the first failed one and if the problem is
> > > persistent the error will be returned.
> > > 
> > > This matches the behaviour of other syscalls like read/write - it
> > > is not an error if less than the requested number of elements are sent.
> > 
> > OK. David S. Miller suggested this behavior and Anton Blanchard agreed with
> > this behavior.
> > 
> > Quoting from commit a2e27255 "net: Introduce recvmmsg socket syscall":
> > | . R?mi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
> > |   datagrams and then recvmsg returns an error, recvmmsg will return
> > |   the successfully received datagrams, store the error and return it
> > |   in the next call.
> > 
> > R?mi Denis-Courmont, Steven Whitehouse and Arnaldo Carvalho de Melo, do you
> > want to change recvmmsg()'s behaviour as well?
> 
> Since I've joined this part way through it seems, I'm assuming that if
> something was sent/received then that will be returned and the error
> stored until the next call. If nothing was sent/received then the error
> can be returned immediately.
> 
> That is what I'd expect to be the case, since otherwise it is impossible
> to know how much has been successfully sent/received in the partial
> failure case, I think. Also it means that sendmmesg/recvmmsg matches
> sendmsg/recvmsg in terms of expected return values and thus the
> principle of least surprise.
> 
> So if thats what is being proposed, then it sounds good to me,

Sounds sane to me too.

- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-08-05 14:40 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-05  0:07 [PATCH 0/3] sendmmsg fixes Anton Blanchard
2011-08-05  0:07 ` [PATCH 1/3] net: sendmmsg should only return an error if no messages were sent Anton Blanchard
2011-08-05  3:57   ` Tetsuo Handa
2011-08-05  8:20     ` Steven Whitehouse
2011-08-05 14:40       ` Arnaldo Carvalho de Melo
2011-08-05  0:07 ` [PATCH 2/3] net: Cap number of elements for sendmmsg Anton Blanchard
2011-08-05  4:29   ` Tetsuo Handa
2011-08-05  4:46     ` Anton Blanchard
2011-08-05  5:50       ` Tetsuo Handa
2011-08-05  0:07 ` [PATCH 3/3] net: Fix security_socket_sendmsg() bypass problem Anton Blanchard
2011-08-05  0:38   ` Casey Schaufler
2011-08-05 10:31 ` [PATCH 0/3] sendmmsg fixes David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.