* [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
@ 2012-10-19 9:55 Pavel Emelyanov
2012-10-19 10:34 ` Eric Dumazet
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Pavel Emelyanov @ 2012-10-19 9:55 UTC (permalink / raw)
To: Linux Netdev List, David Miller
The SO_BINDTODEVICE option is the only SOL_SOCKET one that can be set, but
cannot be get via sockopt API. The only way we can find the device id a
socket is bound to is via sock-diag interface. But the diag works only on
hashed sockets, while the opt in question can be set for yet unhashed one.
That said, in order to know what device a socket is bound to (we do want
to know this in checkpoint-restore project) I propose to make this option
getsockopt-able and report the respective device index.
Another solution to the problem might be to teach the sock-diag reporting
info on unhashed sockets. Should I go this way instead?
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
diff --git a/net/core/sock.c b/net/core/sock.c
index 8a146cf..c49412c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1074,6 +1074,9 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
case SO_NOFCS:
v.val = sock_flag(sk, SOCK_NOFCS);
break;
+ case SO_BINDTODEVICE:
+ v.val = sk->sk_bound_dev_if;
+ break;
default:
return -ENOPROTOOPT;
}
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-19 9:55 [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable Pavel Emelyanov
@ 2012-10-19 10:34 ` Eric Dumazet
2012-10-22 0:43 ` David Miller
2012-10-22 18:04 ` Brian Haley
2 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2012-10-19 10:34 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: Linux Netdev List, David Miller
On Fri, 2012-10-19 at 13:55 +0400, Pavel Emelyanov wrote:
> The SO_BINDTODEVICE option is the only SOL_SOCKET one that can be set, but
> cannot be get via sockopt API. The only way we can find the device id a
> socket is bound to is via sock-diag interface. But the diag works only on
> hashed sockets, while the opt in question can be set for yet unhashed one.
>
> That said, in order to know what device a socket is bound to (we do want
> to know this in checkpoint-restore project) I propose to make this option
> getsockopt-able and report the respective device index.
>
> Another solution to the problem might be to teach the sock-diag reporting
> info on unhashed sockets. Should I go this way instead?
>
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>
> ---
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 8a146cf..c49412c 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1074,6 +1074,9 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
> case SO_NOFCS:
> v.val = sock_flag(sk, SOCK_NOFCS);
> break;
> + case SO_BINDTODEVICE:
> + v.val = sk->sk_bound_dev_if;
> + break;
> default:
> return -ENOPROTOOPT;
> }
This looks quite reasonable to me.
Acked-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-19 9:55 [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable Pavel Emelyanov
2012-10-19 10:34 ` Eric Dumazet
@ 2012-10-22 0:43 ` David Miller
2012-10-22 10:20 ` Pavel Emelyanov
2012-10-22 18:04 ` Brian Haley
2 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2012-10-22 0:43 UTC (permalink / raw)
To: xemul; +Cc: netdev
From: Pavel Emelyanov <xemul@parallels.com>
Date: Fri, 19 Oct 2012 13:55:56 +0400
> The SO_BINDTODEVICE option is the only SOL_SOCKET one that can be set, but
> cannot be get via sockopt API. The only way we can find the device id a
> socket is bound to is via sock-diag interface. But the diag works only on
> hashed sockets, while the opt in question can be set for yet unhashed one.
>
> That said, in order to know what device a socket is bound to (we do want
> to know this in checkpoint-restore project) I propose to make this option
> getsockopt-able and report the respective device index.
>
> Another solution to the problem might be to teach the sock-diag reporting
> info on unhashed sockets. Should I go this way instead?
>
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Applied. I can't believe we didn't support this when the feature
was added.
I guess we figured that if the application made this setting, then
it knows and doesn't need to query it.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 0:43 ` David Miller
@ 2012-10-22 10:20 ` Pavel Emelyanov
0 siblings, 0 replies; 15+ messages in thread
From: Pavel Emelyanov @ 2012-10-22 10:20 UTC (permalink / raw)
To: David Miller; +Cc: netdev
On 10/22/2012 04:43 AM, David Miller wrote:
> From: Pavel Emelyanov <xemul@parallels.com>
> Date: Fri, 19 Oct 2012 13:55:56 +0400
>
>> The SO_BINDTODEVICE option is the only SOL_SOCKET one that can be set, but
>> cannot be get via sockopt API. The only way we can find the device id a
>> socket is bound to is via sock-diag interface. But the diag works only on
>> hashed sockets, while the opt in question can be set for yet unhashed one.
>>
>> That said, in order to know what device a socket is bound to (we do want
>> to know this in checkpoint-restore project) I propose to make this option
>> getsockopt-able and report the respective device index.
>>
>> Another solution to the problem might be to teach the sock-diag reporting
>> info on unhashed sockets. Should I go this way instead?
>>
>> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>
> Applied. I can't believe we didn't support this when the feature
> was added.
>
> I guess we figured that if the application made this setting, then
> it knows and doesn't need to query it.
> .
AFAIS such decision (application that configures something knows what it is) was
made for many things. The same is true at least for the SO_ATTACH_FILTER option
and for the shutdown syscall. Both add something to a socket that cannot be queried
back. And I'm now thinking which way for getting one would be better.
Thanks,
Pavel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-19 9:55 [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable Pavel Emelyanov
2012-10-19 10:34 ` Eric Dumazet
2012-10-22 0:43 ` David Miller
@ 2012-10-22 18:04 ` Brian Haley
2012-10-22 20:28 ` Pavel Emelyanov
2012-10-22 20:45 ` Eric Dumazet
2 siblings, 2 replies; 15+ messages in thread
From: Brian Haley @ 2012-10-22 18:04 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: Linux Netdev List, David Miller
On 10/19/2012 05:55 AM, Pavel Emelyanov wrote:
> The SO_BINDTODEVICE option is the only SOL_SOCKET one that can be set, but
> cannot be get via sockopt API. The only way we can find the device id a
> socket is bound to is via sock-diag interface. But the diag works only on
> hashed sockets, while the opt in question can be set for yet unhashed one.
>
> That said, in order to know what device a socket is bound to (we do want
> to know this in checkpoint-restore project) I propose to make this option
> getsockopt-able and report the respective device index.
>
> Another solution to the problem might be to teach the sock-diag reporting
> info on unhashed sockets. Should I go this way instead?
>
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>
> ---
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 8a146cf..c49412c 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1074,6 +1074,9 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
> case SO_NOFCS:
> v.val = sock_flag(sk, SOCK_NOFCS);
> break;
> + case SO_BINDTODEVICE:
> + v.val = sk->sk_bound_dev_if;
> + break;
> default:
> return -ENOPROTOOPT;
> }
Doesn't this make the set and get non-symmetrical? For example, setsockopt()
would take "eth0", but getsockopt() would return 2. The following patch would
return a string, or -ENODEV if not set.
-Brian
---
Change getsockopt(SO_BINDTODEVICE) to be symmetrical with setsockopt() by
returning the interface name as a string.
Signed-off-by: Brian Haley <brian.haley@hp.com>
diff --git a/net/core/sock.c b/net/core/sock.c
index c49412c..69b9d92 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -505,7 +505,8 @@ struct dst_entry *sk_dst_check(struct sock *sk, u32 cookie)
}
EXPORT_SYMBOL(sk_dst_check);
-static int sock_bindtodevice(struct sock *sk, char __user *optval, int optlen)
+static int sock_setbindtodevice(struct sock *sk, char __user *optval,
+ int optlen)
{
int ret = -ENOPROTOOPT;
#ifdef CONFIG_NETDEVICES
@@ -562,6 +563,49 @@ out:
return ret;
}
+static int sock_getbindtodevice(struct sock *sk, char __user *optval,
+ int __user *optlen, int len)
+{
+ int ret = -ENOPROTOOPT;
+#ifdef CONFIG_NETDEVICES
+ struct net *net = sock_net(sk);
+ struct net_device *dev;
+ char devname[IFNAMSIZ];
+
+ ret = 0;
+ if (sk->sk_bound_dev_if == 0)
+ goto out;
+
+ ret = -EINVAL;
+ if (len < IFNAMSIZ)
+ goto out;
+ if (len > IFNAMSIZ)
+ len = IFNAMSIZ;
+
+ rcu_read_lock();
+ dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
+ if (dev)
+ strcpy(dev->name, devname);
+ rcu_read_unlock();
+ ret = -ENODEV;
+ if (!dev)
+ goto out;
+
+ ret = -EFAULT;
+ if (copy_to_user(optval, devname, len))
+ goto out;
+
+ if (put_user(len, optlen))
+ goto out;
+
+ ret = 0;
+
+out:
+#endif
+
+ return ret;
+}
+
static inline void sock_valbool_flag(struct sock *sk, int bit, int valbool)
{
if (valbool)
@@ -589,7 +633,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
*/
if (optname == SO_BINDTODEVICE)
- return sock_bindtodevice(sk, optval, optlen);
+ return sock_setbindtodevice(sk, optval, optlen);
if (optlen < sizeof(int))
return -EINVAL;
@@ -1074,9 +1118,10 @@ int sock_getsockopt(struct socket *sock, int level, int
optname,
case SO_NOFCS:
v.val = sock_flag(sk, SOCK_NOFCS);
break;
+
case SO_BINDTODEVICE:
- v.val = sk->sk_bound_dev_if;
- break;
+ return sock_getbindtodevice(sk, optval, optlen, len);
+
default:
return -ENOPROTOOPT;
}
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 18:04 ` Brian Haley
@ 2012-10-22 20:28 ` Pavel Emelyanov
2012-10-22 21:23 ` Brian Haley
2012-10-22 20:45 ` Eric Dumazet
1 sibling, 1 reply; 15+ messages in thread
From: Pavel Emelyanov @ 2012-10-22 20:28 UTC (permalink / raw)
To: Brian Haley; +Cc: Linux Netdev List, David Miller
On 10/22/2012 10:04 PM, Brian Haley wrote:
> On 10/19/2012 05:55 AM, Pavel Emelyanov wrote:
>> The SO_BINDTODEVICE option is the only SOL_SOCKET one that can be set, but
>> cannot be get via sockopt API. The only way we can find the device id a
>> socket is bound to is via sock-diag interface. But the diag works only on
>> hashed sockets, while the opt in question can be set for yet unhashed one.
>>
>> That said, in order to know what device a socket is bound to (we do want
>> to know this in checkpoint-restore project) I propose to make this option
>> getsockopt-able and report the respective device index.
>>
>> Another solution to the problem might be to teach the sock-diag reporting
>> info on unhashed sockets. Should I go this way instead?
>>
>> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>>
>> ---
>>
>> diff --git a/net/core/sock.c b/net/core/sock.c
>> index 8a146cf..c49412c 100644
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -1074,6 +1074,9 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
>> case SO_NOFCS:
>> v.val = sock_flag(sk, SOCK_NOFCS);
>> break;
>> + case SO_BINDTODEVICE:
>> + v.val = sk->sk_bound_dev_if;
>> + break;
>> default:
>> return -ENOPROTOOPT;
>> }
>
> Doesn't this make the set and get non-symmetrical? For example, setsockopt()
> would take "eth0", but getsockopt() would return 2.
It will, but since device name and index are two equal device "IDs" I assumed
it would be OK.
However, some comments inline.
> The following patch would return a string, or -ENODEV if not set.
>
> -Brian
>
> ---
>
> Change getsockopt(SO_BINDTODEVICE) to be symmetrical with setsockopt() by
> returning the interface name as a string.
>
> Signed-off-by: Brian Haley <brian.haley@hp.com>
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index c49412c..69b9d92 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -505,7 +505,8 @@ struct dst_entry *sk_dst_check(struct sock *sk, u32 cookie)
> }
> EXPORT_SYMBOL(sk_dst_check);
>
> -static int sock_bindtodevice(struct sock *sk, char __user *optval, int optlen)
> +static int sock_setbindtodevice(struct sock *sk, char __user *optval,
> + int optlen)
> {
> int ret = -ENOPROTOOPT;
> #ifdef CONFIG_NETDEVICES
> @@ -562,6 +563,49 @@ out:
> return ret;
> }
>
> +static int sock_getbindtodevice(struct sock *sk, char __user *optval,
> + int __user *optlen, int len)
> +{
> + int ret = -ENOPROTOOPT;
> +#ifdef CONFIG_NETDEVICES
> + struct net *net = sock_net(sk);
> + struct net_device *dev;
> + char devname[IFNAMSIZ];
> +
> + ret = 0;
> + if (sk->sk_bound_dev_if == 0)
> + goto out;
It will return 0 if device is not set, thus making it impossible to detect
this situation.
> + ret = -EINVAL;
> + if (len < IFNAMSIZ)
> + goto out;
> + if (len > IFNAMSIZ)
> + len = IFNAMSIZ;
> +
> + rcu_read_lock();
> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
> + if (dev)
> + strcpy(dev->name, devname);
> + rcu_read_unlock();
> + ret = -ENODEV;
> + if (!dev)
> + goto out;
> +
> + ret = -EFAULT;
> + if (copy_to_user(optval, devname, len))
> + goto out;
> +
> + if (put_user(len, optlen))
> + goto out;
What's the point in reporting IFNAMSIZ to the userspace always, taking
into account that this constant is exported there anyway?
> + ret = 0;
> +
> +out:
> +#endif
> +
> + return ret;
> +}
> +
> static inline void sock_valbool_flag(struct sock *sk, int bit, int valbool)
> {
> if (valbool)
> @@ -589,7 +633,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
> */
>
> if (optname == SO_BINDTODEVICE)
> - return sock_bindtodevice(sk, optval, optlen);
> + return sock_setbindtodevice(sk, optval, optlen);
>
> if (optlen < sizeof(int))
> return -EINVAL;
> @@ -1074,9 +1118,10 @@ int sock_getsockopt(struct socket *sock, int level, int
> optname,
> case SO_NOFCS:
> v.val = sock_flag(sk, SOCK_NOFCS);
> break;
> +
> case SO_BINDTODEVICE:
> - v.val = sk->sk_bound_dev_if;
> - break;
> + return sock_getbindtodevice(sk, optval, optlen, len);
> +
> default:
> return -ENOPROTOOPT;
> }
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> .
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 18:04 ` Brian Haley
2012-10-22 20:28 ` Pavel Emelyanov
@ 2012-10-22 20:45 ` Eric Dumazet
2012-10-22 21:10 ` Pavel Emelyanov
2012-10-22 21:20 ` Brian Haley
1 sibling, 2 replies; 15+ messages in thread
From: Eric Dumazet @ 2012-10-22 20:45 UTC (permalink / raw)
To: Brian Haley; +Cc: Pavel Emelyanov, Linux Netdev List, David Miller
On Mon, 2012-10-22 at 14:04 -0400, Brian Haley wrote:
> + char devname[IFNAMSIZ];
> +
> + ret = 0;
> + if (sk->sk_bound_dev_if == 0)
> + goto out;
> +
> + ret = -EINVAL;
> + if (len < IFNAMSIZ)
> + goto out;
> + if (len > IFNAMSIZ)
> + len = IFNAMSIZ;
> +
> + rcu_read_lock();
> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
> + if (dev)
> + strcpy(dev->name, devname);
> + rcu_read_unlock();
> + ret = -ENODEV;
You probably meant
strcpy(devname, dev->name)
By the way, this is not really safe in case device is renamed
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 20:45 ` Eric Dumazet
@ 2012-10-22 21:10 ` Pavel Emelyanov
2012-10-22 21:20 ` Brian Haley
1 sibling, 0 replies; 15+ messages in thread
From: Pavel Emelyanov @ 2012-10-22 21:10 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Brian Haley, Linux Netdev List, David Miller
On 10/23/2012 12:45 AM, Eric Dumazet wrote:
> On Mon, 2012-10-22 at 14:04 -0400, Brian Haley wrote:
>
>> + char devname[IFNAMSIZ];
>> +
>> + ret = 0;
>> + if (sk->sk_bound_dev_if == 0)
>> + goto out;
>> +
>> + ret = -EINVAL;
>> + if (len < IFNAMSIZ)
>> + goto out;
>> + if (len > IFNAMSIZ)
>> + len = IFNAMSIZ;
>> +
>> + rcu_read_lock();
>> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
>> + if (dev)
>> + strcpy(dev->name, devname);
>> + rcu_read_unlock();
>> + ret = -ENODEV;
>
> You probably meant
>
> strcpy(devname, dev->name)
>
> By the way, this is not really safe in case device is renamed
Good point, actually. Getting a device name may be not very safe in terms of --
once we have the name there's no 100% guarantee, that this name corresponds to the
actual device the socket is bound to (it could be renamed after we strcpy-ed its
name). This problem doesn't exist when we get device index, as it cannot be changed.
Thanks,
Pavel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 20:45 ` Eric Dumazet
2012-10-22 21:10 ` Pavel Emelyanov
@ 2012-10-22 21:20 ` Brian Haley
2012-10-22 21:37 ` Eric Dumazet
1 sibling, 1 reply; 15+ messages in thread
From: Brian Haley @ 2012-10-22 21:20 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Pavel Emelyanov, Linux Netdev List, David Miller
On 10/22/2012 04:45 PM, Eric Dumazet wrote:
> On Mon, 2012-10-22 at 14:04 -0400, Brian Haley wrote:
>
>> + char devname[IFNAMSIZ];
>> +
>> + ret = 0;
>> + if (sk->sk_bound_dev_if == 0)
>> + goto out;
>> +
>> + ret = -EINVAL;
>> + if (len < IFNAMSIZ)
>> + goto out;
>> + if (len > IFNAMSIZ)
>> + len = IFNAMSIZ;
>> +
>> + rcu_read_lock();
>> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
>> + if (dev)
>> + strcpy(dev->name, devname);
>> + rcu_read_unlock();
>> + ret = -ENODEV;
>
> You probably meant
>
> strcpy(devname, dev->name)
Yes, that was a stupid mistake, I'll fix it.
> By the way, this is not really safe in case device is renamed
It's not much different from what's there:
setsockopt("foo");
rename foo -> bar
index = getsockopt();
if_indextoname(index) -> "bar"
I more raised the issue since you pass a 'char *' to setsockopt() but an 'int *'
to getsockopt(), I don't think any other value is non-symmetrical like this.
-Brian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 20:28 ` Pavel Emelyanov
@ 2012-10-22 21:23 ` Brian Haley
0 siblings, 0 replies; 15+ messages in thread
From: Brian Haley @ 2012-10-22 21:23 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: Linux Netdev List, David Miller
On 10/22/2012 04:28 PM, Pavel Emelyanov wrote:
> On 10/22/2012 10:04 PM, Brian Haley wrote:
>> On 10/19/2012 05:55 AM, Pavel Emelyanov wrote:
>>> The SO_BINDTODEVICE option is the only SOL_SOCKET one that can be set, but
>>> cannot be get via sockopt API. The only way we can find the device id a
>>> socket is bound to is via sock-diag interface. But the diag works only on
>>> hashed sockets, while the opt in question can be set for yet unhashed one.
>>>
>>> That said, in order to know what device a socket is bound to (we do want
>>> to know this in checkpoint-restore project) I propose to make this option
>>> getsockopt-able and report the respective device index.
>>>
>>> Another solution to the problem might be to teach the sock-diag reporting
>>> info on unhashed sockets. Should I go this way instead?
>>>
>>> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>>>
>>> ---
>>>
>>> diff --git a/net/core/sock.c b/net/core/sock.c
>>> index 8a146cf..c49412c 100644
>>> --- a/net/core/sock.c
>>> +++ b/net/core/sock.c
>>> @@ -1074,6 +1074,9 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
>>> case SO_NOFCS:
>>> v.val = sock_flag(sk, SOCK_NOFCS);
>>> break;
>>> + case SO_BINDTODEVICE:
>>> + v.val = sk->sk_bound_dev_if;
>>> + break;
>>> default:
>>> return -ENOPROTOOPT;
>>> }
>>
>> Doesn't this make the set and get non-symmetrical? For example, setsockopt()
>> would take "eth0", but getsockopt() would return 2.
>
> It will, but since device name and index are two equal device "IDs" I assumed
> it would be OK.
>
> However, some comments inline.
>
>> The following patch would return a string, or -ENODEV if not set.
Sorry, my description is not quite right, this should return something like this
to be correct:
0 on success
optlen zero if interface not set
optlen > zero if set and optval filled-in
-errno on failure
>> +static int sock_getbindtodevice(struct sock *sk, char __user *optval,
>> + int __user *optlen, int len)
>> +{
>> + int ret = -ENOPROTOOPT;
>> +#ifdef CONFIG_NETDEVICES
>> + struct net *net = sock_net(sk);
>> + struct net_device *dev;
>> + char devname[IFNAMSIZ];
>> +
>> + ret = 0;
>> + if (sk->sk_bound_dev_if == 0)
>> + goto out;
>
> It will return 0 if device is not set, thus making it impossible to detect
> this situation.
See below.
>> + ret = -EINVAL;
>> + if (len < IFNAMSIZ)
>> + goto out;
>> + if (len > IFNAMSIZ)
>> + len = IFNAMSIZ;
>> +
>> + rcu_read_lock();
>> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
>> + if (dev)
>> + strcpy(dev->name, devname);
>> + rcu_read_unlock();
>> + ret = -ENODEV;
>> + if (!dev)
>> + goto out;
>> +
>> + ret = -EFAULT;
>> + if (copy_to_user(optval, devname, len))
>> + goto out;
>> +
>> + if (put_user(len, optlen))
>> + goto out;
>
> What's the point in reporting IFNAMSIZ to the userspace always, taking
> into account that this constant is exported there anyway?
I should have put an RFC on the patch :)
In the case that there is no interface, the length returned would be zero,
indicating nothing was there.
I can post another version.
-Brian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 21:20 ` Brian Haley
@ 2012-10-22 21:37 ` Eric Dumazet
2012-10-22 21:47 ` Brian Haley
0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2012-10-22 21:37 UTC (permalink / raw)
To: Brian Haley; +Cc: Pavel Emelyanov, Linux Netdev List, David Miller
On Mon, 2012-10-22 at 17:20 -0400, Brian Haley wrote:
> It's not much different from what's there:
>
> setsockopt("foo");
>
> rename foo -> bar
>
> index = getsockopt();
> if_indextoname(index) -> "bar"
>
> I more raised the issue since you pass a 'char *' to setsockopt() but an 'int *'
> to getsockopt(), I don't think any other value is non-symmetrical like this.
>
> -Brian
I meant another cpu can be changing dev->name[] content while the
strcpy() is done, and you get a mangled devname, like "for" or "bao"
instead of "foo" or "bar"
But yes, I obviously understood your point about "char *" and "int *"
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 21:37 ` Eric Dumazet
@ 2012-10-22 21:47 ` Brian Haley
2012-10-22 21:52 ` Eric Dumazet
0 siblings, 1 reply; 15+ messages in thread
From: Brian Haley @ 2012-10-22 21:47 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Pavel Emelyanov, Linux Netdev List, David Miller
On 10/22/2012 05:37 PM, Eric Dumazet wrote:
> On Mon, 2012-10-22 at 17:20 -0400, Brian Haley wrote:
>
>> It's not much different from what's there:
>>
>> setsockopt("foo");
>>
>> rename foo -> bar
>>
>> index = getsockopt();
>> if_indextoname(index) -> "bar"
>>
>> I more raised the issue since you pass a 'char *' to setsockopt() but an 'int *'
>> to getsockopt(), I don't think any other value is non-symmetrical like this.
>>
>> -Brian
>
> I meant another cpu can be changing dev->name[] content while the
> strcpy() is done, and you get a mangled devname, like "for" or "bao"
> instead of "foo" or "bar"
Even when holding the rcu_read_lock()? I'd have to hold the rtnl lock there?
-Brian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 21:47 ` Brian Haley
@ 2012-10-22 21:52 ` Eric Dumazet
2012-10-22 21:58 ` Ben Hutchings
2012-10-23 21:43 ` Brian Haley
0 siblings, 2 replies; 15+ messages in thread
From: Eric Dumazet @ 2012-10-22 21:52 UTC (permalink / raw)
To: Brian Haley; +Cc: Pavel Emelyanov, Linux Netdev List, David Miller
On Mon, 2012-10-22 at 17:47 -0400, Brian Haley wrote:
> On 10/22/2012 05:37 PM, Eric Dumazet wrote:
> > On Mon, 2012-10-22 at 17:20 -0400, Brian Haley wrote:
> >
> >> It's not much different from what's there:
> >>
> >> setsockopt("foo");
> >>
> >> rename foo -> bar
> >>
> >> index = getsockopt();
> >> if_indextoname(index) -> "bar"
> >>
> >> I more raised the issue since you pass a 'char *' to setsockopt() but an 'int *'
> >> to getsockopt(), I don't think any other value is non-symmetrical like this.
> >>
> >> -Brian
> >
> > I meant another cpu can be changing dev->name[] content while the
> > strcpy() is done, and you get a mangled devname, like "for" or "bao"
> > instead of "foo" or "bar"
>
> Even when holding the rcu_read_lock()? I'd have to hold the rtnl lock there?
Yes, rcu_read_lock() only makes sure the device doesnt disappear.
But its name can be changed.
You could use a seqcount_t, so that readers dont have to lock rtnl.
But do we really want to return a name here, I am not yet convinced.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 21:52 ` Eric Dumazet
@ 2012-10-22 21:58 ` Ben Hutchings
2012-10-23 21:43 ` Brian Haley
1 sibling, 0 replies; 15+ messages in thread
From: Ben Hutchings @ 2012-10-22 21:58 UTC (permalink / raw)
To: Eric Dumazet
Cc: Brian Haley, Pavel Emelyanov, Linux Netdev List, David Miller
On Mon, 2012-10-22 at 23:52 +0200, Eric Dumazet wrote:
> On Mon, 2012-10-22 at 17:47 -0400, Brian Haley wrote:
> > On 10/22/2012 05:37 PM, Eric Dumazet wrote:
> > > On Mon, 2012-10-22 at 17:20 -0400, Brian Haley wrote:
> > >
> > >> It's not much different from what's there:
> > >>
> > >> setsockopt("foo");
> > >>
> > >> rename foo -> bar
> > >>
> > >> index = getsockopt();
> > >> if_indextoname(index) -> "bar"
> > >>
> > >> I more raised the issue since you pass a 'char *' to setsockopt() but an 'int *'
> > >> to getsockopt(), I don't think any other value is non-symmetrical like this.
> > >>
> > >> -Brian
> > >
> > > I meant another cpu can be changing dev->name[] content while the
> > > strcpy() is done, and you get a mangled devname, like "for" or "bao"
> > > instead of "foo" or "bar"
> >
> > Even when holding the rcu_read_lock()? I'd have to hold the rtnl lock there?
>
> Yes, rcu_read_lock() only makes sure the device doesnt disappear.
>
> But its name can be changed.
>
> You could use a seqcount_t, so that readers dont have to lock rtnl.
>
> But do we really want to return a name here, I am not yet convinced.
If setsockopt() takes a name then it makes no sense that getsockopt()
would return an index. Perhaps an SO_BINDTOIFINDEX would be useful, but
let's not make SO_BINDTODEVICE mean two different things.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable
2012-10-22 21:52 ` Eric Dumazet
2012-10-22 21:58 ` Ben Hutchings
@ 2012-10-23 21:43 ` Brian Haley
1 sibling, 0 replies; 15+ messages in thread
From: Brian Haley @ 2012-10-23 21:43 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Pavel Emelyanov, Linux Netdev List, David Miller
On 10/22/2012 05:52 PM, Eric Dumazet wrote:
>>> I meant another cpu can be changing dev->name[] content while the
>>> strcpy() is done, and you get a mangled devname, like "for" or "bao"
>>> instead of "foo" or "bar"
>>
>> Even when holding the rcu_read_lock()? I'd have to hold the rtnl lock there?
>
> Yes, rcu_read_lock() only makes sure the device doesnt disappear.
>
> But its name can be changed.
There's a similar bug in the SIOCGIFNAME/dev_ifname() code too then, but I would
think this rare enough that it doesn't happen in practice.
-Brian
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-10-23 21:43 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-19 9:55 [PATCH net-next] sockopt: Make SO_BINDTODEVICE readable Pavel Emelyanov
2012-10-19 10:34 ` Eric Dumazet
2012-10-22 0:43 ` David Miller
2012-10-22 10:20 ` Pavel Emelyanov
2012-10-22 18:04 ` Brian Haley
2012-10-22 20:28 ` Pavel Emelyanov
2012-10-22 21:23 ` Brian Haley
2012-10-22 20:45 ` Eric Dumazet
2012-10-22 21:10 ` Pavel Emelyanov
2012-10-22 21:20 ` Brian Haley
2012-10-22 21:37 ` Eric Dumazet
2012-10-22 21:47 ` Brian Haley
2012-10-22 21:52 ` Eric Dumazet
2012-10-22 21:58 ` Ben Hutchings
2012-10-23 21:43 ` Brian Haley
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.