All of lore.kernel.org
 help / color / mirror / Atom feed
* inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
@ 2011-01-16  8:54 Arthur Marsh
  2011-01-16  9:21 ` Eric Dumazet
  0 siblings, 1 reply; 39+ messages in thread
From: Arthur Marsh @ 2011-01-16  8:54 UTC (permalink / raw)
  To: netdev

 
 
 
 This bug was originally posted at https://bugzilla.kernel.org/show_bug.cgi?id=26632 
 
With kernels up to and including 2.6.37-git7, inbound telnetd-ssl connections worked fine. 
With kernel 2.6.37-git9 and later inbound telnetd-ssl connections failed, and on machine shut-down, there were warning messages about daemons not return status. 
 
A git bisect on Linus' kernel tree revealed: 
 
0ab03c2b1478f2438d2c80204f7fef65b1bca9cf is the first bad commit 
commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf 
Author: Jan Engelhardt <jengelh@medozas.de> 
Date:   Fri Jan 7 03:15:05 2011 +0000 
 
    netlink: test for all flags of the NLM_F_DUMP composite 
 
    Due to NLM_F_DUMP is composed of two bits, NLM_F_ROOT | NLM_F_MATCH, 
    when doing "if (x & NLM_F_DUMP)", it tests for _either_ of the bits 
    being set. Because NLM_F_MATCH's value overlaps with NLM_F_EXCL, 
    non-dump requests with NLM_F_EXCL set are mistaken as dump requests. 
 
    Substitute the condition to test for _all_ bits being set. 
 
    Signed-off-by: Jan Engelhardt <jengelh@medozas.de> 
    Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> 
    Signed-off-by: David S. Miller <davem@davemloft.net> 
 
:040000 040000 1a0717ab0c87787309c3c3af88d666b44f327f64 
cba6279de85b7ebeaf21f19f1d93b59468fdd01d M      net 
 
I tried git cherry-pick 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf and verified 
that the resulting kernel had these problems, then git revert 
0ab03c2b1478f2438d2c80204f7fef65b1bca9cf and verified that the resulting kernel 
did *not* have problems. 
 
Arthur. 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-16  8:54 inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied Arthur Marsh
@ 2011-01-16  9:21 ` Eric Dumazet
  2011-01-16 10:50   ` Jan Engelhardt
  0 siblings, 1 reply; 39+ messages in thread
From: Eric Dumazet @ 2011-01-16  9:21 UTC (permalink / raw)
  To: arthur.marsh; +Cc: netdev, Jan Engelhardt, Pablo Neira Ayuso

Le dimanche 16 janvier 2011 à 19:24 +1030, Arthur Marsh a écrit :

CC people involved with the commit

> 
>  
>  This bug was originally posted at https://bugzilla.kernel.org/show_bug.cgi?id=26632 
> 
> With kernels up to and including 2.6.37-git7, inbound telnetd-ssl connections worked fine. 
> With kernel 2.6.37-git9 and later inbound telnetd-ssl connections failed, and on machine shut-down, there were warning messages about daemons not return status. 
>  
> A git bisect on Linus' kernel tree revealed: 
>  
> 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf is the first bad commit 
> commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf 
> Author: Jan Engelhardt <jengelh@medozas.de> 
> Date:   Fri Jan 7 03:15:05 2011 +0000 
>  
>     netlink: test for all flags of the NLM_F_DUMP composite 
>  
>     Due to NLM_F_DUMP is composed of two bits, NLM_F_ROOT | NLM_F_MATCH, 
>     when doing "if (x & NLM_F_DUMP)", it tests for _either_ of the bits 
>     being set. Because NLM_F_MATCH's value overlaps with NLM_F_EXCL, 
>     non-dump requests with NLM_F_EXCL set are mistaken as dump requests. 
>  
>     Substitute the condition to test for _all_ bits being set. 
>  
>     Signed-off-by: Jan Engelhardt <jengelh@medozas.de> 
>     Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> 
>     Signed-off-by: David S. Miller <davem@davemloft.net> 
>  
> :040000 040000 1a0717ab0c87787309c3c3af88d666b44f327f64 
> cba6279de85b7ebeaf21f19f1d93b59468fdd01d M      net 
>  
> I tried git cherry-pick 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf and verified 
> that the resulting kernel had these problems, then git revert 
> 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf and verified that the resulting kernel 
> did *not* have problems. 
>  
> Arthur. 



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-16  9:21 ` Eric Dumazet
@ 2011-01-16 10:50   ` Jan Engelhardt
  2011-01-16 12:39     ` Arthur Marsh
       [not found]     ` <4D32E3BA.5040008@internode.on.net>
  0 siblings, 2 replies; 39+ messages in thread
From: Jan Engelhardt @ 2011-01-16 10:50 UTC (permalink / raw)
  To: arthur.marsh
  Cc: Linux Networking Developer Mailing List, Eric Dumazet, Pablo Neira Ayuso


Le dimanche 16 janvier 2011 à 19:24 +1030, Arthur Marsh a écrit :
>
>>With kernels up to and including 2.6.37-git7, inbound telnetd-ssl
>>connections worked fine. With kernel 2.6.37-git9 and later inbound
>>telnetd-ssl connections failed, and on machine shut-down, there
>>were warning messages about daemons not return status.

Which daemons are these? For reference, what distro do you happen
to use?

>> commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf 
>>     netlink: test for all flags of the NLM_F_DUMP composite 

Each of the hunks in this commit is independent of another.
Would you mind bisecting these too?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-16 10:50   ` Jan Engelhardt
@ 2011-01-16 12:39     ` Arthur Marsh
       [not found]     ` <4D32E3BA.5040008@internode.on.net>
  1 sibling, 0 replies; 39+ messages in thread
From: Arthur Marsh @ 2011-01-16 12:39 UTC (permalink / raw)
  To: Linux Networking Developer Mailing List

[-- Attachment #1: Type: text/plain, Size: 963 bytes --]

Jan Engelhardt wrote, on 16/01/11 21:20:
 >
 > Le dimanche 16 janvier 2011 à 19:24 +1030, Arthur Marsh a écrit :
 >>
 >>> With kernels up to and including 2.6.37-git7, inbound telnetd-ssl
 >>> connections worked fine. With kernel 2.6.37-git9 and later inbound
 >>> telnetd-ssl connections failed, and on machine shut-down, there
 >>> were warning messages about daemons not return status.
 >
 > Which daemons are these? For reference, what distro do you happen
 > to use?

avahi-daemon (which gave multiple warning messages, hence I thought it 
may have been multiple packages)

I'm running Debian unstable with kernel.org kernels.

 >
 >>> commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf
 >>>      netlink: test for all flags of the NLM_F_DUMP composite
 >
 > Each of the hunks in this commit is independent of another.
 > Would you mind bisecting these too?

Recompiling with the only the first patch (attached) resulted in a 
repeat of the problem.

Arthur.

[-- Attachment #2: netlink1.patch --]
[-- Type: text/x-patch, Size: 489 bytes --]

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 750db57..a5f7535 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1820,7 +1820,7 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	if (kind != 2 && security_netlink_recv(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	if (kind == 2 && nlh->nlmsg_flags&NLM_F_DUMP) {
+	if (kind == 2 && (nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP) {
 		struct sock *rtnl;
 		rtnl_dumpit_func dumpit;
 
 

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
       [not found]     ` <4D32E3BA.5040008@internode.on.net>
@ 2011-01-16 21:17       ` Pablo Neira Ayuso
  2011-01-17  1:03         ` Arthur Marsh
  2011-01-18  9:38         ` Jarek Poplawski
  0 siblings, 2 replies; 39+ messages in thread
From: Pablo Neira Ayuso @ 2011-01-16 21:17 UTC (permalink / raw)
  To: Arthur Marsh; +Cc: Jan Engelhardt, Eric Dumazet, Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 1337 bytes --]

On 16/01/11 13:25, Arthur Marsh wrote:
> Jan Engelhardt wrote, on 16/01/11 21:20:
>>
>> Le dimanche 16 janvier 2011 à 19:24 +1030, Arthur Marsh a écrit :
>>>
>>>> With kernels up to and including 2.6.37-git7, inbound telnetd-ssl
>>>> connections worked fine. With kernel 2.6.37-git9 and later inbound
>>>> telnetd-ssl connections failed, and on machine shut-down, there
>>>> were warning messages about daemons not return status.
>>
>> Which daemons are these? For reference, what distro do you happen
>> to use?
> 
> avahi-daemon (which gave multiple warning messages, hence I thought it
> may have been multiple packages)
> 
> I'm running Debian unstable with kernel.org kernels.
> 
>>
>>>> commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf
>>>>      netlink: test for all flags of the NLM_F_DUMP composite
>>
>> Each of the hunks in this commit is independent of another.
>> Would you mind bisecting these too?
> 
> Recompiling with the only the first patch (attached) resulted in a
> repeat of the problem.
> 
> I've removed one person from the cc: list as they did not want to
> receive email about this even though they signed off the commit.

Please, pass this patch to the avahi-daemon developers. They use an
invalid netlink flag combination for dump operations.

This patch has spotted a problem that they have to fix indeed.


[-- Attachment #2: fix-netlink-in-avahi.patch --]
[-- Type: text/x-patch, Size: 1687 bytes --]

avahi: fix wrong use of netlink flags for dump operations

The avahi-daemon uses a wrong flag combination to operate with
rtnetlink. This patch fixes the problems.

No need to set NLM_F_ACK since the dump operation already includes
the trailing NLMSG_DONE message that informs about the end of the
dump operation.

Please, consider porting the avahi-daemon to libmnl:

http://www.netfilter.org/projects/libmnl/index.html

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Index: avahi-0.6.27/avahi-autoipd/iface-linux.c
===================================================================
--- avahi-0.6.27.orig/avahi-autoipd/iface-linux.c	2011-01-16 22:06:20.000000000 +0100
+++ avahi-0.6.27/avahi-autoipd/iface-linux.c	2011-01-16 22:07:34.000000000 +0100
@@ -262,7 +262,7 @@ int iface_get_initial_state(State *state
     n->nlmsg_len = NLMSG_LENGTH(sizeof(*ifi));
     n->nlmsg_type = RTM_GETLINK;
     n->nlmsg_seq = seq;
-    n->nlmsg_flags = NLM_F_MATCH|NLM_F_REQUEST|NLM_F_ACK;
+    n->nlmsg_flags = NLM_F_REQUEST|NLM_F_DUMP;
     n->nlmsg_pid = 0;
 
     ifi = NLMSG_DATA(n);
Index: avahi-0.6.27/avahi-core/iface-linux.c
===================================================================
--- avahi-0.6.27.orig/avahi-core/iface-linux.c	2011-01-16 22:06:51.000000000 +0100
+++ avahi-0.6.27/avahi-core/iface-linux.c	2011-01-16 22:07:08.000000000 +0100
@@ -53,7 +53,7 @@ static int netlink_list_items(AvahiNetli
     n = (struct nlmsghdr*) req;
     n->nlmsg_len = NLMSG_LENGTH(sizeof(struct rtgenmsg));
     n->nlmsg_type = type;
-    n->nlmsg_flags = NLM_F_ROOT|NLM_F_REQUEST;
+    n->nlmsg_flags = NLM_F_REQUEST|NLM_F_DUMP;
     n->nlmsg_pid = 0;
 
     gen = NLMSG_DATA(n);


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-16 21:17       ` Pablo Neira Ayuso
@ 2011-01-17  1:03         ` Arthur Marsh
  2011-01-18  9:38         ` Jarek Poplawski
  1 sibling, 0 replies; 39+ messages in thread
From: Arthur Marsh @ 2011-01-17  1:03 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Jan Engelhardt, Eric Dumazet, Linux Netdev List



Pablo Neira Ayuso wrote, on 17/01/11 07:47:
> On 16/01/11 13:25, Arthur Marsh wrote:
>> Jan Engelhardt wrote, on 16/01/11 21:20:
>>>
>>> Le dimanche 16 janvier 2011 à 19:24 +1030, Arthur Marsh a écrit :
>>>>
>>>>> With kernels up to and including 2.6.37-git7, inbound telnetd-ssl
>>>>> connections worked fine. With kernel 2.6.37-git9 and later inbound
>>>>> telnetd-ssl connections failed, and on machine shut-down, there
>>>>> were warning messages about daemons not return status.
>>>
>>> Which daemons are these? For reference, what distro do you happen
>>> to use?
>>
>> avahi-daemon (which gave multiple warning messages, hence I thought it
>> may have been multiple packages)
>>
>> I'm running Debian unstable with kernel.org kernels.
>>
>>>
>>>>> commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf
>>>>>       netlink: test for all flags of the NLM_F_DUMP composite
>>>
>>> Each of the hunks in this commit is independent of another.
>>> Would you mind bisecting these too?
>>
>> Recompiling with the only the first patch (attached) resulted in a
>> repeat of the problem.
>>
>> I've removed one person from the cc: list as they did not want to
>> receive email about this even though they signed off the commit.
>
> Please, pass this patch to the avahi-daemon developers. They use an
> invalid netlink flag combination for dump operations.
>
> This patch has spotted a problem that they have to fix indeed.
>

I've forwarded this on as a Debian bug:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=610281

Regards,

Arthur.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-16 21:17       ` Pablo Neira Ayuso
  2011-01-17  1:03         ` Arthur Marsh
@ 2011-01-18  9:38         ` Jarek Poplawski
  2011-01-18 10:07           ` David Miller
  1 sibling, 1 reply; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18  9:38 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Arthur Marsh, Jan Engelhardt, Eric Dumazet, Linux Netdev List,
	Jamal Hadi Salim

On 2011-01-16 22:17, Pablo Neira Ayuso wrote:
> On 16/01/11 13:25, Arthur Marsh wrote:
>> Jan Engelhardt wrote, on 16/01/11 21:20:
>>>
>>> Le dimanche 16 janvier 2011 Ă  19:24 +1030, Arthur Marsh a ĂŠcrit :
>>>>
>>>>> With kernels up to and including 2.6.37-git7, inbound telnetd-ssl
>>>>> connections worked fine. With kernel 2.6.37-git9 and later inbound
>>>>> telnetd-ssl connections failed, and on machine shut-down, there
>>>>> were warning messages about daemons not return status.
>>>
>>> Which daemons are these? For reference, what distro do you happen
>>> to use?
>>
>> avahi-daemon (which gave multiple warning messages, hence I thought it
>> may have been multiple packages)
>>
>> I'm running Debian unstable with kernel.org kernels.
>>
>>>
>>>>> commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf
>>>>>      netlink: test for all flags of the NLM_F_DUMP composite
>>>
>>> Each of the hunks in this commit is independent of another.
>>> Would you mind bisecting these too?
>>
>> Recompiling with the only the first patch (attached) resulted in a
>> repeat of the problem.
>>
>> I've removed one person from the cc: list as they did not want to
>> receive email about this even though they signed off the commit.
> 
> Please, pass this patch to the avahi-daemon developers. They use an
> invalid netlink flag combination for dump operations.

Nothing in RFC suggests this is an invalid netlink flag combination,
and author's implementation had suported it:
ftp://ftp.rfc-editor.org/in-notes/rfc3549.txt

NLM_F_DUMP is called a convenience macro only, so might be interpreted
as a special kind of dump. BTW, isn't NLM_F_ATOMIC flag with
NLM_F_DUMP treated as invalid now either?

Even if I'm wrong, this change added to stable will break many configs.
My proposal is to revert commit 0ab03c2b147 until proper fix is found.

Cc: Jamal Hadi Salim <hadi@cyberus.ca>

Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18  9:38         ` Jarek Poplawski
@ 2011-01-18 10:07           ` David Miller
  2011-01-18 10:24             ` Jarek Poplawski
  0 siblings, 1 reply; 39+ messages in thread
From: David Miller @ 2011-01-18 10:07 UTC (permalink / raw)
  To: jarkao2; +Cc: pablo, arthur.marsh, jengelh, eric.dumazet, netdev, hadi

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Tue, 18 Jan 2011 09:38:11 +0000

> Even if I'm wrong, this change added to stable will break many configs.
> My proposal is to revert commit 0ab03c2b147 until proper fix is found.

The flag combination is, at best ambiguous, it has no proper
definition without the check we added.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 10:07           ` David Miller
@ 2011-01-18 10:24             ` Jarek Poplawski
  2011-01-18 14:05               ` jamal
  2011-01-18 20:31               ` Pablo Neira Ayuso
  0 siblings, 2 replies; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18 10:24 UTC (permalink / raw)
  To: David Miller; +Cc: pablo, arthur.marsh, jengelh, eric.dumazet, netdev, hadi

On Tue, Jan 18, 2011 at 02:07:02AM -0800, David Miller wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Tue, 18 Jan 2011 09:38:11 +0000
> 
> > Even if I'm wrong, this change added to stable will break many configs.
> > My proposal is to revert commit 0ab03c2b147 until proper fix is found.
> 
> The flag combination is, at best ambiguous, it has no proper
> definition without the check we added.

Do you all expect all users manage to upgrade avahi app before
changing their stable kernel? I mean "own distro" users especially.

Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 10:24             ` Jarek Poplawski
@ 2011-01-18 14:05               ` jamal
  2011-01-18 14:07                 ` jamal
                                   ` (2 more replies)
  2011-01-18 20:31               ` Pablo Neira Ayuso
  1 sibling, 3 replies; 39+ messages in thread
From: jamal @ 2011-01-18 14:05 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: David Miller, pablo, arthur.marsh, jengelh, eric.dumazet, netdev

On Tue, 2011-01-18 at 10:24 +0000, Jarek Poplawski wrote:

> Do you all expect all users manage to upgrade avahi app before
> changing their stable kernel? I mean "own distro" users especially.

Unfortunately if that app is widely deployed, it is not pragmatic
to break it in the name of pedanticity. i.e.
Be conservative in what you send (clearly Avahi is farting all over the
place) but more importantly be a nice liberal in what you accept.
Maybe tell them if you have the cycles about their bad behavior.
The important part is the GET (kind = 2). The DUMP as Jarek says
is merely a utility to extrapolate that we need you to
get everything.. So it is not such a big deal if someone passes
extraneous senseless flags. Wrong? yes.

Jarek - For the record although i coauthored the doc, I was merely
a messenger in putting together some artist painting.

cheers,
jamal


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 14:05               ` jamal
@ 2011-01-18 14:07                 ` jamal
  2011-01-18 17:22                   ` Jarek Poplawski
  2011-01-18 18:11                 ` Jarek Poplawski
  2011-01-18 20:39                 ` David Miller
  2 siblings, 1 reply; 39+ messages in thread
From: jamal @ 2011-01-18 14:07 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: David Miller, pablo, arthur.marsh, jengelh, eric.dumazet, netdev

On Tue, 2011-01-18 at 09:05 -0500, jamal wrote:
> On Tue, 2011-01-18 at 10:24 +0000, Jarek Poplawski wrote:
> 
> > Do you all expect all users manage to upgrade avahi app before
> > changing their stable kernel? I mean "own distro" users especially.
> 
> Unfortunately if that app is widely deployed, it is not pragmatic
> to break it in the name of pedanticity. 

And to complete that thought - if avahi continues to work and merely
whines, i dont see why this patch should be reverted..

cheers,
jamal



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 14:07                 ` jamal
@ 2011-01-18 17:22                   ` Jarek Poplawski
  0 siblings, 0 replies; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18 17:22 UTC (permalink / raw)
  To: jamal
  Cc: David Miller, pablo, arthur.marsh, jengelh, eric.dumazet, netdev,
	Alessandro Suardi

On Tue, Jan 18, 2011 at 09:07:33AM -0500, jamal wrote:
> On Tue, 2011-01-18 at 09:05 -0500, jamal wrote:
> > On Tue, 2011-01-18 at 10:24 +0000, Jarek Poplawski wrote:
> > 
> > > Do you all expect all users manage to upgrade avahi app before
> > > changing their stable kernel? I mean "own distro" users especially.
> > 
> > Unfortunately if that app is widely deployed, it is not pragmatic
> > to break it in the name of pedanticity. 
> 
> And to complete that thought - if avahi continues to work and merely
> whines, i dont see why this patch should be reverted..

Alas it looks like more than whines ;-)
Cheers,
Jarek P.

[PATCH] netlink: Revert test for all flags of the NLM_F_DUMP composite

Arthur Marsh reported:
> With kernel 2.6.37-git9 and later inbound telnetd-ssl connections
> failed, and on machine shut-down, there were warning messages about
> daemons not return status.
and bisected the bug.
 
After looking at net/core/rtnetlink.c:

static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
{
	...
        kind = type&3;

        if (kind != 2 && security_netlink_recv(skb, CAP_NET_ADMIN))
                return -EPERM;

        if (kind == 2 && nlh->nlmsg_flags&NLM_F_DUMP) {

I'm sure the fix in commit 0ab03c2b1478f24 is simply wrong. NLM_F_DUMP
flags can't be mistaken with NLM_F_EXCL if there is a check for GET
request like above (ie. kind == 2). So there is no reason to limit 3
various dump cases from the RFC (not counting the atomic flag) to one
and only NLM_F_DUMP.

That's why I propose this reverting patch here and a separate fix to
genetlink (soon).

Reverts commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf

Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Reported-by: Arthur Marsh <arthur.marsh@internode.on.net>
Bisected-by: Arthur Marsh <arthur.marsh@internode.on.net>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Cc: Jan Engelhardt <jengelh@medozas.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jamal Hadi Salim <hadi@cyberus.ca>
---

 net/core/rtnetlink.c                 |    2 +-
 net/ipv4/inet_diag.c                 |    2 +-
 net/netfilter/nf_conntrack_netlink.c |    4 ++--
 net/netlink/genetlink.c              |    2 +-
 net/xfrm/xfrm_user.c                 |    2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a5f7535..750db57 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1820,7 +1820,7 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	if (kind != 2 && security_netlink_recv(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	if (kind == 2 && (nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP) {
+	if (kind == 2 && nlh->nlmsg_flags&NLM_F_DUMP) {
 		struct sock *rtnl;
 		rtnl_dumpit_func dumpit;
 
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 2746c1f..2ada171 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -858,7 +858,7 @@ static int inet_diag_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	    nlmsg_len(nlh) < hdrlen)
 		return -EINVAL;
 
-	if ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP) {
+	if (nlh->nlmsg_flags & NLM_F_DUMP) {
 		if (nlmsg_attrlen(nlh, hdrlen)) {
 			struct nlattr *attr;
 
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 2b7eef3..93297aa 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -924,7 +924,7 @@ ctnetlink_get_conntrack(struct sock *ctnl, struct sk_buff *skb,
 	u16 zone;
 	int err;
 
-	if ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP)
+	if (nlh->nlmsg_flags & NLM_F_DUMP)
 		return netlink_dump_start(ctnl, skb, nlh, ctnetlink_dump_table,
 					  ctnetlink_done);
 
@@ -1787,7 +1787,7 @@ ctnetlink_get_expect(struct sock *ctnl, struct sk_buff *skb,
 	u16 zone;
 	int err;
 
-	if ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP) {
+	if (nlh->nlmsg_flags & NLM_F_DUMP) {
 		return netlink_dump_start(ctnl, skb, nlh,
 					  ctnetlink_exp_dump_table,
 					  ctnetlink_exp_done);
diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index f83cb37..1781d99 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -519,7 +519,7 @@ static int genl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 	    security_netlink_recv(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	if ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP) {
+	if (nlh->nlmsg_flags & NLM_F_DUMP) {
 		if (ops->dumpit == NULL)
 			return -EOPNOTSUPP;
 
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index d5e1e0b..6129196 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -2189,7 +2189,7 @@ static int xfrm_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 
 	if ((type == (XFRM_MSG_GETSA - XFRM_MSG_BASE) ||
 	     type == (XFRM_MSG_GETPOLICY - XFRM_MSG_BASE)) &&
-	    (nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP) {
+	    (nlh->nlmsg_flags & NLM_F_DUMP)) {
 		if (link->dump == NULL)
 			return -EINVAL;
 

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 14:05               ` jamal
  2011-01-18 14:07                 ` jamal
@ 2011-01-18 18:11                 ` Jarek Poplawski
  2011-01-18 20:39                 ` David Miller
  2 siblings, 0 replies; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18 18:11 UTC (permalink / raw)
  To: jamal; +Cc: David Miller, pablo, arthur.marsh, jengelh, eric.dumazet, netdev

On Tue, Jan 18, 2011 at 09:05:03AM -0500, jamal wrote:
> On Tue, 2011-01-18 at 10:24 +0000, Jarek Poplawski wrote:
> 
> > Do you all expect all users manage to upgrade avahi app before
> > changing their stable kernel? I mean "own distro" users especially.
> 
> Unfortunately if that app is widely deployed, it is not pragmatic
> to break it in the name of pedanticity. i.e.
> Be conservative in what you send (clearly Avahi is farting all over the
> place) but more importantly be a nice liberal in what you accept.
> Maybe tell them if you have the cycles about their bad behavior.
> The important part is the GET (kind = 2). The DUMP as Jarek says
> is merely a utility to extrapolate that we need you to
> get everything.. So it is not such a big deal if someone passes
> extraneous senseless flags. Wrong? yes.

Thanks for the confirmation of my suspicions wrt. the RFC & rtnetlink.
But then Avahi seems right and we should get back to the written law.

> Jarek - For the record although i coauthored the doc, I was merely
> a messenger in putting together some artist painting.

Well, ain't pictures better than words ;-)

Cheers,
Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 10:24             ` Jarek Poplawski
  2011-01-18 14:05               ` jamal
@ 2011-01-18 20:31               ` Pablo Neira Ayuso
  2011-01-18 20:50                 ` David Miller
                                   ` (2 more replies)
  1 sibling, 3 replies; 39+ messages in thread
From: Pablo Neira Ayuso @ 2011-01-18 20:31 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: David Miller, arthur.marsh, jengelh, eric.dumazet, netdev, hadi

On 18/01/11 11:24, Jarek Poplawski wrote:
> On Tue, Jan 18, 2011 at 02:07:02AM -0800, David Miller wrote:
>> From: Jarek Poplawski <jarkao2@gmail.com>
>> Date: Tue, 18 Jan 2011 09:38:11 +0000
>>
>>> Even if I'm wrong, this change added to stable will break many configs.
>>> My proposal is to revert commit 0ab03c2b147 until proper fix is found.
>>
>> The flag combination is, at best ambiguous, it has no proper
>> definition without the check we added.
> 
> Do you all expect all users manage to upgrade avahi app before
> changing their stable kernel? I mean "own distro" users especially.

The combination that avahi uses makes no sense.

I've been auditing user-space tools that may have problems with this change:

* iw (it uses libnl)
* acpid (it uses a mangled version of libnetlink shipped in iproute)
* tstime, for taskstats, it uses libnl
* wimax-tools, it uses libnl
* quota-tools, it uses libnl
* keepalived, no libs used

Well, I can keep looking for more, but I think that avahi is the only
one doing this incorrectly.

Please, fix avahi instead.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 14:05               ` jamal
  2011-01-18 14:07                 ` jamal
  2011-01-18 18:11                 ` Jarek Poplawski
@ 2011-01-18 20:39                 ` David Miller
  2 siblings, 0 replies; 39+ messages in thread
From: David Miller @ 2011-01-18 20:39 UTC (permalink / raw)
  To: hadi; +Cc: jarkao2, pablo, arthur.marsh, jengelh, eric.dumazet, netdev

From: jamal <hadi@cyberus.ca>
Date: Tue, 18 Jan 2011 09:05:03 -0500

> On Tue, 2011-01-18 at 10:24 +0000, Jarek Poplawski wrote:
> 
>> Do you all expect all users manage to upgrade avahi app before
>> changing their stable kernel? I mean "own distro" users especially.
> 
> Unfortunately if that app is widely deployed, it is not pragmatic
> to break it in the name of pedanticity. i.e.
> Be conservative in what you send (clearly Avahi is farting all over the
> place) but more importantly be a nice liberal in what you accept.
> Maybe tell them if you have the cycles about their bad behavior.
> The important part is the GET (kind = 2). The DUMP as Jarek says
> is merely a utility to extrapolate that we need you to
> get everything.. So it is not such a big deal if someone passes
> extraneous senseless flags. Wrong? yes.

Fair enough, I'll revert the netlink patch.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 20:31               ` Pablo Neira Ayuso
@ 2011-01-18 20:50                 ` David Miller
  2011-01-19 17:42                   ` Pablo Neira Ayuso
  2011-01-18 20:55                 ` Jarek Poplawski
  2011-01-18 21:14                 ` Jarek Poplawski
  2 siblings, 1 reply; 39+ messages in thread
From: David Miller @ 2011-01-18 20:50 UTC (permalink / raw)
  To: pablo; +Cc: jarkao2, arthur.marsh, jengelh, eric.dumazet, netdev, hadi

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Tue, 18 Jan 2011 21:31:31 +0100

> The combination that avahi uses makes no sense.
> 
> I've been auditing user-space tools that may have problems with this change:
> 
> * iw (it uses libnl)
> * acpid (it uses a mangled version of libnetlink shipped in iproute)
> * tstime, for taskstats, it uses libnl
> * wimax-tools, it uses libnl
> * quota-tools, it uses libnl
> * keepalived, no libs used
> 
> Well, I can keep looking for more, but I think that avahi is the only
> one doing this incorrectly.
> 
> Please, fix avahi instead.

That's a pretty compelling argument, so I'll hold off on the revert
for now.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 20:31               ` Pablo Neira Ayuso
  2011-01-18 20:50                 ` David Miller
@ 2011-01-18 20:55                 ` Jarek Poplawski
  2011-01-19 14:28                   ` jamal
  2011-01-18 21:14                 ` Jarek Poplawski
  2 siblings, 1 reply; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18 20:55 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: David Miller, arthur.marsh, jengelh, eric.dumazet, netdev, hadi

On Tue, Jan 18, 2011 at 09:31:31PM +0100, Pablo Neira Ayuso wrote:
> On 18/01/11 11:24, Jarek Poplawski wrote:
> > On Tue, Jan 18, 2011 at 02:07:02AM -0800, David Miller wrote:
> >> From: Jarek Poplawski <jarkao2@gmail.com>
> >> Date: Tue, 18 Jan 2011 09:38:11 +0000
> >>
> >>> Even if I'm wrong, this change added to stable will break many configs.
> >>> My proposal is to revert commit 0ab03c2b147 until proper fix is found.
> >>
> >> The flag combination is, at best ambiguous, it has no proper
> >> definition without the check we added.
> > 
> > Do you all expect all users manage to upgrade avahi app before
> > changing their stable kernel? I mean "own distro" users especially.
> 
> The combination that avahi uses makes no sense.

I don't agree as explained in the reverting patch. Anyway, again,
this is an old problem, so no reason to force "fixing" it just now
at the expense of the obvious regression especially in stable kernels
Anyway, I'll accept any David's decision wrt this problem.

Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 20:31               ` Pablo Neira Ayuso
  2011-01-18 20:50                 ` David Miller
  2011-01-18 20:55                 ` Jarek Poplawski
@ 2011-01-18 21:14                 ` Jarek Poplawski
  2011-01-19 14:53                   ` Pablo Neira Ayuso
  2 siblings, 1 reply; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18 21:14 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: David Miller, arthur.marsh, jengelh, eric.dumazet, netdev, hadi

On Tue, Jan 18, 2011 at 09:31:31PM +0100, Pablo Neira Ayuso wrote:
> On 18/01/11 11:24, Jarek Poplawski wrote:
> > On Tue, Jan 18, 2011 at 02:07:02AM -0800, David Miller wrote:
> >> From: Jarek Poplawski <jarkao2@gmail.com>
> >> Date: Tue, 18 Jan 2011 09:38:11 +0000
> >>
> >>> Even if I'm wrong, this change added to stable will break many configs.
> >>> My proposal is to revert commit 0ab03c2b147 until proper fix is found.
> >>
> >> The flag combination is, at best ambiguous, it has no proper
> >> definition without the check we added.
> > 
> > Do you all expect all users manage to upgrade avahi app before
> > changing their stable kernel? I mean "own distro" users especially.
> 
> The combination that avahi uses makes no sense.
> 
> I've been auditing user-space tools that may have problems with this change:
> 
> * iw (it uses libnl)
> * acpid (it uses a mangled version of libnetlink shipped in iproute)
> * tstime, for taskstats, it uses libnl
> * wimax-tools, it uses libnl
> * quota-tools, it uses libnl
> * keepalived, no libs used
> 
> Well, I can keep looking for more, but I think that avahi is the only
> one doing this incorrectly.

BTW, could you answer my earlier question, why NLM_F_ATOMIC flag isn't
handled now with dumps?

Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 20:55                 ` Jarek Poplawski
@ 2011-01-19 14:28                   ` jamal
  2011-01-19 16:54                     ` Jarek Poplawski
  0 siblings, 1 reply; 39+ messages in thread
From: jamal @ 2011-01-19 14:28 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Pablo Neira Ayuso, David Miller, arthur.marsh, jengelh,
	eric.dumazet, netdev

On Tue, 2011-01-18 at 21:55 +0100, Jarek Poplawski wrote:
> On Tue, Jan 18, 2011 at 09:31:31PM +0100, Pablo Neira Ayuso wrote:

> > The combination that avahi uses makes no sense.
> 
> I don't agree as explained in the reverting patch. Anyway, again,
> this is an old problem, so no reason to force "fixing" it just now
> at the expense of the obvious regression especially in stable kernels
> Anyway, I'll accept any David's decision wrt this problem.
> 

So here is what i think the criteria should be:

If Avahi is popular and widely deployed (I dont use it anywhere), it
makes no sense to revert. 
A middle ground is: instead of rejecting the nonsense passed, maybe a
sane thing to do is a kernel warning for a period of time (sort of like
feature removal warnings).

The only way to keep the patch IMO (if avahi is widely deployed) is if
common distro policy is such that they will immediately fix and
distribute a new avahi even when this breakage is with a kernel that
distro wont support for  a year.

hope i am making sense.

cheers,
jamal


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 21:14                 ` Jarek Poplawski
@ 2011-01-19 14:53                   ` Pablo Neira Ayuso
  2011-01-19 16:18                     ` Jarek Poplawski
  0 siblings, 1 reply; 39+ messages in thread
From: Pablo Neira Ayuso @ 2011-01-19 14:53 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: David Miller, arthur.marsh, jengelh, eric.dumazet, netdev, hadi

On 18/01/11 22:14, Jarek Poplawski wrote:
> On Tue, Jan 18, 2011 at 09:31:31PM +0100, Pablo Neira Ayuso wrote:
>> On 18/01/11 11:24, Jarek Poplawski wrote:
>>> On Tue, Jan 18, 2011 at 02:07:02AM -0800, David Miller wrote:
>>>> From: Jarek Poplawski <jarkao2@gmail.com>
>>>> Date: Tue, 18 Jan 2011 09:38:11 +0000
>>>>
>>>>> Even if I'm wrong, this change added to stable will break many configs.
>>>>> My proposal is to revert commit 0ab03c2b147 until proper fix is found.
>>>>
>>>> The flag combination is, at best ambiguous, it has no proper
>>>> definition without the check we added.
>>>
>>> Do you all expect all users manage to upgrade avahi app before
>>> changing their stable kernel? I mean "own distro" users especially.
>>
>> The combination that avahi uses makes no sense.
>>
>> I've been auditing user-space tools that may have problems with this change:
>>
>> * iw (it uses libnl)
>> * acpid (it uses a mangled version of libnetlink shipped in iproute)
>> * tstime, for taskstats, it uses libnl
>> * wimax-tools, it uses libnl
>> * quota-tools, it uses libnl
>> * keepalived, no libs used
>>
>> Well, I can keep looking for more, but I think that avahi is the only
>> one doing this incorrectly.
> 
> BTW, could you answer my earlier question, why NLM_F_ATOMIC flag isn't
> handled now with dumps?

The NLM_F_ATOMIC flag is not affected, the netlink header is still
passed to netlink_dump_start() so you can check for it in the callback
to start an atomic dump.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-19 14:53                   ` Pablo Neira Ayuso
@ 2011-01-19 16:18                     ` Jarek Poplawski
  0 siblings, 0 replies; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-19 16:18 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: David Miller, arthur.marsh, jengelh, eric.dumazet, netdev, hadi

On Wed, Jan 19, 2011 at 03:53:17PM +0100, Pablo Neira Ayuso wrote:
> On 18/01/11 22:14, Jarek Poplawski wrote:
...
> > BTW, could you answer my earlier question, why NLM_F_ATOMIC flag isn't
> > handled now with dumps?
> 
> The NLM_F_ATOMIC flag is not affected, the netlink header is still
> passed to netlink_dump_start() so you can check for it in the callback
> to start an atomic dump.

Right! I had some blackout, sorry.

Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-19 14:28                   ` jamal
@ 2011-01-19 16:54                     ` Jarek Poplawski
  2011-01-19 16:59                       ` jamal
  2011-01-19 18:04                       ` Jan Engelhardt
  0 siblings, 2 replies; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-19 16:54 UTC (permalink / raw)
  To: jamal
  Cc: Pablo Neira Ayuso, David Miller, arthur.marsh, jengelh,
	eric.dumazet, netdev

On Wed, Jan 19, 2011 at 09:28:06AM -0500, jamal wrote:
> On Tue, 2011-01-18 at 21:55 +0100, Jarek Poplawski wrote:
> > On Tue, Jan 18, 2011 at 09:31:31PM +0100, Pablo Neira Ayuso wrote:
> 
> > > The combination that avahi uses makes no sense.
> > 
> > I don't agree as explained in the reverting patch. Anyway, again,
> > this is an old problem, so no reason to force "fixing" it just now
> > at the expense of the obvious regression especially in stable kernels
> > Anyway, I'll accept any David's decision wrt this problem.
> > 
> 
> So here is what i think the criteria should be:
> 
> If Avahi is popular and widely deployed (I dont use it anywhere), it
> makes no sense to revert. 
> A middle ground is: instead of rejecting the nonsense passed, maybe a
> sane thing to do is a kernel warning for a period of time (sort of like
> feature removal warnings).

I still don't understand why you call this the nonsense. There are
two dump flags NLM_F_ROOT and NLM_F_MATCH plus for convenience
NLM_F_DUMP as 2 in 1. Avahi uses these specific flags. Why would
anybody have added these specific flags if they can never be used
separately?

Aside from this question, if we still think it's the nonsense, a
warning would be nicer.

Cheers,
Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-19 16:54                     ` Jarek Poplawski
@ 2011-01-19 16:59                       ` jamal
  2011-01-19 17:19                         ` Jarek Poplawski
  2011-01-19 17:33                         ` Jarek Poplawski
  2011-01-19 18:04                       ` Jan Engelhardt
  1 sibling, 2 replies; 39+ messages in thread
From: jamal @ 2011-01-19 16:59 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Pablo Neira Ayuso, David Miller, arthur.marsh, jengelh,
	eric.dumazet, netdev

On Wed, 2011-01-19 at 17:54 +0100, Jarek Poplawski wrote:
> On Wed, Jan 19, 2011 at 09:28:06AM -0500, jamal wrote:

> > So here is what i think the criteria should be:
> > 
> > If Avahi is popular and widely deployed (I dont use it anywhere), it
> > makes no sense to revert. 
> > A middle ground is: instead of rejecting the nonsense passed, maybe a
> > sane thing to do is a kernel warning for a period of time (sort of like
> > feature removal warnings).
> 
> I still don't understand why you call this the nonsense. 


gah! I already had plenty of caffeine when i typed that.
I meant to say "If Avahi is popular and widely deployed,
it makes sense to revert"

> There are
> two dump flags NLM_F_ROOT and NLM_F_MATCH plus for convenience
> NLM_F_DUMP as 2 in 1. Avahi uses these specific flags. Why would
> anybody have added these specific flags if they can never be used
> separately?
> 
> Aside from this question, if we still think it's the nonsense, a
> warning would be nicer.

That is what i was suggesting as well..

cheers,
jamal


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-19 16:59                       ` jamal
@ 2011-01-19 17:19                         ` Jarek Poplawski
  2011-01-19 17:33                         ` Jarek Poplawski
  1 sibling, 0 replies; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-19 17:19 UTC (permalink / raw)
  To: jamal
  Cc: Pablo Neira Ayuso, David Miller, arthur.marsh, jengelh,
	eric.dumazet, netdev

On Wed, Jan 19, 2011 at 11:59:37AM -0500, jamal wrote:
> On Wed, 2011-01-19 at 17:54 +0100, Jarek Poplawski wrote:
...
> gah! I already had plenty of caffeine when i typed that.
> I meant to say "If Avahi is popular and widely deployed,
> it makes sense to revert"

AFAIK, the most popular distros (XP, Vista, W7) don't use Avahi! ;-)
Other similar (desktop & userfriendly) should be affected.

Cheers,
Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-19 16:59                       ` jamal
  2011-01-19 17:19                         ` Jarek Poplawski
@ 2011-01-19 17:33                         ` Jarek Poplawski
  1 sibling, 0 replies; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-19 17:33 UTC (permalink / raw)
  To: jamal
  Cc: Pablo Neira Ayuso, David Miller, arthur.marsh, jengelh,
	eric.dumazet, netdev

On Wed, Jan 19, 2011 at 11:59:37AM -0500, jamal wrote:
> On Wed, 2011-01-19 at 17:54 +0100, Jarek Poplawski wrote:
...
> > Aside from this question, if we still think it's the nonsense, a
> > warning would be nicer.
> 
> That is what i was suggesting as well..

Except, I still don't think it's the nonsese, and suggest something
even nicer ;-)

Cheers,
Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 20:50                 ` David Miller
@ 2011-01-19 17:42                   ` Pablo Neira Ayuso
  2011-01-19 21:34                     ` David Miller
  0 siblings, 1 reply; 39+ messages in thread
From: Pablo Neira Ayuso @ 2011-01-19 17:42 UTC (permalink / raw)
  To: David Miller; +Cc: jarkao2, arthur.marsh, jengelh, eric.dumazet, netdev, hadi

On 18/01/11 21:50, David Miller wrote:
> From: Pablo Neira Ayuso <pablo@netfilter.org>
> Date: Tue, 18 Jan 2011 21:31:31 +0100
> 
>> The combination that avahi uses makes no sense.
>>
>> I've been auditing user-space tools that may have problems with this change:
>>
>> * iw (it uses libnl)
>> * acpid (it uses a mangled version of libnetlink shipped in iproute)
>> * tstime, for taskstats, it uses libnl
>> * wimax-tools, it uses libnl
>> * quota-tools, it uses libnl
>> * keepalived, no libs used
>>
>> Well, I can keep looking for more, but I think that avahi is the only
>> one doing this incorrectly.
>>
>> Please, fix avahi instead.
> 
> That's a pretty compelling argument, so I'll hold off on the revert
> for now.

I've been reviewing user-space applications for a couple of hours (I've
got a big list here with no problems), unfortunately I found that:

ip route show cache

hangs after displaying the first line with the patch applied because it
uses:

        req.nlh.nlmsg_type = RTM_GETROUTE;
        req.nlh.nlmsg_flags = NLM_F_ROOT|NLM_F_REQUEST;

to dump the routing cache.

We need something less agressive, some warning to be printed and accept
this flag combination for quite some time until it's removed as jamal
and jarek suggested.

Please, revert this patch until we find a better solution.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-19 16:54                     ` Jarek Poplawski
  2011-01-19 16:59                       ` jamal
@ 2011-01-19 18:04                       ` Jan Engelhardt
  2011-01-19 19:24                         ` Jarek Poplawski
  1 sibling, 1 reply; 39+ messages in thread
From: Jan Engelhardt @ 2011-01-19 18:04 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: jamal, Pablo Neira Ayuso, David Miller, arthur.marsh,
	eric.dumazet, netdev


On Wednesday 2011-01-19 17:54, Jarek Poplawski wrote:
>
>I still don't understand why you call this the nonsense. There are
>two dump flags NLM_F_ROOT and NLM_F_MATCH plus for convenience
>NLM_F_DUMP as 2 in 1. Avahi uses these specific flags. Why would
>anybody have added these specific flags if they can never be used
>separately?

It looks like the authors' intentinos were to make NLM_F_MATCH not
stop after a single entry has been found. So that sounds like dump,
ok.

But NLM_F_ROOT does not quite strike me as a dump request. What if I
wanted just a single item returned but still start at the root?

Or asking from a different direction, what's NLM_F_ROOT good for
when, say, struct rtmsg->rtm_table specifies (in rtnetlink) where to
start? (Particularly, 0 for an "invisible root" that contains all
tables.)


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-19 18:04                       ` Jan Engelhardt
@ 2011-01-19 19:24                         ` Jarek Poplawski
  2011-01-19 19:47                           ` Jan Engelhardt
  0 siblings, 1 reply; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-19 19:24 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: jamal, Pablo Neira Ayuso, David Miller, arthur.marsh,
	eric.dumazet, netdev

On Wed, Jan 19, 2011 at 07:04:06PM +0100, Jan Engelhardt wrote:
> 
> On Wednesday 2011-01-19 17:54, Jarek Poplawski wrote:
> >
> >I still don't understand why you call this the nonsense. There are
> >two dump flags NLM_F_ROOT and NLM_F_MATCH plus for convenience
> >NLM_F_DUMP as 2 in 1. Avahi uses these specific flags. Why would
> >anybody have added these specific flags if they can never be used
> >separately?
> 
> It looks like the authors' intentinos were to make NLM_F_MATCH not
> stop after a single entry has been found. So that sounds like dump,
> ok.
> 
> But NLM_F_ROOT does not quite strike me as a dump request. What if I
> wanted just a single item returned but still start at the root?

Hmm... Does it say about starting at the root?:

"          NLM_F_ROOT     Return the complete table instead of a
                          single entry."

> 
> Or asking from a different direction, what's NLM_F_ROOT good for
> when, say, struct rtmsg->rtm_table specifies (in rtnetlink) where to
> start? (Particularly, 0 for an "invisible root" that contains all
> tables.)

I can't say I understand these flags, but IMHO the main point is we
should respect them as separate, even if mostly unused and look like
unnecessary. (Unless there is really no other way of fixing this
genetlink bug.) If it were undocumented... but after all this the RFC.

Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-19 19:24                         ` Jarek Poplawski
@ 2011-01-19 19:47                           ` Jan Engelhardt
  2011-01-19 20:12                             ` Jarek Poplawski
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Engelhardt @ 2011-01-19 19:47 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: jamal, Pablo Neira Ayuso, David Miller, arthur.marsh,
	eric.dumazet, netdev


On Wednesday 2011-01-19 20:24, Jarek Poplawski wrote:
>> On Wednesday 2011-01-19 17:54, Jarek Poplawski wrote:
>> 
>> It looks like the authors' intentinos were to make NLM_F_MATCH not
>> stop after a single entry has been found. So that sounds like dump,
>> ok.
>> 
>> But NLM_F_ROOT does not quite strike me as a dump request. What if I
>> wanted just a single item returned but still start at the root?
>
>Hmm... Does it say about starting at the root?:
>
>"          NLM_F_ROOT     Return the complete table instead of a
>                          single entry."

I was referring to netlink.h which paraphrased that, perhaps
too short:

#define NLM_F_ROOT      0x100   /* specify tree root    */

But the RFC description makes for a better wording: if NLM_F_ROOT is
supposed to return "the complete table", how is it different from
NLM_F_MATCH with a wildcard criteria?

|          NLM_F_MATCH    Return all entries matching criteria passed in
|                         message content.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-19 19:47                           ` Jan Engelhardt
@ 2011-01-19 20:12                             ` Jarek Poplawski
  0 siblings, 0 replies; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-19 20:12 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: jamal, Pablo Neira Ayuso, David Miller, arthur.marsh,
	eric.dumazet, netdev

On Wed, Jan 19, 2011 at 08:47:32PM +0100, Jan Engelhardt wrote:
> 
> On Wednesday 2011-01-19 20:24, Jarek Poplawski wrote:
> >> On Wednesday 2011-01-19 17:54, Jarek Poplawski wrote:
> >> 
> >> It looks like the authors' intentinos were to make NLM_F_MATCH not
> >> stop after a single entry has been found. So that sounds like dump,
> >> ok.
> >> 
> >> But NLM_F_ROOT does not quite strike me as a dump request. What if I
> >> wanted just a single item returned but still start at the root?
> >
> >Hmm... Does it say about starting at the root?:
> >
> >"          NLM_F_ROOT     Return the complete table instead of a
> >                          single entry."
> 
> I was referring to netlink.h which paraphrased that, perhaps
> too short:
> 
> #define NLM_F_ROOT      0x100   /* specify tree root    */
> 
> But the RFC description makes for a better wording: if NLM_F_ROOT is
> supposed to return "the complete table", how is it different from
> NLM_F_MATCH with a wildcard criteria?
> 
> |          NLM_F_MATCH    Return all entries matching criteria passed in
> |                         message content.

As I said, I'd prefer not to pretend I understand it, but, knowing
names of people around this, I'm also quite sure there was a purpose.
On the other hand, I'm not sure the names of flags and descriptions
weren't mixed while making it general for different subsystems.

BTW, don't we have in ip/tc many examples of duplicate options?

Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-19 17:42                   ` Pablo Neira Ayuso
@ 2011-01-19 21:34                     ` David Miller
  0 siblings, 0 replies; 39+ messages in thread
From: David Miller @ 2011-01-19 21:34 UTC (permalink / raw)
  To: pablo; +Cc: jarkao2, arthur.marsh, jengelh, eric.dumazet, netdev, hadi

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 19 Jan 2011 18:42:13 +0100

> Please, revert this patch until we find a better solution.

Ok, done.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 19:26         ` Alessandro Suardi
@ 2011-01-18 20:07           ` Jarek Poplawski
  0 siblings, 0 replies; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18 20:07 UTC (permalink / raw)
  To: Alessandro Suardi
  Cc: Jan Engelhardt, jamal, David Miller, pablo, arthur.marsh,
	eric.dumazet, netdev

On Tue, Jan 18, 2011 at 08:26:42PM +0100, Alessandro Suardi wrote:
> On Tue, Jan 18, 2011 at 7:47 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> > On Tue, Jan 18, 2011 at 07:28:52PM +0100, Jarek Poplawski wrote:
> >> On Tue, Jan 18, 2011 at 07:24:40PM +0100, Jan Engelhardt wrote:
> >> >
> >> > On Tuesday 2011-01-18 19:10, Alessandro Suardi wrote:
> >> > >On Tue, Jan 18, 2011 at 6:23 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> >> > >>
> >> > >> NLM_F_DUMP flags should be applied to GET requests only, eg. rtnetlink
> >> > >> tests message type to verify this. Since genetlink can't do the same
> >> > >> use "practical" test for ops->dumpit (assuming NEW request won't be
> >> > >> mixed with GET).
> >> ...
> >> > >2.6.37-git18 + netlink revert + this patch
> >> > > - fixes Avahi
> >> > > - breaks acpid
> >> > >Starting acpi daemon: RTNETLINK1 answers: Operation not supported
> >> > >acpid: error talking to the kernel via netlink
> >> >
> >> > Deducing from that, it is a GET-like request that was sent by acpid,
> >> > and the message type is one that has both a dumpit and a doit function.
> >> > So if EOPNOTSUPP now occurs on all message types that have both dumpit
> >> > and doit, you should have broken a lot more than just acpid.
> >>
> >> Right, we need something better here.
> >
> > On the other hand, until there is something better, we might try to
> > fix it at least for "normal" dumpit cases?
> >
> > Alessandro, could you try (with the netlink revert)?
...
> 2.6.37-git18 + netlink revert + this 2nd attempt
> 
>  appears to be good for me - both Avahi and acpid start up fine and I
>  can't see any other program misbehaving.
> 
> 
Alessandro, thanks for testing!

Jan, feel free to NAK if it can't help for your problem.

Jarek P.
---------------->
[PATCH v2] netlink: Fix possible NLM_F_DUMP misuse in genetlink

NLM_F_DUMP flags should be applied to GET requests only, eg. rtnetlink
tests message type to verify this. Since genetlink can't do the same
use "practical" test for ops->dumpit, assuming NEW request won't be
mixed with GET. Otherwise, it should work old way. Since, as reported
by Alessandro, apps like acpid use messages with ops->dumpit without
NLM_F_DUMP flags, there is no error reporting for this case.

Original patch by: Jan Engelhardt <jengelh@medozas.de>

Tested-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Cc: Jan Engelhardt <jengelh@medozas.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jamal Hadi Salim <hadi@cyberus.ca>
---

diff -Nurp a/net/netlink/genetlink.c b/net/netlink/genetlink.c
--- a/net/netlink/genetlink.c	2011-01-18 16:58:16.000000000 +0100
+++ b/net/netlink/genetlink.c	2011-01-18 19:36:25.000000000 +0100
@@ -519,15 +519,14 @@ static int genl_rcv_msg(struct sk_buff *
 	    security_netlink_recv(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	if (nlh->nlmsg_flags & NLM_F_DUMP) {
-		if (ops->dumpit == NULL)
-			return -EOPNOTSUPP;
-
-		genl_unlock();
-		err = netlink_dump_start(net->genl_sock, skb, nlh,
-					 ops->dumpit, ops->done);
-		genl_lock();
-		return err;
+	if (ops->dumpit) {
+		if (nlh->nlmsg_flags & NLM_F_DUMP) {
+			genl_unlock();
+			err = netlink_dump_start(net->genl_sock, skb, nlh,
+						 ops->dumpit, ops->done);
+			genl_lock();
+			return err;
+		}
 	}
 
 	if (ops->doit == NULL)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 18:47       ` Jarek Poplawski
@ 2011-01-18 19:26         ` Alessandro Suardi
  2011-01-18 20:07           ` Jarek Poplawski
  0 siblings, 1 reply; 39+ messages in thread
From: Alessandro Suardi @ 2011-01-18 19:26 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Jan Engelhardt, jamal, David Miller, pablo, arthur.marsh,
	eric.dumazet, netdev

On Tue, Jan 18, 2011 at 7:47 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Tue, Jan 18, 2011 at 07:28:52PM +0100, Jarek Poplawski wrote:
>> On Tue, Jan 18, 2011 at 07:24:40PM +0100, Jan Engelhardt wrote:
>> >
>> > On Tuesday 2011-01-18 19:10, Alessandro Suardi wrote:
>> > >On Tue, Jan 18, 2011 at 6:23 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
>> > >>
>> > >> NLM_F_DUMP flags should be applied to GET requests only, eg. rtnetlink
>> > >> tests message type to verify this. Since genetlink can't do the same
>> > >> use "practical" test for ops->dumpit (assuming NEW request won't be
>> > >> mixed with GET).
>> ...
>> > >2.6.37-git18 + netlink revert + this patch
>> > > - fixes Avahi
>> > > - breaks acpid
>> > >Starting acpi daemon: RTNETLINK1 answers: Operation not supported
>> > >acpid: error talking to the kernel via netlink
>> >
>> > Deducing from that, it is a GET-like request that was sent by acpid,
>> > and the message type is one that has both a dumpit and a doit function.
>> > So if EOPNOTSUPP now occurs on all message types that have both dumpit
>> > and doit, you should have broken a lot more than just acpid.
>>
>> Right, we need something better here.
>
> On the other hand, until there is something better, we might try to
> fix it at least for "normal" dumpit cases?
>
> Alessandro, could you try (with the netlink revert)?
>
> Thanks,
> Jarek P.
>
> ---
> diff -Nurp a/net/netlink/genetlink.c b/net/netlink/genetlink.c
> --- a/net/netlink/genetlink.c   2011-01-18 16:58:16.000000000 +0100
> +++ b/net/netlink/genetlink.c   2011-01-18 19:36:25.000000000 +0100
> @@ -519,15 +519,14 @@ static int genl_rcv_msg(struct sk_buff *
>            security_netlink_recv(skb, CAP_NET_ADMIN))
>                return -EPERM;
>
> -       if (nlh->nlmsg_flags & NLM_F_DUMP) {
> -               if (ops->dumpit == NULL)
> -                       return -EOPNOTSUPP;
> -
> -               genl_unlock();
> -               err = netlink_dump_start(net->genl_sock, skb, nlh,
> -                                        ops->dumpit, ops->done);
> -               genl_lock();
> -               return err;
> +       if (ops->dumpit) {
> +               if (nlh->nlmsg_flags & NLM_F_DUMP) {
> +                       genl_unlock();
> +                       err = netlink_dump_start(net->genl_sock, skb, nlh,
> +                                                ops->dumpit, ops->done);
> +                       genl_lock();
> +                       return err;
> +               }
>        }
>
>        if (ops->doit == NULL)
>

Sure enough :)


2.6.37-git18 + netlink revert + this 2nd attempt

 appears to be good for me - both Avahi and acpid start up fine and I
 can't see any other program misbehaving.


Thanks, ciao,

--alessandro

 "There's always a siren singing you to shipwreck"

   (Radiohead, "There There")

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 18:28     ` Jarek Poplawski
@ 2011-01-18 18:47       ` Jarek Poplawski
  2011-01-18 19:26         ` Alessandro Suardi
  0 siblings, 1 reply; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18 18:47 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Alessandro Suardi, jamal, David Miller, pablo, arthur.marsh,
	eric.dumazet, netdev

On Tue, Jan 18, 2011 at 07:28:52PM +0100, Jarek Poplawski wrote:
> On Tue, Jan 18, 2011 at 07:24:40PM +0100, Jan Engelhardt wrote:
> > 
> > On Tuesday 2011-01-18 19:10, Alessandro Suardi wrote:
> > >On Tue, Jan 18, 2011 at 6:23 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> > >>
> > >> NLM_F_DUMP flags should be applied to GET requests only, eg. rtnetlink
> > >> tests message type to verify this. Since genetlink can't do the same
> > >> use "practical" test for ops->dumpit (assuming NEW request won't be
> > >> mixed with GET).
> ...
> > >2.6.37-git18 + netlink revert + this patch
> > > - fixes Avahi
> > > - breaks acpid
> > >Starting acpi daemon: RTNETLINK1 answers: Operation not supported
> > >acpid: error talking to the kernel via netlink
> > 
> > Deducing from that, it is a GET-like request that was sent by acpid, 
> > and the message type is one that has both a dumpit and a doit function.
> > So if EOPNOTSUPP now occurs on all message types that have both dumpit 
> > and doit, you should have broken a lot more than just acpid.
> 
> Right, we need something better here.

On the other hand, until there is something better, we might try to
fix it at least for "normal" dumpit cases?

Alessandro, could you try (with the netlink revert)?

Thanks,
Jarek P.

---
diff -Nurp a/net/netlink/genetlink.c b/net/netlink/genetlink.c
--- a/net/netlink/genetlink.c	2011-01-18 16:58:16.000000000 +0100
+++ b/net/netlink/genetlink.c	2011-01-18 19:36:25.000000000 +0100
@@ -519,15 +519,14 @@ static int genl_rcv_msg(struct sk_buff *
 	    security_netlink_recv(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	if (nlh->nlmsg_flags & NLM_F_DUMP) {
-		if (ops->dumpit == NULL)
-			return -EOPNOTSUPP;
-
-		genl_unlock();
-		err = netlink_dump_start(net->genl_sock, skb, nlh,
-					 ops->dumpit, ops->done);
-		genl_lock();
-		return err;
+	if (ops->dumpit) {
+		if (nlh->nlmsg_flags & NLM_F_DUMP) {
+			genl_unlock();
+			err = netlink_dump_start(net->genl_sock, skb, nlh,
+						 ops->dumpit, ops->done);
+			genl_lock();
+			return err;
+		}
 	}
 
 	if (ops->doit == NULL)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 18:24   ` Jan Engelhardt
@ 2011-01-18 18:28     ` Jarek Poplawski
  2011-01-18 18:47       ` Jarek Poplawski
  0 siblings, 1 reply; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18 18:28 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Alessandro Suardi, jamal, David Miller, pablo, arthur.marsh,
	eric.dumazet, netdev

On Tue, Jan 18, 2011 at 07:24:40PM +0100, Jan Engelhardt wrote:
> 
> On Tuesday 2011-01-18 19:10, Alessandro Suardi wrote:
> >On Tue, Jan 18, 2011 at 6:23 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> >>
> >> NLM_F_DUMP flags should be applied to GET requests only, eg. rtnetlink
> >> tests message type to verify this. Since genetlink can't do the same
> >> use "practical" test for ops->dumpit (assuming NEW request won't be
> >> mixed with GET).
...
> >2.6.37-git18 + netlink revert + this patch
> > - fixes Avahi
> > - breaks acpid
> >Starting acpi daemon: RTNETLINK1 answers: Operation not supported
> >acpid: error talking to the kernel via netlink
> 
> Deducing from that, it is a GET-like request that was sent by acpid, 
> and the message type is one that has both a dumpit and a doit function.
> So if EOPNOTSUPP now occurs on all message types that have both dumpit 
> and doit, you should have broken a lot more than just acpid.

Right, we need something better here.

Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 18:10 ` Alessandro Suardi
  2011-01-18 18:23   ` Jarek Poplawski
@ 2011-01-18 18:24   ` Jan Engelhardt
  2011-01-18 18:28     ` Jarek Poplawski
  1 sibling, 1 reply; 39+ messages in thread
From: Jan Engelhardt @ 2011-01-18 18:24 UTC (permalink / raw)
  To: Alessandro Suardi
  Cc: Jarek Poplawski, jamal, David Miller, pablo, arthur.marsh,
	eric.dumazet, netdev


On Tuesday 2011-01-18 19:10, Alessandro Suardi wrote:
>On Tue, Jan 18, 2011 at 6:23 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
>>
>> NLM_F_DUMP flags should be applied to GET requests only, eg. rtnetlink
>> tests message type to verify this. Since genetlink can't do the same
>> use "practical" test for ops->dumpit (assuming NEW request won't be
>> mixed with GET).
>>
>> -       if (nlh->nlmsg_flags & NLM_F_DUMP) {
>> -               if (ops->dumpit == NULL)
>> +       if (ops->dumpit) {
>> +               if (nlh->nlmsg_flags & NLM_F_DUMP) {
>> +                       genl_unlock();
>> +                       err = netlink_dump_start(net->genl_sock, skb, nlh,
>> +                                                ops->dumpit, ops->done);
>> +                       genl_lock();
>> +                       return err;
>> +               } else {
>>                        return -EOPNOTSUPP;
>> -
>> -               genl_unlock();
>> -               err = netlink_dump_start(net->genl_sock, skb, nlh,
>> -                                        ops->dumpit, ops->done);
>> -               genl_lock();
>> -               return err;
>> +               }
>
>2.6.37-git18 + netlink revert + this patch
> - fixes Avahi
> - breaks acpid
>Starting acpi daemon: RTNETLINK1 answers: Operation not supported
>acpid: error talking to the kernel via netlink

Deducing from that, it is a GET-like request that was sent by acpid, 
and the message type is one that has both a dumpit and a doit function.
So if EOPNOTSUPP now occurs on all message types that have both dumpit 
and doit, you should have broken a lot more than just acpid.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 18:10 ` Alessandro Suardi
@ 2011-01-18 18:23   ` Jarek Poplawski
  2011-01-18 18:24   ` Jan Engelhardt
  1 sibling, 0 replies; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18 18:23 UTC (permalink / raw)
  To: Alessandro Suardi
  Cc: jamal, David Miller, pablo, arthur.marsh, jengelh, eric.dumazet, netdev

On Tue, Jan 18, 2011 at 07:10:35PM +0100, Alessandro Suardi wrote:
> On Tue, Jan 18, 2011 at 6:23 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> > [PATCH] netlink: Fix possible NLM_F_DUMP misuse in genetlink
> >
> > NLM_F_DUMP flags should be applied to GET requests only, eg. rtnetlink
> > tests message type to verify this. Since genetlink can't do the same
> > use "practical" test for ops->dumpit (assuming NEW request won't be
> > mixed with GET).
> >
> > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> > Cc: Jan Engelhardt <jengelh@medozas.de>
> > Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> > Cc: Jamal Hadi Salim <hadi@cyberus.ca>
> > ---
> > Not for stable before testing!
> >
> > diff -Nurp a/net/netlink/genetlink.c b/net/netlink/genetlink.c
> > --- a/net/netlink/genetlink.c   2011-01-18 16:58:16.000000000 +0100
> > +++ b/net/netlink/genetlink.c   2011-01-18 17:08:43.000000000 +0100
> > @@ -519,15 +519,16 @@ static int genl_rcv_msg(struct sk_buff *
...
> 2.6.37-git18 + netlink revert + this patch
>  - fixes Avahi
>  - breaks acpid

Well, then fixing genetlink needs more than this and I withdraw this
patch. Anyway, netlink revert is IMHO needed to prevent the regression.

...
> If more debugging/testing is needed, do ping me. Thanks,

Many thanks for such a fast response!
Jarek P.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
  2011-01-18 17:23 Jarek Poplawski
@ 2011-01-18 18:10 ` Alessandro Suardi
  2011-01-18 18:23   ` Jarek Poplawski
  2011-01-18 18:24   ` Jan Engelhardt
  0 siblings, 2 replies; 39+ messages in thread
From: Alessandro Suardi @ 2011-01-18 18:10 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: jamal, David Miller, pablo, arthur.marsh, jengelh, eric.dumazet, netdev

On Tue, Jan 18, 2011 at 6:23 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> [PATCH] netlink: Fix possible NLM_F_DUMP misuse in genetlink
>
> NLM_F_DUMP flags should be applied to GET requests only, eg. rtnetlink
> tests message type to verify this. Since genetlink can't do the same
> use "practical" test for ops->dumpit (assuming NEW request won't be
> mixed with GET).
>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> Cc: Jan Engelhardt <jengelh@medozas.de>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Cc: Jamal Hadi Salim <hadi@cyberus.ca>
> ---
> Not for stable before testing!
>
> diff -Nurp a/net/netlink/genetlink.c b/net/netlink/genetlink.c
> --- a/net/netlink/genetlink.c   2011-01-18 16:58:16.000000000 +0100
> +++ b/net/netlink/genetlink.c   2011-01-18 17:08:43.000000000 +0100
> @@ -519,15 +519,16 @@ static int genl_rcv_msg(struct sk_buff *
>            security_netlink_recv(skb, CAP_NET_ADMIN))
>                return -EPERM;
>
> -       if (nlh->nlmsg_flags & NLM_F_DUMP) {
> -               if (ops->dumpit == NULL)
> +       if (ops->dumpit) {
> +               if (nlh->nlmsg_flags & NLM_F_DUMP) {
> +                       genl_unlock();
> +                       err = netlink_dump_start(net->genl_sock, skb, nlh,
> +                                                ops->dumpit, ops->done);
> +                       genl_lock();
> +                       return err;
> +               } else {
>                        return -EOPNOTSUPP;
> -
> -               genl_unlock();
> -               err = netlink_dump_start(net->genl_sock, skb, nlh,
> -                                        ops->dumpit, ops->done);
> -               genl_lock();
> -               return err;
> +               }
>        }
>
>        if (ops->doit == NULL)
>

2.6.37-git18 + netlink revert + this patch
 - fixes Avahi
 - breaks acpid

Upon startup I have:

Starting acpi daemon: RTNETLINK1 answers: Operation not supported
acpid: error talking to the kernel via netlink




From strace output:

open("/dev/input/event6", O_RDONLY|O_NONBLOCK) = 8
fcntl(8, F_SETFD, FD_CLOEXEC)           = 0
ioctl(8, 0x80604520, 0x7fffb3418550)    = 8
ioctl(8, 0x80604521, 0x7fffb34185b0)    = 96
open("/dev/input/event8", O_RDONLY|O_NONBLOCK) = 9
fcntl(9, F_SETFD, FD_CLOEXEC)           = 0
ioctl(9, 0x80604520, 0x7fffb3418550)    = 8
ioctl(9, 0x80604532, 0x7fffb3418c10)    = 8
close(9)                                = 0
inotify_init()                          = 9
inotify_add_watch(9, "/dev/input", IN_CREATE) = 1
socket(PF_NETLINK, SOCK_RAW, 16)        = 10
setsockopt(10, SOL_SOCKET, SO_SNDBUF, [32768], 4) = 0
setsockopt(10, SOL_SOCKET, SO_RCVBUF, [32768], 4) = 0
bind(10, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(10, {sa_family=AF_NETLINK, pid=4298, groups=00000000}, [12]) = 0
sendmsg(10, {msg_name(12)={sa_family=AF_NETLINK, pid=0,
groups=00000000},
msg_iov(1)=[{"$\0\0\0\20\0\5\0\346\3265M\0\0\0\0\3\0\0\0\17\0\2\0acpi_eve"...,
36}], msg_controllen=0, msg_flags=0}, 0) = 36
recvmsg(10, {msg_name(12)={sa_family=AF_NETLINK, pid=0,
groups=00000000},
msg_iov(1)=[{"8\0\0\0\2\0\0\0\346\3265M\312\20\0\0\241\377\377\377$\0\0\0\20\0\5\0\346\3265M"...,
16384}], msg_controllen=0, msg_flags=0}, 0) = 56
dup(2)                                  = 11
fcntl(11, F_GETFL)                      = 0x1 (flags O_WRONLY)
close(11)                               = 0
write(2, "RTNETLINK1 answers: Operation no"..., 44RTNETLINK1 answers:
Operation not supported
) = 44
write(2, "acpid: error talking to the kern"..., 47acpid: error talking
to the kernel via netlink
) = 47
close(10)                               = 0
socket(PF_NETLINK, SOCK_RAW, 16)        = 10
setsockopt(10, SOL_SOCKET, SO_SNDBUF, [32768], 4) = 0
setsockopt(10, SOL_SOCKET, SO_RCVBUF, [32768], 4) = 0
bind(10, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(10, {sa_family=AF_NETLINK, pid=4298, groups=00000000}, [12]) = 0
unlink("/var/run/acpid.socket")         = 0
socket(PF_FILE, SOCK_STREAM, 0)         = 11
bind(11, {sa_family=AF_FILE, path="/var/run/acpid.socket"}, 110) = 0
listen(11, 10)                          = 0



If more debugging/testing is needed, do ping me. Thanks,

--alessandro

 "There's always a siren singing you to shipwreck"

   (Radiohead, "There There")

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied
@ 2011-01-18 17:23 Jarek Poplawski
  2011-01-18 18:10 ` Alessandro Suardi
  0 siblings, 1 reply; 39+ messages in thread
From: Jarek Poplawski @ 2011-01-18 17:23 UTC (permalink / raw)
  To: jamal
  Cc: David Miller, pablo, arthur.marsh, jengelh, eric.dumazet, netdev,
	Alessandro Suardi

[PATCH] netlink: Fix possible NLM_F_DUMP misuse in genetlink

NLM_F_DUMP flags should be applied to GET requests only, eg. rtnetlink
tests message type to verify this. Since genetlink can't do the same
use "practical" test for ops->dumpit (assuming NEW request won't be
mixed with GET).

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Cc: Jan Engelhardt <jengelh@medozas.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jamal Hadi Salim <hadi@cyberus.ca>
---
Not for stable before testing!

diff -Nurp a/net/netlink/genetlink.c b/net/netlink/genetlink.c
--- a/net/netlink/genetlink.c	2011-01-18 16:58:16.000000000 +0100
+++ b/net/netlink/genetlink.c	2011-01-18 17:08:43.000000000 +0100
@@ -519,15 +519,16 @@ static int genl_rcv_msg(struct sk_buff *
 	    security_netlink_recv(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	if (nlh->nlmsg_flags & NLM_F_DUMP) {
-		if (ops->dumpit == NULL)
+	if (ops->dumpit) {
+		if (nlh->nlmsg_flags & NLM_F_DUMP) {
+			genl_unlock();
+			err = netlink_dump_start(net->genl_sock, skb, nlh,
+						 ops->dumpit, ops->done);
+			genl_lock();
+			return err;
+		} else {
 			return -EOPNOTSUPP;
-
-		genl_unlock();
-		err = netlink_dump_start(net->genl_sock, skb, nlh,
-					 ops->dumpit, ops->done);
-		genl_lock();
-		return err;
+		}
 	}
 
 	if (ops->doit == NULL)

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2011-01-19 21:33 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-16  8:54 inbound connection problems when "netlink: test for all flags of the NLM_F_DUMP composite" commit applied Arthur Marsh
2011-01-16  9:21 ` Eric Dumazet
2011-01-16 10:50   ` Jan Engelhardt
2011-01-16 12:39     ` Arthur Marsh
     [not found]     ` <4D32E3BA.5040008@internode.on.net>
2011-01-16 21:17       ` Pablo Neira Ayuso
2011-01-17  1:03         ` Arthur Marsh
2011-01-18  9:38         ` Jarek Poplawski
2011-01-18 10:07           ` David Miller
2011-01-18 10:24             ` Jarek Poplawski
2011-01-18 14:05               ` jamal
2011-01-18 14:07                 ` jamal
2011-01-18 17:22                   ` Jarek Poplawski
2011-01-18 18:11                 ` Jarek Poplawski
2011-01-18 20:39                 ` David Miller
2011-01-18 20:31               ` Pablo Neira Ayuso
2011-01-18 20:50                 ` David Miller
2011-01-19 17:42                   ` Pablo Neira Ayuso
2011-01-19 21:34                     ` David Miller
2011-01-18 20:55                 ` Jarek Poplawski
2011-01-19 14:28                   ` jamal
2011-01-19 16:54                     ` Jarek Poplawski
2011-01-19 16:59                       ` jamal
2011-01-19 17:19                         ` Jarek Poplawski
2011-01-19 17:33                         ` Jarek Poplawski
2011-01-19 18:04                       ` Jan Engelhardt
2011-01-19 19:24                         ` Jarek Poplawski
2011-01-19 19:47                           ` Jan Engelhardt
2011-01-19 20:12                             ` Jarek Poplawski
2011-01-18 21:14                 ` Jarek Poplawski
2011-01-19 14:53                   ` Pablo Neira Ayuso
2011-01-19 16:18                     ` Jarek Poplawski
2011-01-18 17:23 Jarek Poplawski
2011-01-18 18:10 ` Alessandro Suardi
2011-01-18 18:23   ` Jarek Poplawski
2011-01-18 18:24   ` Jan Engelhardt
2011-01-18 18:28     ` Jarek Poplawski
2011-01-18 18:47       ` Jarek Poplawski
2011-01-18 19:26         ` Alessandro Suardi
2011-01-18 20:07           ` Jarek Poplawski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.