* What does nflog_unbind_pf actually do?
@ 2011-01-25 12:54 Helmut Grohne
  2011-02-03 12:00 ` Helmut Grohne
  0 siblings, 1 reply; 8+ messages in thread
From: Helmut Grohne @ 2011-01-25 12:54 UTC (permalink / raw)
  To: netfilter

Hi,

I was wondering what nflog_unbind_pf actually does. The doxygen comment
suggests it to be a harmless setup function acting on a given handle:

libnetfilter-log src/libnetfilter_log.c:
| /**
|  * nflog_unbind_pf - unbind nflog handler from a protocol family
|  * \param h Netfilter log handle obtained via call to nflog_open()
|  * \param pf protocol family to unbind family from
|  *
|  * Unbinds the given nflog handle from processing packets belonging
|  * to the given protocol family.
|  */

However, the example suggests that the call is not that harmless:

libnetfilter-log util/nfulnl_test.c:
| #ifdef INSANE
|         /* norally, applications SHOULD NOT issue this command,
|          * since it detaches other programs/sockets from AF_INET, too ! */
|         printf("unbinding from AF_INET\n");
|         nflog_unbind_pf(h, AF_INET);
| #endif

So far so good, but why does util/nfulnl_test.c call nflog_unbind_pf in the
setup code then?

Trying to find out what it actually does I dug into the kernel and discovered
that nf_log_unbind_pf in fact does not operate on a handle but on some global
state! (See linux net/netfilter/nf_log.c) Still I have no idea what it is
supposed to do.

As a result I experimented a bit to see what happens. Leaving out the
nflog_unbind_pf in util/nfulnl_test.c results in the nflog_bind_pf to
fail. I'd attribute this to some double binding. Removing both
nflog_unbind_pf and nflog_bind_pf simply results in no packets being
received at all.

Why am I interested in this you may ask. I am trying to start multiple
logging daemons, one for each nflog group. The rationale behind this
design is that the kernel will not report packets for multiple groups in
one recv from the netlink socket. Processing multiple groups in one
daemon therefore has no benefit when it comes to reducing system calls.
Using multiple daemons however can distribute the load to multiple CPUs
which is a clear benefit. (Note that threads are not an option, because
the library is not thread safe.) Now when I start multiple daemons
simultaneously they randomly fail and the culprit seems to be the
interference of the pf binding and unbinding calls.
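
One possible way to keep simultaneously started daemons from interfering
with each other is to serialize the global setup step with an advisory
file lock. The following is only a sketch under assumptions:
run_setup_locked(), setup_fn, count_setup() and the lock file path are
hypothetical names, and the callback merely stands in for whatever
performs the nflog_bind_pf() call. It only helps among processes that
cooperate by taking the same lock.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical callback type: performs the global one-time setup
 * (for example the nflog_bind_pf() call) and returns 0 on success. */
typedef int (*setup_fn)(void *arg);

/* Run setup() while holding an exclusive advisory lock on lock_path, so
 * that daemons started at the same time do the setup one after another.
 * Returns the result of setup(), or -1 if the lock cannot be taken. */
int run_setup_locked(const char *lock_path, setup_fn setup, void *arg)
{
	int fd = open(lock_path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX) < 0) {	/* blocks until the lock is free */
		close(fd);
		return -1;
	}
	int rc = setup(arg);
	flock(fd, LOCK_UN);
	close(fd);
	return rc;
}

/* Stand-in setup routine for illustration: counts how often it ran. */
int count_setup(void *arg)
{
	++*(int *)arg;
	return 0;
}
```

Each daemon would call run_setup_locked() with the same lock path before
binding its own group. The lock does not make the kernel state any less
global, it only orders the accesses.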

Helmut


* Re: What does nflog_unbind_pf actually do?
  2011-01-25 12:54 What does nflog_unbind_pf actually do? Helmut Grohne
@ 2011-02-03 12:00 ` Helmut Grohne
  2011-02-03 13:27   ` Pablo Neira Ayuso
  0 siblings, 1 reply; 8+ messages in thread
From: Helmut Grohne @ 2011-02-03 12:00 UTC (permalink / raw)
  To: netfilter

Thanks to Florian Westphal (fw on Freenode) for helping me sort this
out.

On Tue, Jan 25, 2011 at 01:54:27PM +0100, Helmut Grohne wrote:
> I was wondering what nflog_unbind_pf actually does. The doxygen comment
> suggests it to be a harmless setup function acting on a given handle:
> 
> libnetfilter-log src/libnetfilter_log.c:
> | /**
> |  * nflog_unbind_pf - unbind nflog handler from a protocol family
> |  * \param h Netfilter log handle obtained via call to nflog_open()
> |  * \param pf protocol family to unbind family from
> |  *
> |  * Unbinds the given nflog handle from processing packets belonging
> |  * to the given protocol family.
> |  */

This comment is indeed very misleading. Actually the passed handle plays
no role in the modification apart from providing access. The NFLOG
iptables target has different ways to log packets. Currently the only
logger is netlink. The state can be observed by examining
/proc/net/netfilter/nf_log. This file maps protocol numbers to loggers.
So nflog_{,un}bind_pf really modifies a global and persistent kernel
data structure. The default logger is "NONE" or "NULL" which means no
logging, so it has to be set up once. Trying to do so in parallel will
result in race conditions.
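
To check the current state before touching it, a daemon could read
/proc/net/netfilter/nf_log and see whether a logger is already set for
its protocol family. The sketch below parses a single line of that file.
It assumes each line looks like " 2 nfnetlink_log (nfnetlink_log)", that
is, the protocol number, the active logger and a parenthesized list of
available loggers. That layout is my reading of the file and not a
documented interface. The struct and function names are made up for
illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct nf_log_entry {
	int  pf;		/* protocol family number, e.g. 2 for AF_INET */
	char logger[64];	/* name of the active logger */
};

/* Parse one line of /proc/net/netfilter/nf_log (assumed layout:
 * "<pf> <logger> (<available loggers>)").
 * Returns 0 on success, -1 if the line does not match. */
int parse_nf_log_line(const char *line, struct nf_log_entry *out)
{
	if (sscanf(line, "%d %63s", &out->pf, out->logger) != 2)
		return -1;
	return 0;
}

/* True if no logger is bound. Both spellings mentioned above are
 * accepted because I am not sure which one a given kernel prints. */
int nf_log_is_unset(const struct nf_log_entry *e)
{
	return strcmp(e->logger, "NONE") == 0 ||
	       strcmp(e->logger, "NULL") == 0;
}
```

A daemon could feed every line obtained with fgets() to
parse_nf_log_line() and only attempt the bind when nf_log_is_unset()
reports true for its family. Checking and then binding is still not
atomic, so this narrows the race rather than removing it.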

Furthermore, I'd like to remark that if you handle lots of packets, the
in-kernel buffer might be too small. This can result in packets being
dropped, which is signaled by recv returning ENOBUFS. The socket can be
used normally after this error. To avoid this situation, the receive
buffer size can be increased using setsockopt with SO_RCVBUFFORCE.
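
As a rough sketch of that setsockopt call: SO_RCVBUFFORCE requires
CAP_NET_ADMIN, so the helper below falls back to plain SO_RCVBUF, which
the kernel caps at net.core.rmem_max. The helper names set_rcvbuf() and
get_rcvbuf() are mine and not part of any library.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Request a receive buffer of `bytes` bytes on fd.
 * Returns 0 on success, -1 on failure (errno is set). */
int set_rcvbuf(int fd, int bytes)
{
#ifdef SO_RCVBUFFORCE
	/* Privileged variant: may exceed net.core.rmem_max. */
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUFFORCE,
		       &bytes, sizeof(bytes)) == 0)
		return 0;
#endif
	/* Unprivileged fallback: silently capped at rmem_max. */
	return setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes));
}

/* Return the buffer size the kernel actually uses, or -1 on error.
 * Linux reports twice the requested value to account for bookkeeping. */
int get_rcvbuf(int fd)
{
	int bytes = 0;
	socklen_t len = sizeof(bytes);

	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, &len) < 0)
		return -1;
	return bytes;
}
```

With libnetfilter_log the descriptor would come from nflog_fd(h).
Reading the value back with get_rcvbuf() shows whether the request was
capped.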

Helmut


* Re: What does nflog_unbind_pf actually do?
  2011-02-03 12:00 ` Helmut Grohne
@ 2011-02-03 13:27   ` Pablo Neira Ayuso
  2011-02-03 17:24     ` Helmut Grohne
  0 siblings, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2011-02-03 13:27 UTC (permalink / raw)
  To: Helmut Grohne; +Cc: netfilter

On 03/02/11 13:00, Helmut Grohne wrote:
> Thanks to Florian Westphal (fw on Freenode) for helping me sort this
> out.
> 
> On Tue, Jan 25, 2011 at 01:54:27PM +0100, Helmut Grohne wrote:
>> I was wondering what nflog_unbind_pf actually does. The doxygen comment
>> suggests it to be a harmless setup function acting on a given handle:
>>
>> libnetfilter-log src/libnetfilter_log.c:
>> | /**
>> |  * nflog_unbind_pf - unbind nflog handler from a protocol family
>> |  * \param h Netfilter log handle obtained via call to nflog_open()
>> |  * \param pf protocol family to unbind family from
>> |  *
>> |  * Unbinds the given nflog handle from processing packets belonging
>> |  * to the given protocol family.
>> |  */
> 
> This comment is indeed very misleading.

Let's fix it then :-)

> Actually the passed handle plays
> no role in the modification apart from providing access. The NFLOG
> iptables target has different ways to log packets. Currently the only
> logger is netlink. The state can be observed by examining
> /proc/net/netfilter/nf_log. This file maps protocol numbers to loggers.
> So nflog_{,un}bind_pf really modifies a global and persistent kernel
> data structure. The default logger is "NONE" or "NULL" which means no
> logging, so it has to be set up once. Trying to do so in parallel will
> result in race conditions.

Please, would you send me a patch so others can benefit from this
conclusion in the official documentation? I'd appreciate it.

> Furthermore I'd like to remark that if you handle lots of packets the in
> kernel buffer might be too small. This can result in packets being
> dropped which is signaled by ENOBUFS being returned from recv. The
> socket can be used normally after this error. To avoid this situation
> the receive buffer size can be increased using setsockopt
> SO_RCVBUFFORCE.

There are other things that you can do to avoid ENOBUFS. They are
documented in libnetfilter_queue but also apply to libnetfilter_log:

http://www.netfilter.org/projects/libnetfilter_queue/doxygen/

See the performance section; the last two items do not apply to
libnetfilter_log. Another patch for this would be great.


* Re: What does nflog_unbind_pf actually do?
  2011-02-03 13:27   ` Pablo Neira Ayuso
@ 2011-02-03 17:24     ` Helmut Grohne
  2011-02-04  9:56       ` Pablo Neira Ayuso
  0 siblings, 1 reply; 8+ messages in thread
From: Helmut Grohne @ 2011-02-03 17:24 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

On Thu, Feb 03, 2011 at 02:27:28PM +0100, Pablo Neira Ayuso wrote:
> On 03/02/11 13:00, Helmut Grohne wrote:
> > On Tue, Jan 25, 2011 at 01:54:27PM +0100, Helmut Grohne wrote:
> >> libnetfilter-log src/libnetfilter_log.c:
> >> | /**
> >> |  * nflog_unbind_pf - unbind nflog handler from a protocol family
> >> |  * \param h Netfilter log handle obtained via call to nflog_open()
> >> |  * \param pf protocol family to unbind family from
> >> |  *
> >> |  * Unbinds the given nflog handle from processing packets belonging
> >> |  * to the given protocol family.
> >> |  */
> > 
> > This comment is indeed very misleading.
> 
> Let's fix it then :-)
...
> Please, would you send me a patch so others can benefit from this
> conclusion in the official documentation? I'd appreciate it.

I just touched the surface of what is going on here. It would certainly
be better if someone with more oversight would try to fill in this gap.

> There are other things that you can do to avoid ENOBUFS. They are
> documented in libnetfilter_queue but also apply to libnetfilter_log:
> 
> http://www.netfilter.org/projects/libnetfilter_queue/doxygen/
> 
> See performance, the last two items do not apply to libnetfilter_log.
> Another patch for this would be great.

Thanks for the information. Again I'd need more understanding of the
matter before I can create a patch. (For instance I fail to see why
suppressing ENOBUFS would not help the performance.) The performance
issues on my project seem to have solved themselves in the mean time.

Helmut


* Re: What does nflog_unbind_pf actually do?
  2011-02-03 17:24     ` Helmut Grohne
@ 2011-02-04  9:56       ` Pablo Neira Ayuso
  2011-02-10  8:52         ` Helmut Grohne
  0 siblings, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2011-02-04  9:56 UTC (permalink / raw)
  To: netfilter; +Cc: Helmut Grohne

On 03/02/11 18:24, Helmut Grohne wrote:
> On Thu, Feb 03, 2011 at 02:27:28PM +0100, Pablo Neira Ayuso wrote:
>> On 03/02/11 13:00, Helmut Grohne wrote:
>>> On Tue, Jan 25, 2011 at 01:54:27PM +0100, Helmut Grohne wrote:
>>>> libnetfilter-log src/libnetfilter_log.c:
>>>> | /**
>>>> |  * nflog_unbind_pf - unbind nflog handler from a protocol family
>>>> |  * \param h Netfilter log handle obtained via call to nflog_open()
>>>> |  * \param pf protocol family to unbind family from
>>>> |  *
>>>> |  * Unbinds the given nflog handle from processing packets belonging
>>>> |  * to the given protocol family.
>>>> |  */
>>>
>>> This comment is indeed very misleading.
>>
>> Let's fix it then :-)
> ...
>> Please, would you send me a patch so others can benefit from this
>> conclusion in the official documentation? I'd appreciate it.
> 
> I just touched the surface of what is going on here. It would certainly
> be better if someone with more oversight would try to fill in this gap.

If you send a patch, I can complete/mangle it to include and to correct
imprecise information ;-)

>> There are other things that you can do to avoid ENOBUFS. They are
>> documented in libnetfilter_queue but also apply to libnetfilter_log:
>>
>> http://www.netfilter.org/projects/libnetfilter_queue/doxygen/
>>
>> See performance, the last two items do not apply to libnetfilter_log.
>> Another patch for this would be great.
> 
> Thanks for the information. Again I'd need more understanding of the
> matter before I can create a patch. (For instance I fail to see why
> suppressing ENOBUFS would not help the performance.) The performance
> issues on my project seem to have solved themselves in the mean time.

ENOBUFS means that your application is too slow to handle the amount of
messages that the kernel sends to user-space. Basically, the socket
queue between kernel and user-space gets full and the kernel starts
dropping messages.

ENOBUFS is a way to know that you're losing messages. In the particular
case of logging, it's telling that your logging has become not fully
reliable at some point (because some log lines will be missing).
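
For a logging daemon this could translate into a receive loop that
treats ENOBUFS as "some messages were lost" and keeps reading, instead
of treating it as a fatal error. A minimal sketch, with recv_tolerant()
and the overrun counter being hypothetical names chosen for this
example:

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Receive one message from fd. ENOBUFS is counted in *overruns and the
 * read is retried, because the socket remains usable after it. EINTR is
 * retried as well. Any other error is returned as -1 with errno set. */
ssize_t recv_tolerant(int fd, void *buf, size_t len, unsigned long *overruns)
{
	for (;;) {
		ssize_t n = recv(fd, buf, len, 0);

		if (n >= 0)
			return n;
		if (errno == ENOBUFS) {
			/* The kernel dropped messages; note it and go on. */
			++*overruns;
			continue;
		}
		if (errno == EINTR)
			continue;
		return -1;
	}
}
```

A non-zero overrun counter then marks the points at which the log has
gaps, which can be written to the log itself.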

In the case of ctnetlink and nfqueue, the general interpretation of
ENOBUFS is the same, but the action that user-space has to do to handle
the situation is different.

Increasing the socket queue (buffer) helps in the case of ENOBUFS, but
under high stress, increasing indefinitely the buffer size is not the
way to go.

Hm, I think I need to write some article on ENOBUFS and netlink.


* Re: What does nflog_unbind_pf actually do?
  2011-02-04  9:56       ` Pablo Neira Ayuso
@ 2011-02-10  8:52         ` Helmut Grohne
  2011-02-11 14:29           ` Pablo Neira Ayuso
  0 siblings, 1 reply; 8+ messages in thread
From: Helmut Grohne @ 2011-02-10  8:52 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter

Hi Pablo,

On Fri, Feb 04, 2011 at 10:56:09AM +0100, Pablo Neira Ayuso wrote:
> On 03/02/11 18:24, Helmut Grohne wrote:
> > Thanks for the information. Again I'd need more understanding of the
> > matter before I can create a patch. (For instance I fail to see why
> > suppressing ENOBUFS would not help the performance.) The performance
> > issues on my project seem to have solved themselves in the mean time.

Rereading what I wrote, I noticed that the sentence in parentheses has a
spurious negation. What I really meant was: For instance I fail to see
why suppressing ENOBUFS would help/improve the performance.

> ENOBUFS means that your application is too slow to handle the amount of
> messages that the kernel sends to user-space. Basically, the socket
> queue between kernel and user-space gets full and the kernel starts
> dropping messages.
> 
> ENOBUFS is a way to know that you're losing messages. In the particular
> case of logging, it's telling that your logging has become not fully
> reliable at some point (because some log lines will be missing).
> 
> In the case of ctnetlink and nfqueue, the general interpretation of
> ENOBUFS is the same, but the action that user-space has to do to handle
> the situation is different.
> 
> Increasing the socket queue (buffer) helps in the case of ENOBUFS, but
> under high stress, increasing indefinitely the buffer size is not the
> way to go.

Yes, I figured the meaning of ENOBUFS and this mitigation with help from
Florian Westphal. Still all your writing does not explain why the
following statement from the Doxygen documentation should be true.

| To improve your libnetfilter_queue application in terms of
| performance, you may consider the following tweaks:
| ...
|  * set NETLINK_NO_ENOBUFS socket option to avoid receiving ENOBUFS
|    errors (requires Linux kernel >= 2.6.30).

I do not need an answer, because my performance issues got solved by
increasing the socket buffer. On the other hand this makes it clear just
how much more knowledge is required to write documentation than I have.

> Hm, I think I need to write some article on ENOBUFS and netlink.

Well, for people who had never heard about it (like me three weeks ago), your
explanations here are quite helpful. Maybe you should add a note to
explicitly clarify that receiving ENOBUFS does not indicate a permanent
breakage of the file descriptor (in contrast to EBADFD).

Also I do wonder why the manual page for recv(2) does not list ENOBUFS
in the list of possible errors. Since posix[1] seems to specify it, it
looks like a bug in the manual page. *sigh*

Helmut

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/recv.html


* Re: What does nflog_unbind_pf actually do?
  2011-02-10  8:52         ` Helmut Grohne
@ 2011-02-11 14:29           ` Pablo Neira Ayuso
  2011-02-14 14:31             ` ENOBUFS missing in man recv(2) [Initially: What does nflog_unbind_pf actually do?] Helmut Grohne
  0 siblings, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2011-02-11 14:29 UTC (permalink / raw)
  To: netfilter

On 10/02/11 09:52, Helmut Grohne wrote:
> Hi Pablo,
> 
> On Fri, Feb 04, 2011 at 10:56:09AM +0100, Pablo Neira Ayuso wrote:
>> On 03/02/11 18:24, Helmut Grohne wrote:
>>> Thanks for the information. Again I'd need more understanding of the
>>> matter before I can create a patch. (For instance I fail to see why
>>> suppressing ENOBUFS would not help the performance.) The performance
>>> issues on my project seem to have solved themselves in the mean time.
> 
> Rereading what I wrote I figured that the sentence in braces has a
> spurious negation. What I really meant was: For instance I fail to see
> why suppressing ENOBUFS would help/improve the performance.

Suppressing ENOBUFS will help for nfqueue but not for nflog. For nfqueue,
we drop network packets if netlink is congested; for that reason,
disabling ENOBUFS is a good idea in that case.

nfqueue drops packets if the application cannot issue a verdict on them.
Otherwise, we may leak packets.

For the nflog case, the packets are not dropped if netlink is congested,
but you miss some log messages.
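
For reference, setting the NETLINK_NO_ENOBUFS option named in the quoted
documentation amounts to one setsockopt call on the netlink socket. This
is only a sketch: set_no_enobufs() is a made-up helper name, and the
SOL_NETLINK fallback definition is there for older C library headers
that lack it.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <unistd.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270		/* missing from older libc headers */
#endif

/* Enable (on != 0) or disable (on == 0) NETLINK_NO_ENOBUFS on a
 * netlink socket. With the option set, recv no longer reports ENOBUFS
 * when messages are dropped. Returns 0 on success, -1 on failure. */
int set_no_enobufs(int fd, int on)
{
	return setsockopt(fd, SOL_NETLINK, NETLINK_NO_ENOBUFS,
			  &on, sizeof(on));
}
```

Following the reasoning above, this fits an nfqueue application. A
logger that wants to know when its log has gaps would leave the option
off.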

>> ENOBUFS means that your application is too slow to handle the amount of
>> messages that the kernel sends to user-space. Basically, the socket
>> queue between kernel and user-space gets full and the kernel starts
>> dropping messages.
>>
>> ENOBUFS is a way to know that you're losing messages. In the particular
>> case of logging, it's telling that your logging has become not fully
>> reliable at some point (because some log lines will be missing).
>>
>> In the case of ctnetlink and nfqueue, the general interpretation of
>> ENOBUFS is the same, but the action that user-space has to do to handle
>> the situation is different.
>>
>> Increasing the socket queue (buffer) helps in the case of ENOBUFS, but
>> under high stress, increasing indefinitely the buffer size is not the
>> way to go.
> 
> Yes, I figured the meaning of ENOBUFS and this mitigation with help from
> Florian Westphal. Still all your writing does not explain why the
> following statement from the Doxygen documentation should be true.
> 
> | To improve your libnetfilter_queue application in terms of
> | performance, you may consider the following tweaks:
> | ...
> |  * set NETLINK_NO_ENOBUFS socket option to avoid receiving ENOBUFS
> |    errors (requires Linux kernel >= 2.6.30).
> 
> I do not need an answer, because my performance issues got solved by
> increasing the socket buffer. On the other hand this makes it clear just
> how much more knowledge is required to write documentation than I have.

The answer is above.

>> Hm, I think I need to write some article on ENOBUFS and netlink.
> 
> Well, for people who had never heard about it (like me three weeks ago), your
> explanations here are quite helpful. Maybe you should add a note to
> explicitly clarify that receiving ENOBUFS does not indicate a permanent
> breakage of the file descriptor (in contrast to EBADFD).

Yes, I can do that, but it is less work for me if others send me a patch
for that :-)

And yes, some more documentation on ENOBUFS in the library may come in
handy. But someone would have to do it.

To know more about netlink, you may want to read:
http://1984.lsi.us.es/~pablo/docs/spae.pdf

> Also I do wonder why the manual page for recv(2) does not list ENOBUFS
> in the list of possible errors. Since posix[1] seems to specify it, it
> looks like a bug in the manual page. *sigh*

This is quite netlink specific and nobody probably sent a patch for it
so far. I encourage you to send a patch to the manpage maintainers. This
is how things work, it's up to you to help others to fix this situation.


* ENOBUFS missing in man recv(2) [Initially: What does nflog_unbind_pf actually do?]
  2011-02-11 14:29           ` Pablo Neira Ayuso
@ 2011-02-14 14:31             ` Helmut Grohne
  0 siblings, 0 replies; 8+ messages in thread
From: Helmut Grohne @ 2011-02-14 14:31 UTC (permalink / raw)
  To: Pablo Neira Ayuso, mtk.manpages; +Cc: netfilter, linux-man

On Fri, Feb 11, 2011 at 03:29:36PM +0100, Pablo Neira Ayuso wrote:
> On 10/02/11 09:52, Helmut Grohne wrote:
> > Also I do wonder why the manual page for recv(2) does not list ENOBUFS
> > in the list of possible errors. Since posix[1] seems to specify it, it
> > looks like a bug in the manual page. *sigh*
> 
> This is quite netlink specific and nobody probably sent a patch for it
> so far. I encourage you to send a patch to the manpage maintainers. This
> is how things work, it's up to you to help others to fix this situation.

The reference [1] in my previous mail referenced
http://pubs.opengroup.org/onlinepubs/009695399/functions/recv.html.

Pablo, instead of complaining about missing patches, you could comment on the
other patch I sent in. I do send patches when I am confident that I understood
things. This just happens not to be the netfilter-log library.

Let me propose the addition at the end of this email to the recv(2) manual
page.

Helmut

--- recv.2.orig 2011-02-14 15:05:49.000000000 +0100
+++ recv.2      2011-02-14 15:26:13.000000000 +0100
@@ -425,6 +425,17 @@
 Invalid argument passed.
 .\" e.g., msg_namelen < 0 for recvmsg() or addrlen < 0 for recvfrom()
 .TP
+.B ENOBUFS
+A positive number of messages was dropped due to insufficient socket
+buffer space. On Linux this can occur when operating on netlink
+sockets. The
+.BR SO_RCVBUF
+and
+.BR SO_RCVBUFFORCE
+socket options described in
+.BR socket (7)
+can be used to change the socket buffer size.
+.TP
 .B ENOMEM
 Could not allocate memory for
 .BR recvmsg ().

