All of lore.kernel.org
 help / color / mirror / Atom feed
* : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
       [not found]   ` <MWHPR16MB1502DC55542EBA8121631B16EDB20@MWHPR16MB1502.namprd16.prod.outlook.com>
@ 2018-12-31  6:20     ` Mohandass, Roobesh
  2019-01-07 11:17       ` Florian Westphal
       [not found]     ` <MWHPR16MB1502A336CCB4EF3F0FCB69ECEDB20@MWHPR16MB1502.namprd16.prod.outlook.com>
  1 sibling, 1 reply; 17+ messages in thread
From: Mohandass, Roobesh @ 2018-12-31  6:20 UTC (permalink / raw)
  To: netdev

(email is bouncing due to URL in the body of the message, so using hash instead).

Hi Netdev,

Issue: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP

Using below version of,
~# lsb_release -rd
Description: Ubuntu 18.04.1 LTS
Release: 18.04

What you expected to happen:
getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) - should return always destination IP(connected IP).

What happened instead:
getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP. Using getsockname() instead looks like solving the issue.

For just an example: 
Out of 6569124 requests , 4 requests were wrong 0.000060891 % (this is just an rough estimate to give you idea on frequency)

Some old reference: (similar behavior observed) 
MD5: 60059ae4a3d912d4745ca66630b9b949   (convert this hash into string for URL).

Side Note: (about how we identified this behavior)
We were using haproxy-version 1.8.14-1 which is using this kernel function getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) to get the destination IP details/connected address details. But instead we are getting very occasional source IP address instead of destination(Connected IP).

Thanks for your attention on this message.

With kind regards,
RGM(Roobesh Mohandass)
Cloud Security Platform

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
       [not found]     ` <MWHPR16MB1502A336CCB4EF3F0FCB69ECEDB20@MWHPR16MB1502.namprd16.prod.outlook.com>
@ 2019-01-07  7:58       ` Mohandass, Roobesh
  0 siblings, 0 replies; 17+ messages in thread
From: Mohandass, Roobesh @ 2019-01-07  7:58 UTC (permalink / raw)
  To: netdev

Hi Netdev,

Any help, much appreciated.

-Roobesh G M

From: Mohandass, Roobesh 
Sent: Monday, December 31, 2018 1:13 PM
To: netdev@vger.kernel.org
Cc: Willy Tarreau <w@1wt.eu>
Subject: RE: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP

(email is bouncing due to URL in the body of the message and my signature adjusted, so using hash instead).

Hi Netdev,

Issue: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP

Using below version of,
~# lsb_release -rd
Description: Ubuntu 18.04.1 LTS
Release: 18.04

What you expected to happen:
getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) - should return always destination IP(connected IP).

What happened instead:
getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP. Using getsockname() instead looks like solving the issue.

For just an example: 
Out of 6569124 requests , 4 requests were wrong 0.000060891 % (this is just an rough estimate to give you idea on frequency)

Some old reference: (similar behavior observed) 
Base64 encode: aHR0cHM6Ly9zdGFja292ZXJmbG93LmNvbS9xdWVzdGlvbnMvMTE0MTcxODcvZ2V0c29ja29wdC1zby1vcmlnaW5hbC1kc3Qtb2NjYXNpb25hbGx5LXJldHVybnMtY2xpZW50LWFkZHJlc3M=
(use above to convert URL for reference).

Side Note: (about how we identified this behavior)
We were using haproxy-version 1.8.14-1 which is using this kernel function getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) to get the destination IP details/connected address details. But instead we are getting very occasional source IP address instead of destination(Connected IP).

Thanks for your attention on this message.

With kind regards,
RGM(Roobesh Mohandass)
Cloud Security Platform

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
  2018-12-31  6:20     ` : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP Mohandass, Roobesh
@ 2019-01-07 11:17       ` Florian Westphal
  2019-01-07 14:57         ` Mohandass, Roobesh
  2019-01-12 14:37           ` [NETDEV]: " Lukas Tribus
  0 siblings, 2 replies; 17+ messages in thread
From: Florian Westphal @ 2019-01-07 11:17 UTC (permalink / raw)
  To: Mohandass, Roobesh; +Cc: netdev

Mohandass, Roobesh <Roobesh_Mohandass@McAfee.com> wrote:
> Issue: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP

SO_ORIGINAL_DST returns the original address (before DNAT/REDIRECT rewrite)
that conntrack believes to be the originator.

> What you expected to happen:
> getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) - should return always destination IP(connected IP).
> 
> What happened instead:
> getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP. Using getsockname() instead looks like solving the issue.
> 
> For just an example: 
> Out of 6569124 requests , 4 requests were wrong 0.000060891 % (this is just an rough estimate to give you idea on frequency)
> 
> Some old reference: (similar behavior observed) 
> https://stackoverflow.com/questions/11417187/getsockopt-so-original-dst-occasionally-returns-client-address

I ask exactly same question as in your url.
If you use TPROXY, why do you use SO_ORIGINAL_DST?

Its only required with -j REDIRECT.

If you need to use REDIRECT for some reason, make sure you limit
REDIRECT to -p tcp --syn to prevent it from triggering on other packets.

You might also want to turn off midstream pickup (sysctl
nf_conntrack_tcp_loose=0) in conntrack.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
  2019-01-07 11:17       ` Florian Westphal
@ 2019-01-07 14:57         ` Mohandass, Roobesh
  2019-01-12 14:37           ` [NETDEV]: " Lukas Tribus
  1 sibling, 0 replies; 17+ messages in thread
From: Mohandass, Roobesh @ 2019-01-07 14:57 UTC (permalink / raw)
  To: Florian Westphal, Willy Tarreau; +Cc: netdev

+Willy 

Willy,

Do you mind looking into Florian comments below ? 

-Roobesh G M

-----Original Message-----
From: Florian Westphal <fw@strlen.de> 
Sent: Monday, January 7, 2019 4:48 PM
To: Mohandass, Roobesh <Roobesh_Mohandass@McAfee.com>
Cc: netdev@vger.kernel.org
Subject: Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP

This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.

Mohandass, Roobesh <Roobesh_Mohandass@McAfee.com> wrote:
> Issue: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact 
> sometimes returning the source IP instead the destination IP

SO_ORIGINAL_DST returns the original address (before DNAT/REDIRECT rewrite) that conntrack believes to be the originator.

> What you expected to happen:
> getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) - should return always destination IP(connected IP).
> 
> What happened instead:
> getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP. Using getsockname() instead looks like solving the issue.
> 
> For just an example: 
> Out of 6569124 requests , 4 requests were wrong 0.000060891 % (this is 
> just an rough estimate to give you idea on frequency)
> 
> Some old reference: (similar behavior observed) 
> https://stackoverflow.com/questions/11417187/getsockopt-so-original-ds
> t-occasionally-returns-client-address

I ask exactly same question as in your url.
If you use TPROXY, why do you use SO_ORIGINAL_DST?

Its only required with -j REDIRECT.

If you need to use REDIRECT for some reason, make sure you limit REDIRECT to -p tcp --syn to prevent it from triggering on other packets.

You might also want to turn off midstream pickup (sysctl
nf_conntrack_tcp_loose=0) in conntrack.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
@ 2019-01-12 14:37           ` Lukas Tribus
  0 siblings, 0 replies; 17+ messages in thread
From: Lukas Tribus @ 2019-01-12 14:37 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Mohandass Roobesh, Willy Tarreau, netdev

Hello,


>> getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes
>> returning the source IP instead the destination IP. Using getsockname()
>> instead looks like solving the issue.
>>
>> For just an example:
>> Out of 6569124 requests , 4 requests were wrong 0.000060891 % (this is just
>> an rough estimate to give you idea on frequency)
>>
>> Some old reference: (similar behavior observed)
>> stackoverflow-link (see earlier)
>
> I ask exactly same question as in your url.
> If you use TPROXY, why do you use SO_ORIGINAL_DST?
>
> Its only required with -j REDIRECT.

TPROXY is not used in the OP's scenario. The stackoverflow question was
an example.

The application (haproxy) needs to know the original destination IP
address, however it does not know whether -j REDIRECT was used or not.
Because of this the application always queries SO_ORIGINAL_DST, and
this includes configurations without -j REDIRECT.

Are you saying the behavior of SO_ORIGINAL_DST is undefined when not
used with -j REDIRECT and that this issue does not happen when -j
REDIRECT is actually used?

That would mean the application would have to expose a new configuration
knob to the user, and require the user to toggle the knob based on
-j REDIRECT usage.



Thanks,
Lukas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
@ 2019-01-12 14:37           ` Lukas Tribus
  0 siblings, 0 replies; 17+ messages in thread
From: Lukas Tribus @ 2019-01-12 14:37 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Mohandass Roobesh, Willy Tarreau, netdev

Hello,


>> getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes
>> returning the source IP instead the destination IP. Using getsockname()
>> instead looks like solving the issue.
>>
>> For just an example:
>> Out of 6569124 requests , 4 requests were wrong 0.000060891 % (this is just
>> an rough estimate to give you idea on frequency)
>>
>> Some old reference: (similar behavior observed)
>> stackoverflow-link (see earlier)
>
> I ask exactly same question as in your url.
> If you use TPROXY, why do you use SO_ORIGINAL_DST?
>
> Its only required with -j REDIRECT.

TPROXY is not used in the OP's scenario. The stackoverflow question was
an example.

The application (haproxy) needs to know the original destination IP
address, however it does not know whether -j REDIRECT was used or not.
Because of this the application always queries SO_ORIGINAL_DST, and
this includes configurations without -j REDIRECT.

Are you saying the behavior of SO_ORIGINAL_DST is undefined when not
used with -j REDIRECT and that this issue does not happen when -j
REDIRECT is actually used?

That would mean the application would have to expose a new configuration
knob to the user, and require the user to toggle the knob based on
-j REDIRECT usage.



Thanks,
Lukas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
@ 2019-01-12 16:04             ` Florian Westphal
  0 siblings, 0 replies; 17+ messages in thread
From: Florian Westphal @ 2019-01-12 16:04 UTC (permalink / raw)
  To: Lukas Tribus; +Cc: Florian Westphal, Mohandass Roobesh, Willy Tarreau, netdev

Lukas Tribus <lists@ltri.eu> wrote:
> The application (haproxy) needs to know the original destination IP
> address, however it does not know whether -j REDIRECT was used or not.
> Because of this the application always queries SO_ORIGINAL_DST, and
> this includes configurations without -j REDIRECT.
> 
> Are you saying the behavior of SO_ORIGINAL_DST is undefined when not
> used with -j REDIRECT and that this issue does not happen when -j
> REDIRECT is actually used?

No, thats not what I said.  Because OP provided a link that mentions
TPROXY, I concluded OP was using TPROXY, so I pointed out that the
error source can be completely avoided by not using SO_ORIGINAL_DST.

As I said, SO_ORIGINAL_DST returns the dst address of
the original direction *as seen by conntrack*.

In case REDIRECT or DNAT was used, the address returned is the on-wire
one, before DNAT rewrite took place.

Therefore, SO_ORIGINAL_DST is only needed when REDIRECT or DNAT was
used. If no DNAT rewrite takes place, sockaddr returned by accept or
getsockname can be used directly and SO_ORIGINAL_DST isn't needed.
The returned address should be identical to the one given by accept().

If SO_ORIGINAL_DST returns the reply, then conntrack picked up
a reply packet as the first packet of the connection, so it believes
originator is the responder and vice versa.

One case where this can happen is when nf_conntrack_tcp_loose
(mid-stream pickup) is enabled.

This is not a haproxy bug.

Only thing that haproxy could is to provide a knob to make it only
use addresses returned by accept, rather than relying on
SO_ORIGINAL_DST for those that use TPROXY to do MITM interception.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
@ 2019-01-12 16:04             ` Florian Westphal
  0 siblings, 0 replies; 17+ messages in thread
From: Florian Westphal @ 2019-01-12 16:04 UTC (permalink / raw)
  To: Lukas Tribus; +Cc: Florian Westphal, Mohandass Roobesh, Willy Tarreau, netdev

Lukas Tribus <lists@ltri.eu> wrote:
> The application (haproxy) needs to know the original destination IP
> address, however it does not know whether -j REDIRECT was used or not.
> Because of this the application always queries SO_ORIGINAL_DST, and
> this includes configurations without -j REDIRECT.
> 
> Are you saying the behavior of SO_ORIGINAL_DST is undefined when not
> used with -j REDIRECT and that this issue does not happen when -j
> REDIRECT is actually used?

No, thats not what I said.  Because OP provided a link that mentions
TPROXY, I concluded OP was using TPROXY, so I pointed out that the
error source can be completely avoided by not using SO_ORIGINAL_DST.

As I said, SO_ORIGINAL_DST returns the dst address of
the original direction *as seen by conntrack*.

In case REDIRECT or DNAT was used, the address returned is the on-wire
one, before DNAT rewrite took place.

Therefore, SO_ORIGINAL_DST is only needed when REDIRECT or DNAT was
used. If no DNAT rewrite takes place, sockaddr returned by accept or
getsockname can be used directly and SO_ORIGINAL_DST isn't needed.
The returned address should be identical to the one given by accept().

If SO_ORIGINAL_DST returns the reply, then conntrack picked up
a reply packet as the first packet of the connection, so it believes
originator is the responder and vice versa.

One case where this can happen is when nf_conntrack_tcp_loose
(mid-stream pickup) is enabled.

This is not a haproxy bug.

Only thing that haproxy could is to provide a knob to make it only
use addresses returned by accept, rather than relying on
SO_ORIGINAL_DST for those that use TPROXY to do MITM interception.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
@ 2019-01-12 17:26               ` Willy Tarreau
  0 siblings, 0 replies; 17+ messages in thread
From: Willy Tarreau @ 2019-01-12 17:26 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Lukas Tribus, Mohandass Roobesh, netdev

Hi Florian!

Sorry for the lag, I was busy with other issues.

On Sat, Jan 12, 2019 at 05:04:00PM +0100, Florian Westphal wrote:
> Lukas Tribus <lists@ltri.eu> wrote:
> > The application (haproxy) needs to know the original destination IP
> > address, however it does not know whether -j REDIRECT was used or not.
> > Because of this the application always queries SO_ORIGINAL_DST, and
> > this includes configurations without -j REDIRECT.
> > 
> > Are you saying the behavior of SO_ORIGINAL_DST is undefined when not
> > used with -j REDIRECT and that this issue does not happen when -j
> > REDIRECT is actually used?
> 
> No, thats not what I said.  Because OP provided a link that mentions
> TPROXY, I concluded OP was using TPROXY, so I pointed out that the
> error source can be completely avoided by not using SO_ORIGINAL_DST.
> 
> As I said, SO_ORIGINAL_DST returns the dst address of
> the original direction *as seen by conntrack*.
> 
> In case REDIRECT or DNAT was used, the address returned is the on-wire
> one, before DNAT rewrite took place.
>
> Therefore, SO_ORIGINAL_DST is only needed when REDIRECT or DNAT was
> used. If no DNAT rewrite takes place, sockaddr returned by accept or
> getsockname can be used directly and SO_ORIGINAL_DST isn't needed.
> The returned address should be identical to the one given by accept().

That's also the way we're using it :

     http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/proto_tcp.c;h=52a4691d2ba93aec93a3e8a0edd1e90d93de968d;hb=HEAD#l600

More precisely if getsockopt() fails we only use getsockname(). It
happens that this code is very old (2002) and used to work with kernel
2.4's TPROXY which did rely on conntrack, hence the ifdef including
TPROXY. But it's irrelevant here, with a modern TPROXY the getsockopt()
here will usually fail, and the result will come from getsockname()
only.

When seeing the (old) haproxy code there, I've been thinking about
another possibility. Right now haproxy does getsockname() then *tries*
getsockopt(). In the unlikely case getsockopt() would modify part of
the address already returned by getsockname(), it could leave an
incorrect address there. But such an issue would require partial uses
of copy_to_user() which doesn't make much sense, and getorigdst()
doesn't do this either. So this one was ruled out.

I've been wondering if it was possible that from time to time, this
getsockopt() accidently succeeds and returns something wrong, maybe
due to a race with another thread, or maybe because it would return
an uninitialized field which happened to see the source address
previously. The very low amount of occurrence makes me think it could
possibly happen only upon certain error conditions. But looking at
getorigdst(), I hardly imagine how this would happen.

> If SO_ORIGINAL_DST returns the reply, then conntrack picked up
> a reply packet as the first packet of the connection, so it believes
> originator is the responder and vice versa.
> 
> One case where this can happen is when nf_conntrack_tcp_loose
> (mid-stream pickup) is enabled.

Very interesting case indeed, I hadn't thought about it! I think we
don't have enough info from the original reporter's setup but it
would definitely make sense and explain why it's the other end which
is retrieved!

I'm seeing one possibility to explain this : Let's say the OP's setup
has a short conntrack timeout and a longer haproxy timeout. If the
address is only retrieved for logging, it will be retrieved at the
end of the connection. Let's assume haproxy receives a request from
a client, a conntrack entry is created and haproxy forwards the request
to a very slow server. Before the server responds, the conntrack entry
expires, then the server responds and haproxy forwards to the client,
re-creating the entry and hitting this case before the address is
picked up for logging.

Roobesh, do you use the destination address only for logging or
anywhere else in the request path ? And could you check if you have
nf_conntrack_tcp_loose set as Florian suggests ? I really think he
figured it right.

If that's the problem, I think we can address it with documentation :
first, warn about the use case, second, explain how to store the
address into a variable once the connection is accepted since it will
not have expired and will still be valid at this moment.

> This is not a haproxy bug.
> 
> Only thing that haproxy could is to provide a knob to make it only
> use addresses returned by accept, rather than relying on
> SO_ORIGINAL_DST for those that use TPROXY to do MITM interception.

Since it's the destination we get it using getsockname(), not accept(),
but yeah maybe we'd benefit from having a tunable to disable the use of
getsockopt(SO_ORIGINAL_DST). Actually I'd prefer to avoid doing this
until the issue it confirmed because I don't feel comfortable with having
an option saying "disable the use of getsockopt(SO_ORIGINAL_DST) which
may fail once per million due to some obscure reasons". However if it's
confirmed that it indeed is the scenario above that happens, we could
possibly think about adding a setting to work around if the doc approach
is not enough.

Many thanks for your insights, these are exactly the reason I asked
Roobesh to bring the issue here :-)

Cheers,
Willy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
@ 2019-01-12 17:26               ` Willy Tarreau
  0 siblings, 0 replies; 17+ messages in thread
From: Willy Tarreau @ 2019-01-12 17:26 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Lukas Tribus, Mohandass Roobesh, netdev

Hi Florian!

Sorry for the lag, I was busy with other issues.

On Sat, Jan 12, 2019 at 05:04:00PM +0100, Florian Westphal wrote:
> Lukas Tribus <lists@ltri.eu> wrote:
> > The application (haproxy) needs to know the original destination IP
> > address, however it does not know whether -j REDIRECT was used or not.
> > Because of this the application always queries SO_ORIGINAL_DST, and
> > this includes configurations without -j REDIRECT.
> > 
> > Are you saying the behavior of SO_ORIGINAL_DST is undefined when not
> > used with -j REDIRECT and that this issue does not happen when -j
> > REDIRECT is actually used?
> 
> No, thats not what I said.  Because OP provided a link that mentions
> TPROXY, I concluded OP was using TPROXY, so I pointed out that the
> error source can be completely avoided by not using SO_ORIGINAL_DST.
> 
> As I said, SO_ORIGINAL_DST returns the dst address of
> the original direction *as seen by conntrack*.
> 
> In case REDIRECT or DNAT was used, the address returned is the on-wire
> one, before DNAT rewrite took place.
>
> Therefore, SO_ORIGINAL_DST is only needed when REDIRECT or DNAT was
> used. If no DNAT rewrite takes place, sockaddr returned by accept or
> getsockname can be used directly and SO_ORIGINAL_DST isn't needed.
> The returned address should be identical to the one given by accept().

That's also the way we're using it :

     http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/proto_tcp.c;h=52a4691d2ba93aec93a3e8a0edd1e90d93de968d;hb=HEAD#l600

More precisely if getsockopt() fails we only use getsockname(). It
happens that this code is very old (2002) and used to work with kernel
2.4's TPROXY which did rely on conntrack, hence the ifdef including
TPROXY. But it's irrelevant here, with a modern TPROXY the getsockopt()
here will usually fail, and the result will come from getsockname()
only.

When seeing the (old) haproxy code there, I've been thinking about
another possibility. Right now haproxy does getsockname() then *tries*
getsockopt(). In the unlikely case getsockopt() would modify part of
the address already returned by getsockname(), it could leave an
incorrect address there. But such an issue would require partial uses
of copy_to_user() which doesn't make much sense, and getorigdst()
doesn't do this either. So this one was ruled out.

I've been wondering if it was possible that from time to time, this
getsockopt() accidently succeeds and returns something wrong, maybe
due to a race with another thread, or maybe because it would return
an uninitialized field which happened to see the source address
previously. The very low amount of occurrence makes me think it could
possibly happen only upon certain error conditions. But looking at
getorigdst(), I hardly imagine how this would happen.

> If SO_ORIGINAL_DST returns the reply, then conntrack picked up
> a reply packet as the first packet of the connection, so it believes
> originator is the responder and vice versa.
> 
> One case where this can happen is when nf_conntrack_tcp_loose
> (mid-stream pickup) is enabled.

Very interesting case indeed, I hadn't thought about it! I think we
don't have enough info from the original reporter's setup but it
would definitely make sense and explain why it's the other end which
is retrieved!

I'm seeing one possibility to explain this : Let's say the OP's setup
has a short conntrack timeout and a longer haproxy timeout. If the
address is only retrieved for logging, it will be retrieved at the
end of the connection. Let's assume haproxy receives a request from
a client, a conntrack entry is created and haproxy forwards the request
to a very slow server. Before the server responds, the conntrack entry
expires, then the server responds and haproxy forwards to the client,
re-creating the entry and hitting this case before the address is
picked up for logging.

Roobesh, do you use the destination address only for logging or
anywhere else in the request path ? And could you check if you have
nf_conntrack_tcp_loose set as Florian suggests ? I really think he
figured it right.

If that's the problem, I think we can address it with documentation :
first, warn about the use case, second, explain how to store the
address into a variable once the connection is accepted since it will
not have expired and will still be valid at this moment.

> This is not a haproxy bug.
> 
> Only thing that haproxy could is to provide a knob to make it only
> use addresses returned by accept, rather than relying on
> SO_ORIGINAL_DST for those that use TPROXY to do MITM interception.

Since it's the destination we get it using getsockname(), not accept(),
but yeah maybe we'd benefit from having a tunable to disable the use of
getsockopt(SO_ORIGINAL_DST). Actually I'd prefer to avoid doing this
until the issue it confirmed because I don't feel comfortable with having
an option saying "disable the use of getsockopt(SO_ORIGINAL_DST) which
may fail once per million due to some obscure reasons". However if it's
confirmed that it indeed is the scenario above that happens, we could
possibly think about adding a setting to work around if the doc approach
is not enough.

Many thanks for your insights, these are exactly the reason I asked
Roobesh to bring the issue here :-)

Cheers,
Willy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
@ 2019-01-12 18:01                 ` Lukas Tribus
  0 siblings, 0 replies; 17+ messages in thread
From: Lukas Tribus @ 2019-01-12 18:01 UTC (permalink / raw)
  To: Willy Tarreau, Mohandass Roobesh; +Cc: Florian Westphal, netdev

Hello!


On Sat, 12 Jan 2019 at 18:26, Willy Tarreau <w@1wt.eu> wrote:
> > One case where this can happen is when nf_conntrack_tcp_loose
> > (mid-stream pickup) is enabled.
>
> Very interesting case indeed, I hadn't thought about it! I think we
> don't have enough info from the original reporter's setup but it
> would definitely make sense and explain why it's the other end which
> is retrieved!
>
> I'm seeing one possibility to explain this : Let's say the OP's setup
> has a short conntrack timeout and a longer haproxy timeout. If the
> address is only retrieved for logging, it will be retrieved at the
> end of the connection. Let's assume haproxy receives a request from
> a client, a conntrack entry is created and haproxy forwards the request
> to a very slow server. Before the server responds, the conntrack entry
> expires, then the server responds and haproxy forwards to the client,
> re-creating the entry and hitting this case before the address is
> picked up for logging.
>
> Roobesh, do you use the destination address only for logging or
> anywhere else in the request path ? And could you check if you have
> nf_conntrack_tcp_loose set as Florian suggests ? I really think he
> figured it right.

It's about what we send with the PROXY protocol to the backend server,
Roobesh reported things like that (src and dst is the same):

PROXY TCP4 192.220.26.39 192.220.26.39 45066 45066
PROXY TCP4 192.220.26.39 192.220.26.39 45075 45075

So the call would actually happen at the beginning of the TCP connection.

Initial report is here:
https://discourse.haproxy.org/t/send-proxy-not-modifying-some-traffic-with-proxy-ip-port-details/3336


Let's see if disabling nf_conntrack_tcp_loose changes things.



Thanks,
Lukas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
@ 2019-01-12 18:01                 ` Lukas Tribus
  0 siblings, 0 replies; 17+ messages in thread
From: Lukas Tribus @ 2019-01-12 18:01 UTC (permalink / raw)
  To: Willy Tarreau, Mohandass Roobesh; +Cc: Florian Westphal, netdev

Hello!


On Sat, 12 Jan 2019 at 18:26, Willy Tarreau <w@1wt.eu> wrote:
> > One case where this can happen is when nf_conntrack_tcp_loose
> > (mid-stream pickup) is enabled.
>
> Very interesting case indeed, I hadn't thought about it! I think we
> don't have enough info from the original reporter's setup but it
> would definitely make sense and explain why it's the other end which
> is retrieved!
>
> I'm seeing one possibility to explain this : Let's say the OP's setup
> has a short conntrack timeout and a longer haproxy timeout. If the
> address is only retrieved for logging, it will be retrieved at the
> end of the connection. Let's assume haproxy receives a request from
> a client, a conntrack entry is created and haproxy forwards the request
> to a very slow server. Before the server responds, the conntrack entry
> expires, then the server responds and haproxy forwards to the client,
> re-creating the entry and hitting this case before the address is
> picked up for logging.
>
> Roobesh, do you use the destination address only for logging or
> anywhere else in the request path ? And could you check if you have
> nf_conntrack_tcp_loose set as Florian suggests ? I really think he
> figured it right.

It's about what we send with the PROXY protocol to the backend server,
Roobesh reported things like that (src and dst is the same):

PROXY TCP4 192.220.26.39 192.220.26.39 45066 45066
PROXY TCP4 192.220.26.39 192.220.26.39 45075 45075

So the call would actually happen at the beginning of the TCP connection.

Initial report is here:
https://discourse.haproxy.org/t/send-proxy-not-modifying-some-traffic-with-proxy-ip-port-details/3336


Let's see if disabling nf_conntrack_tcp_loose changes things.



Thanks,
Lukas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
@ 2019-01-12 18:33                   ` Willy Tarreau
  0 siblings, 0 replies; 17+ messages in thread
From: Willy Tarreau @ 2019-01-12 18:33 UTC (permalink / raw)
  To: Lukas Tribus; +Cc: Mohandass Roobesh, Florian Westphal, netdev

Hi Lukas,

On Sat, Jan 12, 2019 at 07:01:34PM +0100, Lukas Tribus wrote:
> > Roobesh, do you use the destination address only for logging or
> > anywhere else in the request path ? And could you check if you have
> > nf_conntrack_tcp_loose set as Florian suggests ? I really think he
> > figured it right.
> 
> It's about what we send with the PROXY protocol to the backend server,
> Roobesh reported things like that (src and dst is the same):
> 
> PROXY TCP4 192.220.26.39 192.220.26.39 45066 45066
> PROXY TCP4 192.220.26.39 192.220.26.39 45075 45075
> 
> So the call would actually happen at the beginning of the TCP connection.

That sounds quite shocking to me then. Maybe we're facing such a sequence:

 1) first session comes from this port, and client closes first (FIN)
 2) haproxy then closes with (FIN)
 3) client doesn't respond with the final ACK (e.g. blocked by another
    firewall in between or the client's own conntrack)
 4) the socket on the haproxy side remains in LAST_ACK state and ACKs
    are periodically resent
 5) local conntrack is in TIME_WAIT and expires faster than the largest
    interval between two such ACKs
 6) one of these retransmitted ACKs reopens the connection in reverse
    direction due to nf_conntrack_tcp_loose. The connection is then
    seen in ESTABLISHED state and might be kept a bit longer.
 8) the connection finally expires in the local TCP stack but not yet
    in conntrack.
 7) later the client reuses the same source port while the connection
    is still present in the conntrack table.
 8) assuming tcp_be_liberal is also set, the packets can pass through
    conntrack and establish a new connection to haproxy.
 9) haproxy calls getsockopt(SO_ORIGINAL_DST) and gets the other end
    point since the connection was created at step 6 above in the
    other direction.

I could be wrong on certain specific points above but it looks plausible. 

> Initial report is here:
> https://discourse.haproxy.org/t/send-proxy-not-modifying-some-traffic-with-proxy-ip-port-details/3336

Ah cool, I didn't have all this, thank you!

> Let's see if disabling nf_conntrack_tcp_loose changes things.

Yes this really is the only thing I can think of, and in this case
noone is wrong in this chain (neither kernel nor haproxy). We'll
need to document it in my opinion.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
@ 2019-01-12 18:33                   ` Willy Tarreau
  0 siblings, 0 replies; 17+ messages in thread
From: Willy Tarreau @ 2019-01-12 18:33 UTC (permalink / raw)
  To: Lukas Tribus; +Cc: Mohandass Roobesh, Florian Westphal, netdev

Hi Lukas,

On Sat, Jan 12, 2019 at 07:01:34PM +0100, Lukas Tribus wrote:
> > Roobesh, do you use the destination address only for logging or
> > anywhere else in the request path ? And could you check if you have
> > nf_conntrack_tcp_loose set as Florian suggests ? I really think he
> > figured it right.
> 
> It's about what we send with the PROXY protocol to the backend server,
> Roobesh reported things like that (src and dst is the same):
> 
> PROXY TCP4 192.220.26.39 192.220.26.39 45066 45066
> PROXY TCP4 192.220.26.39 192.220.26.39 45075 45075
> 
> So the call would actually happen at the beginning of the TCP connection.

That sounds quite shocking to me then. Maybe we're facing such a sequence:

 1) first session comes from this port, and client closes first (FIN)
 2) haproxy then closes with (FIN)
 3) client doesn't respond with the final ACK (e.g. blocked by another
    firewall in between or the client's own conntrack)
 4) the socket on the haproxy side remains in LAST_ACK state and ACKs
    are periodically resent
 5) local conntrack is in TIME_WAIT and expires faster than the largest
    interval between two such ACKs
 6) one of these retransmitted ACKs reopens the connection in reverse
    direction due to nf_conntrack_tcp_loose. The connection is then
    seen in ESTABLISHED state and might be kept a bit longer.
 8) the connection finally expires in the local TCP stack but not yet
    in conntrack.
 7) later the client reuses the same source port while the connection
    is still present in the conntrack table.
 8) assuming tcp_be_liberal is also set, the packets can pass through
    conntrack and establish a new connection to haproxy.
 9) haproxy calls getsockopt(SO_ORIGINAL_DST) and gets the other end
    point since the connection was created at step 6 above in the
    other direction.

I could be wrong on certain specific points above but it looks plausible. 

> Initial report is here:
> https://discourse.haproxy.org/t/send-proxy-not-modifying-some-traffic-with-proxy-ip-port-details/3336

Ah cool, I didn't have all this, thank you!

> Let's see if disabling nf_conntrack_tcp_loose changes things.

Yes this really is the only thing I can think of, and in this case
noone is wrong in this chain (neither kernel nor haproxy). We'll
need to document it in my opinion.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
  2019-01-12 18:33                   ` [NETDEV]: " Willy Tarreau
  (?)
@ 2019-01-17  5:23                   ` Mohandass, Roobesh
  2019-01-23  8:07                     ` Mohandass, Roobesh
  -1 siblings, 1 reply; 17+ messages in thread
From: Mohandass, Roobesh @ 2019-01-17  5:23 UTC (permalink / raw)
  To: Willy Tarreau, Lukas Tribus; +Cc: Florian Westphal, netdev

[-- Attachment #1: Type: text/plain, Size: 3113 bytes --]

Hi Willy/Florian/Lukas,

Thanks for your help around this concern, sorry for the delayed response. I will test this out and get back to you.

-Roobesh G M

-----Original Message-----
From: Willy Tarreau <w@1wt.eu> 
Sent: Sunday, January 13, 2019 12:04 AM
To: Lukas Tribus <lists@ltri.eu>
Cc: Mohandass, Roobesh <Roobesh_Mohandass@McAfee.com>; Florian Westphal <fw@strlen.de>; netdev@vger.kernel.org
Subject: Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP

This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.

Hi Lukas,

On Sat, Jan 12, 2019 at 07:01:34PM +0100, Lukas Tribus wrote:
> > Roobesh, do you use the destination address only for logging or 
> > anywhere else in the request path ? And could you check if you have 
> > nf_conntrack_tcp_loose set as Florian suggests ? I really think he 
> > figured it right.
> 
> It's about what we send with the PROXY protocol to the backend server, 
> Roobesh reported things like that (src and dst is the same):
> 
> PROXY TCP4 192.220.26.39 192.220.26.39 45066 45066 PROXY TCP4 
> 192.220.26.39 192.220.26.39 45075 45075
> 
> So the call would actually happen at the beginning of the TCP connection.

That sounds quite shocking to me then. Maybe we're facing such a sequence:

 1) first session comes from this port, and client closes first (FIN)
 2) haproxy then closes with (FIN)
 3) client doesn't respond with the final ACK (e.g. blocked by another
    firewall in between or the client's own conntrack)
 4) the socket on the haproxy side remains in LAST_ACK state and ACKs
    are periodically resent
 5) local conntrack is in TIME_WAIT and expires faster than the largest
    interval between two such ACKs
 6) one of these retransmitted ACKs reopens the connection in reverse
    direction due to nf_conntrack_tcp_loose. The connection is then
    seen in ESTABLISHED state and might be kept a bit longer.
 8) the connection finally expires in the local TCP stack but not yet
    in conntrack.
 7) later the client reuses the same source port while the connection
    is still present in the conntrack table.
 8) assuming tcp_be_liberal is also set, the packets can pass through
    conntrack and establish a new connection to haproxy.
 9) haproxy calls getsockopt(SO_ORIGINAL_DST) and gets the other end
    point since the connection was created at step 6 above in the
    other direction.

I could be wrong on certain specific points above but it looks plausible. 

> Initial report is here:
> https://discourse.haproxy.org/t/send-proxy-not-modifying-some-traffic-
> with-proxy-ip-port-details/3336

Ah cool, I didn't have all this, thank you!

> Let's see if disabling nf_conntrack_tcp_loose changes things.

Yes this really is the only thing I can think of, and in this case noone is wrong in this chain (neither kernel nor haproxy). We'll need to document it in my opinion.

Thanks,
Willy

[-- Attachment #2: Type: message/rfc822, Size: 11046 bytes --]

From: Willy Tarreau <w@1wt.eu>
To: Florian Westphal <fw@strlen.de>
Cc: Lukas Tribus <lists@ltri.eu>, "Mohandass, Roobesh" <Roobesh_Mohandass@McAfee.com>, "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
Date: Sat, 12 Jan 2019 17:26:36 +0000
Message-ID: <20190112172636.GA26639@1wt.eu>

This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.

Hi Florian!

Sorry for the lag, I was busy with other issues.

On Sat, Jan 12, 2019 at 05:04:00PM +0100, Florian Westphal wrote:
> Lukas Tribus <lists@ltri.eu> wrote:
> > The application (haproxy) needs to know the original destination IP
> > address, however it does not know whether -j REDIRECT was used or not.
> > Because of this the application always queries SO_ORIGINAL_DST, and
> > this includes configurations without -j REDIRECT.
> >
> > Are you saying the behavior of SO_ORIGINAL_DST is undefined when not
> > used with -j REDIRECT and that this issue does not happen when -j
> > REDIRECT is actually used?
>
> No, thats not what I said.  Because OP provided a link that mentions
> TPROXY, I concluded OP was using TPROXY, so I pointed out that the
> error source can be completely avoided by not using SO_ORIGINAL_DST.
>
> As I said, SO_ORIGINAL_DST returns the dst address of
> the original direction *as seen by conntrack*.
>
> In case REDIRECT or DNAT was used, the address returned is the on-wire
> one, before DNAT rewrite took place.
>
> Therefore, SO_ORIGINAL_DST is only needed when REDIRECT or DNAT was
> used. If no DNAT rewrite takes place, sockaddr returned by accept or
> getsockname can be used directly and SO_ORIGINAL_DST isn't needed.
> The returned address should be identical to the one given by accept().

That's also the way we're using it :

     http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/proto_tcp.c;h=52a4691d2ba93aec93a3e8a0edd1e90d93de968d;hb=HEAD#l600

More precisely if getsockopt() fails we only use getsockname(). It
happens that this code is very old (2002) and used to work with kernel
2.4's TPROXY which did rely on conntrack, hence the ifdef including
TPROXY. But it's irrelevant here, with a modern TPROXY the getsockopt()
here will usually fail, and the result will come from getsockname()
only.

When seeing the (old) haproxy code there, I've been thinking about
another possibility. Right now haproxy does getsockname() then *tries*
getsockopt(). In the unlikely case getsockopt() would modify part of
the address already returned by getsockname(), it could leave an
incorrect address there. But such an issue would require partial uses
of copy_to_user() which doesn't make much sense, and getorigdst()
doesn't do this either. So this one was ruled out.

I've been wondering if it was possible that from time to time, this
getsockopt() accidently succeeds and returns something wrong, maybe
due to a race with another thread, or maybe because it would return
an uninitialized field which happened to see the source address
previously. The very low amount of occurrence makes me think it could
possibly happen only upon certain error conditions. But looking at
getorigdst(), I hardly imagine how this would happen.

> If SO_ORIGINAL_DST returns the reply, then conntrack picked up
> a reply packet as the first packet of the connection, so it believes
> originator is the responder and vice versa.
>
> One case where this can happen is when nf_conntrack_tcp_loose
> (mid-stream pickup) is enabled.

Very interesting case indeed, I hadn't thought about it! I think we
don't have enough info from the original reporter's setup but it
would definitely make sense and explain why it's the other end which
is retrieved!

I'm seeing one possibility to explain this : Let's say the OP's setup
has a short conntrack timeout and a longer haproxy timeout. If the
address is only retrieved for logging, it will be retrieved at the
end of the connection. Let's assume haproxy receives a request from
a client, a conntrack entry is created and haproxy forwards the request
to a very slow server. Before the server responds, the conntrack entry
expires, then the server responds and haproxy forwards to the client,
re-creating the entry and hitting this case before the address is
picked up for logging.

Roobesh, do you use the destination address only for logging or
anywhere else in the request path ? And could you check if you have
nf_conntrack_tcp_loose set as Florian suggests ? I really think he
figured it right.

If that's the problem, I think we can address it with documentation :
first, warn about the use case, second, explain how to store the
address into a variable once the connection is accepted since it will
not have expired and will still be valid at this moment.

> This is not a haproxy bug.
>
> Only thing that haproxy could is to provide a knob to make it only
> use addresses returned by accept, rather than relying on
> SO_ORIGINAL_DST for those that use TPROXY to do MITM interception.

Since it's the destination we get it using getsockname(), not accept(),
but yeah maybe we'd benefit from having a tunable to disable the use of
getsockopt(SO_ORIGINAL_DST). Actually I'd prefer to avoid doing this
until the issue it confirmed because I don't feel comfortable with having
an option saying "disable the use of getsockopt(SO_ORIGINAL_DST) which
may fail once per million due to some obscure reasons". However if it's
confirmed that it indeed is the scenario above that happens, we could
possibly think about adding a setting to work around if the doc approach
is not enough.

Many thanks for your insights, these are exactly the reason I asked
Roobesh to bring the issue here :-)

Cheers,
Willy

[-- Attachment #3: Type: message/rfc822, Size: 7531 bytes --]

From: Florian Westphal <fw@strlen.de>
To: Lukas Tribus <lists@ltri.eu>
Cc: Florian Westphal <fw@strlen.de>, "Mohandass, Roobesh" <Roobesh_Mohandass@McAfee.com>, Willy Tarreau <w@1wt.eu>, "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
Date: Sat, 12 Jan 2019 16:04:00 +0000
Message-ID: <20190112160400.dblitzk2ftlfzryd@breakpoint.cc>

This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.

Lukas Tribus <lists@ltri.eu> wrote:
> The application (haproxy) needs to know the original destination IP
> address, however it does not know whether -j REDIRECT was used or not.
> Because of this the application always queries SO_ORIGINAL_DST, and
> this includes configurations without -j REDIRECT.
>
> Are you saying the behavior of SO_ORIGINAL_DST is undefined when not
> used with -j REDIRECT and that this issue does not happen when -j
> REDIRECT is actually used?

No, thats not what I said.  Because OP provided a link that mentions
TPROXY, I concluded OP was using TPROXY, so I pointed out that the
error source can be completely avoided by not using SO_ORIGINAL_DST.

As I said, SO_ORIGINAL_DST returns the dst address of
the original direction *as seen by conntrack*.

In case REDIRECT or DNAT was used, the address returned is the on-wire
one, before DNAT rewrite took place.

Therefore, SO_ORIGINAL_DST is only needed when REDIRECT or DNAT was
used. If no DNAT rewrite takes place, sockaddr returned by accept or
getsockname can be used directly and SO_ORIGINAL_DST isn't needed.
The returned address should be identical to the one given by accept().

If SO_ORIGINAL_DST returns the reply, then conntrack picked up
a reply packet as the first packet of the connection, so it believes
originator is the responder and vice versa.

One case where this can happen is when nf_conntrack_tcp_loose
(mid-stream pickup) is enabled.

This is not a haproxy bug.

Only thing that haproxy could is to provide a knob to make it only
use addresses returned by accept, rather than relying on
SO_ORIGINAL_DST for those that use TPROXY to do MITM interception.

[-- Attachment #4: Type: message/rfc822, Size: 7573 bytes --]

From: Lukas Tribus <lists@ltri.eu>
To: Florian Westphal <fw@strlen.de>
Cc: "Mohandass, Roobesh" <Roobesh_Mohandass@McAfee.com>, Willy Tarreau <w@1wt.eu>, "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
Date: Sat, 12 Jan 2019 14:37:11 +0000
Message-ID: <2e83651c-df8b-8341-4170-df328e3d756a@ltri.eu>

This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.

Hello,


>> getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes
>> returning the source IP instead the destination IP. Using getsockname()
>> instead looks like solving the issue.
>>
>> For just an example:
>> Out of 6569124 requests , 4 requests were wrong 0.000060891 % (this is just
>> an rough estimate to give you idea on frequency)
>>
>> Some old reference: (similar behavior observed)
>> stackoverflow-link (see earlier)
>
> I ask exactly same question as in your url.
> If you use TPROXY, why do you use SO_ORIGINAL_DST?
>
> Its only required with -j REDIRECT.

TPROXY is not used in the OP's scenario. The stackoverflow question was
an example.

The application (haproxy) needs to know the original destination IP
address, however it does not know whether -j REDIRECT was used or not.
Because of this the application always queries SO_ORIGINAL_DST, and
this includes configurations without -j REDIRECT.

Are you saying the behavior of SO_ORIGINAL_DST is undefined when not
used with -j REDIRECT and that this issue does not happen when -j
REDIRECT is actually used?

That would mean the application would have to expose a new configuration
knob to the user, and require the user to toggle the knob based on
-j REDIRECT usage.



Thanks,
Lukas


[-- Attachment #5: Type: message/rfc822, Size: 8642 bytes --]

From: Lukas Tribus <lists@ltri.eu>
To: Willy Tarreau <w@1wt.eu>, "Mohandass, Roobesh" <Roobesh_Mohandass@McAfee.com>
Cc: Florian Westphal <fw@strlen.de>, "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
Date: Sat, 12 Jan 2019 18:01:34 +0000
Message-ID: <CACC_My-0bsFPqSaNCsF2bZhNSo_3o1F+qfy8utKRO7-2CZZqYg@mail.gmail.com>

This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.

Hello!


On Sat, 12 Jan 2019 at 18:26, Willy Tarreau <w@1wt.eu> wrote:
> > One case where this can happen is when nf_conntrack_tcp_loose
> > (mid-stream pickup) is enabled.
>
> Very interesting case indeed, I hadn't thought about it! I think we
> don't have enough info from the original reporter's setup but it
> would definitely make sense and explain why it's the other end which
> is retrieved!
>
> I'm seeing one possibility to explain this : Let's say the OP's setup
> has a short conntrack timeout and a longer haproxy timeout. If the
> address is only retrieved for logging, it will be retrieved at the
> end of the connection. Let's assume haproxy receives a request from
> a client, a conntrack entry is created and haproxy forwards the request
> to a very slow server. Before the server responds, the conntrack entry
> expires, then the server responds and haproxy forwards to the client,
> re-creating the entry and hitting this case before the address is
> picked up for logging.
>
> Roobesh, do you use the destination address only for logging or
> anywhere else in the request path ? And could you check if you have
> nf_conntrack_tcp_loose set as Florian suggests ? I really think he
> figured it right.

It's about what we send with the PROXY protocol to the backend server,
Roobesh reported things like that (src and dst is the same):

PROXY TCP4 192.220.26.39 192.220.26.39 45066 45066
PROXY TCP4 192.220.26.39 192.220.26.39 45075 45075

So the call would actually happen at the beginning of the TCP connection.

Initial report is here:
https://discourse.haproxy.org/t/send-proxy-not-modifying-some-traffic-with-proxy-ip-port-details/3336


Let's see if disabling nf_conntrack_tcp_loose changes things.



Thanks,
Lukas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
  2019-01-17  5:23                   ` Mohandass, Roobesh
@ 2019-01-23  8:07                     ` Mohandass, Roobesh
  2019-01-23  8:54                       ` Willy Tarreau
  0 siblings, 1 reply; 17+ messages in thread
From: Mohandass, Roobesh @ 2019-01-23  8:07 UTC (permalink / raw)
  To: Willy Tarreau, Lukas Tribus; +Cc: Florian Westphal, netdev

Hi Willy/Florian/Lukas,

 disabling nf_conntrack_tcp_loose (solved the problem) and for last 24 hours, we are not evening seeing a single wrong connection data by send-proxy.

Worth a note in your documentation related to this, as users might be aware off.

Thanks very much for your help and support, I will report back if I see anything again related to this.

-Roobesh G M

-----Original Message-----
From: Mohandass, Roobesh 
Sent: Thursday, January 17, 2019 10:54 AM
To: Willy Tarreau <w@1wt.eu>; Lukas Tribus <lists@ltri.eu>
Cc: Florian Westphal <fw@strlen.de>; netdev@vger.kernel.org
Subject: RE: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP

Hi Willy/Florian/Lukas,

Thanks for your help around this concern, sorry for the delayed response. I will test this out and get back to you.

-Roobesh G M

-----Original Message-----
From: Willy Tarreau <w@1wt.eu>
Sent: Sunday, January 13, 2019 12:04 AM
To: Lukas Tribus <lists@ltri.eu>
Cc: Mohandass, Roobesh <Roobesh_Mohandass@McAfee.com>; Florian Westphal <fw@strlen.de>; netdev@vger.kernel.org
Subject: Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP

This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.

Hi Lukas,

On Sat, Jan 12, 2019 at 07:01:34PM +0100, Lukas Tribus wrote:
> > Roobesh, do you use the destination address only for logging or 
> > anywhere else in the request path ? And could you check if you have 
> > nf_conntrack_tcp_loose set as Florian suggests ? I really think he 
> > figured it right.
> 
> It's about what we send with the PROXY protocol to the backend server, 
> Roobesh reported things like that (src and dst is the same):
> 
> PROXY TCP4 192.220.26.39 192.220.26.39 45066 45066 PROXY TCP4
> 192.220.26.39 192.220.26.39 45075 45075
> 
> So the call would actually happen at the beginning of the TCP connection.

That sounds quite shocking to me then. Maybe we're facing such a sequence:

 1) first session comes from this port, and client closes first (FIN)
 2) haproxy then closes with (FIN)
 3) client doesn't respond with the final ACK (e.g. blocked by another
    firewall in between or the client's own conntrack)
 4) the socket on the haproxy side remains in LAST_ACK state and ACKs
    are periodically resent
 5) local conntrack is in TIME_WAIT and expires faster than the largest
    interval between two such ACKs
 6) one of these retransmitted ACKs reopens the connection in reverse
    direction due to nf_conntrack_tcp_loose. The connection is then
    seen in ESTABLISHED state and might be kept a bit longer.
 8) the connection finally expires in the local TCP stack but not yet
    in conntrack.
 7) later the client reuses the same source port while the connection
    is still present in the conntrack table.
 8) assuming tcp_be_liberal is also set, the packets can pass through
    conntrack and establish a new connection to haproxy.
 9) haproxy calls getsockopt(SO_ORIGINAL_DST) and gets the other end
    point since the connection was created at step 6 above in the
    other direction.

I could be wrong on certain specific points above but it looks plausible. 

> Initial report is here:
> https://discourse.haproxy.org/t/send-proxy-not-modifying-some-traffic-
> with-proxy-ip-port-details/3336

Ah cool, I didn't have all this, thank you!

> Let's see if disabling nf_conntrack_tcp_loose changes things.

Yes this really is the only thing I can think of, and in this case noone is wrong in this chain (neither kernel nor haproxy). We'll need to document it in my opinion.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [NETDEV]: getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP
  2019-01-23  8:07                     ` Mohandass, Roobesh
@ 2019-01-23  8:54                       ` Willy Tarreau
  0 siblings, 0 replies; 17+ messages in thread
From: Willy Tarreau @ 2019-01-23  8:54 UTC (permalink / raw)
  To: Mohandass, Roobesh; +Cc: Lukas Tribus, Florian Westphal, netdev

Hi Roobesh,

On Wed, Jan 23, 2019 at 08:07:48AM +0000, Mohandass, Roobesh wrote:
> Hi Willy/Florian/Lukas,
> 
>  disabling nf_conntrack_tcp_loose (solved the problem) and for last 24 hours,
>  we are not evening seeing a single wrong connection data by send-proxy.
> 
> Worth a note in your documentation related to this, as users might be aware
> off.
> 
> Thanks very much for your help and support, I will report back if I see
> anything again related to this.

Thank you for your feedback on this issue, and kudos to Florian
for spotting the root cause!

Willy

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-01-23  8:54 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <MWHPR16MB1502549F7BAAE102EF1D5DF4EDB50@MWHPR16MB1502.namprd16.prod.outlook.com>
     [not found] ` <MWHPR16MB150233EF69CFD4562BFA65ABEDB60@MWHPR16MB1502.namprd16.prod.outlook.com>
     [not found]   ` <MWHPR16MB1502DC55542EBA8121631B16EDB20@MWHPR16MB1502.namprd16.prod.outlook.com>
2018-12-31  6:20     ` : getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, sa, &salen) is in fact sometimes returning the source IP instead the destination IP Mohandass, Roobesh
2019-01-07 11:17       ` Florian Westphal
2019-01-07 14:57         ` Mohandass, Roobesh
2019-01-12 14:37         ` Lukas Tribus
2019-01-12 14:37           ` [NETDEV]: " Lukas Tribus
2019-01-12 16:04           ` : " Florian Westphal
2019-01-12 16:04             ` [NETDEV]: " Florian Westphal
2019-01-12 17:26             ` : " Willy Tarreau
2019-01-12 17:26               ` [NETDEV]: " Willy Tarreau
2019-01-12 18:01               ` : " Lukas Tribus
2019-01-12 18:01                 ` [NETDEV]: " Lukas Tribus
2019-01-12 18:33                 ` : " Willy Tarreau
2019-01-12 18:33                   ` [NETDEV]: " Willy Tarreau
2019-01-17  5:23                   ` Mohandass, Roobesh
2019-01-23  8:07                     ` Mohandass, Roobesh
2019-01-23  8:54                       ` Willy Tarreau
     [not found]     ` <MWHPR16MB1502A336CCB4EF3F0FCB69ECEDB20@MWHPR16MB1502.namprd16.prod.outlook.com>
2019-01-07  7:58       ` : " Mohandass, Roobesh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.