* SNMP mangling anybody?
@ 2017-11-29 16:27 FAIR, ED
  2017-11-30 11:02 ` Robert White
  0 siblings, 1 reply; 11+ messages in thread
From: FAIR, ED @ 2017-11-29 16:27 UTC (permalink / raw)
  To: netfilter

Hi,

Are there any members here successfully mangling SNMP requests/replies (udp 161)?  I'm trying to policy-route my outbound SNMP requests, but my efforts have been unsuccessful to date.  I'd like to hear how you do it.

I have two interfaces in play; I do not have routing turned on; bond0.1 is used for the default route (main table); I would like to policy-route just the locally-generated SNMP requests via bond0.2 towards a NAT device.  So I use:

	ip route add to unicast default table 7 via 192.168.168.7 dev bond0.2 src 192.168.168.3   #192.168.168.7 is a NAT server, 192.168.168.3 is the address assigned to bond0.2
	iptables -t mangle -A OUTPUT -p udp --dport 161 -j MARK --set-mark 256
	ip rule add priority 9999 type unicast fwmark 256 table 7
	ip route flush cache table 7
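
For reference, the resulting state can be sanity-checked with something like:

	ip rule show                        # the fwmark rule should appear at priority 9999
	ip route show table 7               # the default route with the bond0.2 src
	tcpdump -ni bond0.2 udp port 161    # confirm where the marked requests actually egress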

In the above configuration, the SNMP requests correctly egress via bond0.2 - the policy-routing is having some effect - but the requests retain the bond0.1 address in the IP SRC - the policy-routing doesn't update the IP SRC as I had hoped.  

For testing, I'm using net-snmp-utils "snmpget" command, with no "clientaddr" specified.

Thanks in Advance!


* Re: SNMP mangling anybody?
  2017-11-29 16:27 SNMP mangling anybody? FAIR, ED
@ 2017-11-30 11:02 ` Robert White
  2017-11-30 14:12   ` FAIR, ED
  2017-12-01 17:58   ` FAIR, ED
  0 siblings, 2 replies; 11+ messages in thread
From: Robert White @ 2017-11-30 11:02 UTC (permalink / raw)
  To: FAIR, ED, netfilter

You need to bind the socket to the specific interface using your
application.

For snmpget, for example, you can use the snmp.conf file or just add the
clientaddr option on the command line.

So, using:

snmpget --clientaddr=udp:192.168.168.3 (rest of stuff)

would bind the get to the local address, and thus the interface, you
specified. You seem to already know about this but want to avoid it for
some reason.
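
If you go the snmp.conf route instead, the equivalent would be a
clientaddr directive, something like (exact path varies by distribution):

# in /etc/snmp/snmp.conf or ~/.snmp/snmp.conf
clientaddr 192.168.168.3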

You are correct that policy routing does not rewrite packet addresses.
That's not what it's for. Rule based nat can do it, but that's not your
best option.

If you want to rewrite the address, then SNAT the packet (that's what
SNAT is for, ha ha ha).

But really, unless you have some reason not to, you should use the most
native tool (the clientaddr option or similar) instead of getting "tricky".

As Scotty famously said "the more you over-think the plumbing, the
easier it is to stop up the pipes."

You can use the mark to limit the snat in postrouting, presuming you're
getting the packets marked properly.


iptables --table nat --append POSTROUTING \
  --match mark --mark 256 \
  --jump SNAT --to-source 192.168.168.7

But seriously, clientaddr is your best option.

--Rob




* RE: SNMP mangling anybody?
  2017-11-30 11:02 ` Robert White
@ 2017-11-30 14:12   ` FAIR, ED
  2017-12-01 17:58   ` FAIR, ED
  1 sibling, 0 replies; 11+ messages in thread
From: FAIR, ED @ 2017-11-30 14:12 UTC (permalink / raw)
  To: Robert White, netfilter

Thank you for the reply Robert, most appreciated.

While testing with snmpget I monitor traffic at 1) the SNMP manager and 2) the NAT device with: "tcpdump -n -e arp or port 161".

Setting the SNMP clientaddr seems to have the desired effect on the IP SRC in the outbound SNMP request, and the policy-route has the desired effect of sending to the desired nexthop via the correct interface.  But it still doesn't work in the end:  tcpdump shows the SNMP request and the SNMP response exactly as expected at both 1 and 2 above, but the snmpget command shows a Timeout error (I use -t 5, which is a gracious plenty).  The SNMP response is not making it up the stack to the application layer.  I don't know where/why it's being dropped.  I inserted a gratuitous ACCEPT in the filter table INPUT chain, but the packet count remains zero, so I think it's dropped by the kernel before netfilter sees it.  (To further support this belief, if I change the SNMP manager to default route via bond0.2, everything works perfectly - no timeout.)
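
One way to confirm a kernel-level drop like this (a sketch, assuming the reverse-path filter or similar is what's discarding the reply) is to enable martian logging and watch the kernel log:

	echo 1 > /proc/sys/net/ipv4/conf/all/log_martians
	echo 1 > /proc/sys/net/ipv4/conf/bond0.2/log_martians
	dmesg | grep -i martian    # rp_filter drops show up as "martian source" entries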

>>> Rule based nat can do it, but that's not your best option. <<< 

I thought this was deprecated?

>>> If you want to rewrite the address, then SNAT the packet (that's what snat is for, ha ha ha). <<<

I have spent some time configuring and testing with SNAT too, and I will likely return to that after I understand what's happening in the much simpler case of just setting clientaddr as above.

>>> iptables --table nat --append POSTROUTING   --match mark --mark 256   --jump SNAT --to-source 192.168.168.7 <<<

192.168.168.7 is the NAT device; shouldn't that be the bond0.2 address of the SNMP manager (192.168.168.3)?




* RE: SNMP mangling anybody?
  2017-11-30 11:02 ` Robert White
  2017-11-30 14:12   ` FAIR, ED
@ 2017-12-01 17:58   ` FAIR, ED
  2017-12-02  2:03     ` Robert White
  1 sibling, 1 reply; 11+ messages in thread
From: FAIR, ED @ 2017-12-01 17:58 UTC (permalink / raw)
  To: Robert White, netfilter

Robert,

Some progress on this - continuing with the "clientaddr" approach only - no SNAT - changing /proc/sys/net/ipv4/conf/bond0.2/rp_filter=2 (was 1) stopped the blocking of the SNMP response.  See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/ip-sysctl.txt?id=bb077d600689dbf9305758efed1e16775db1c84c#n961
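
For the record, a minimal sketch of the change; per the ip-sysctl documentation the kernel uses the maximum of the "all" and per-interface values, so the per-interface loose setting takes effect even if "all" remains 1:

	cat /proc/sys/net/ipv4/conf/all/rp_filter          # check the global setting
	echo 2 > /proc/sys/net/ipv4/conf/bond0.2/rp_filter # 2 = loose mode, per RFC 3704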

I don't fully understand why rp_filter thinks a default route in table main is better than a default route in my table, but - clearly - it's not a netfilter thing.

I'll try using SNAT and will report final results back.

ed


* Re: SNMP mangling anybody?
  2017-12-01 17:58   ` FAIR, ED
@ 2017-12-02  2:03     ` Robert White
  2017-12-04 14:36       ` FAIR, ED
  0 siblings, 1 reply; 11+ messages in thread
From: Robert White @ 2017-12-02  2:03 UTC (permalink / raw)
  To: FAIR, ED, netfilter

On 12/01/2017 05:58 PM, FAIR, ED wrote:
> Robert,
> 
> Some progress on this - continuing with the "clientaddr" approach only - no SNAT - changing /proc/sys/net/ipv4/conf/bond0.2/rp_filter=2 (was 1) stopped the blocking of the SNMP response.  See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/ip-sysctl.txt?id=bb077d600689dbf9305758efed1e16775db1c84c#n961
> 
> I don't fully understand why rp thinks a default route in table main is better than a default route in my table, but - clearly - it's not a netfilter thing.

er...

What are your other interface addresses? (you never said.)

If you are choking on rp filter then there's probably a fundamental
error in your network layout. I mentally skipped over this as a
possibility in my previous reply.

For example if your bond0.1 and bond0.2 -- which I am assuming are vlans
-- are 192.168.168.2 and 192.168.168.3 respectively then the reverse
path filter will have all sorts of problems with things. So will
literally everything else that can see the pathing conflict.

A VLAN segment must be treated as a distinct segment as far as network
design goes.

In point of fact, if you've got the design right you shouldn't need to
do anything with rules or specialty routing or clientaddr or snat. The
"best" local address should be automatically assigned to the outgoing
packet because it would be the only valid choice.

The advanced routing rules pretty much don't matter for locally attached
segments because the network stack would collapse if it didn't
automatically handle such segments.

The real point of advanced routing is to direct things to use a specific
first hop when they are going more than one hop.

So at this point I have to assume you are making an "advanced" (in the
Invader Zim sense) error.

Such Errors Include:

Not flushing addresses from interfaces before bonding them.

Not flushing addresses before bridging them.

Trying to coerce use of a specific adapter that is part of a bond.

Assigning multiple physical and/or semantic adapters to the same
routable segment address.

"Clever" netmasks that cause two or more segments to overlap.

---

Once you get your network topology straightened out you'll probably find
that you don't need to do _anything_ besides assign addresses to your
adapters and use the desired target addresses in your query. You
shouldn't even need clientaddr.

> I'll try using SNAT and will report final results back.

Don't bother to try SNAT yet; you have bigger problems.


--Rob.

P.S. yes, rule based nat is deprecated, which is why it's not your
friend. 8-)


* RE: SNMP mangling anybody?
  2017-12-02  2:03     ` Robert White
@ 2017-12-04 14:36       ` FAIR, ED
  2017-12-11 18:40         ` Robert White
  0 siblings, 1 reply; 11+ messages in thread
From: FAIR, ED @ 2017-12-04 14:36 UTC (permalink / raw)
  To: Robert White, netfilter

Rob,

bond0.1 IP=10.36.22.77/24, bond0.2 IP=192.168.168.84/24.  See diagram and ip configs below.

I am prone to advanced errors, yes :) And I am puzzled by the rp_filter RFC3704 behavior.

I want all locally-generated traffic *except locally-generated SNMP* to route via the default route in table "main", egress bond0.1; I want *locally-generated SNMP* to route via the default route in table 7, egress bond0.2.

Perhaps I have ignorantly misconfigured something to bias rp_filter?  It seems that rp_filter does not treat the two different default routes in two different tables as equals.
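
For clarity, the whole policy on this host boils down to the following sketch (addresses taken from the configs below, plus the rp_filter change from my earlier mail):

	ip route add default via 192.168.168.93 dev bond0.2 src 192.168.168.84 table 7
	iptables -t mangle -A OUTPUT -p udp --dport 161 -j MARK --set-mark 256
	ip rule add priority 9999 fwmark 256 table 7
	echo 2 > /proc/sys/net/ipv4/conf/bond0.2/rp_filter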

The below renders best in a fixed font BTW.


----------8<-------------------

hostA-C and nat1 are all contained within a single chassis; RTR is not within the chassis.

   +-------------- INTERNAL network, VLAN 2, 192.168.0.0/24, bond0.2 on all hosts.  This VLAN is internal to the chassis and is not routable outside the chassis.
   |
   |
   |             +----- EXTERNAL network, VLAN 1, 10.0.0.0/8 network, bond0.1 on all hosts.  This VLAN has external 
   |             |
   |             |
   |             |
   V             V

   ~             ~
   |  +-------+  |
   |  |       |  |
   +--| hostA |--+
   |  |       |  |
   |  +-------+  |
   |             |
   |  +-------+  |
   |  |       |  |
   +--| hostB |--+   Notes: 
   |  |       |  |      
   |  +-------+  |      hostA-C are not forwarding/routing.
   |             |      other network interfaces omitted for clarity (SAN, DR, etc.).
   |  +-------+  |      
   |  |       |  |
   +--| hostC |--+
   |  |       |  |
   |  +-------+  |
   |             |             ~
   |      .      |   +------+  |  (((((((((((((()))))))))))))) 
   |      .      |   |      |  |  (                          )
   |      .      +---| RTR  |--+--( the rest of the network  )
   |             |   |      |  |  (                          )
   |  +-------+  |   +------+  |  (((((((((((((())))))))))))))
   |  |       |  |             ~
   +--| nat1  |--+
   |  |       |  |
   |  +-------+  |
   ~             ~



$ ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:1b:21:6a:fd:fd brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:1b:21:6a:fd:fc brd ff:ff:ff:ff:ff:ff
4: eth6: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:1b:21:6a:fd:ff brd ff:ff:ff:ff:ff:ff
5: eth7: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:1b:21:6a:fd:fe brd ff:ff:ff:ff:ff:ff
6: eth8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:1b:21:d7:2c:51 brd ff:ff:ff:ff:ff:ff
    inet 192.168.93.24/23 brd 192.168.93.255 scope global eth8
    inet6 fe80::21b:21ff:fed7:2c51/64 scope link
       valid_lft forever preferred_lft forever
7: eth9: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:1b:21:d7:2c:50 brd ff:ff:ff:ff:ff:ff
8: eth10: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:1b:21:d7:2c:53 brd ff:ff:ff:ff:ff:ff
9: eth11: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:1b:21:d7:2c:52 brd ff:ff:ff:ff:ff:ff
10: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:1b:21:d8:7b:fc brd ff:ff:ff:ff:ff:ff
11: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:1b:21:d8:7b:fc brd ff:ff:ff:ff:ff:ff
12: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:21:28:de:49:f2 brd ff:ff:ff:ff:ff:ff
13: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:21:28:de:49:f3 brd ff:ff:ff:ff:ff:ff
14: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:1b:21:d8:7b:fc brd ff:ff:ff:ff:ff:ff
    inet6 fe80::21b:21ff:fed8:7bfc/64 scope link
       valid_lft forever preferred_lft forever
15: bond0.1@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:1b:21:d8:7b:fc brd ff:ff:ff:ff:ff:ff
    inet 10.36.22.77/24 brd 10.36.22.255 scope global bond0.1
    inet6 fe80::21b:21ff:fed8:7bfc/64 scope link
       valid_lft forever preferred_lft forever
16: bond0.2@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:1b:21:d8:7b:fc brd ff:ff:ff:ff:ff:ff
    inet 192.168.168.84/24 brd 192.168.168.255 scope global bond0.2
    inet6 fe80::21b:21ff:fed8:7bfc/64 scope link
       valid_lft forever preferred_lft forever
$ ip route show
192.168.168.0/24 dev bond0.2  proto kernel  scope link  src 192.168.168.84
10.36.22.0/24 dev bond0.1  proto kernel  scope link  src 10.36.22.77
192.168.92.0/23 dev eth8  proto kernel  scope link  src 192.168.93.24
169.254.0.0/16 dev eth8  scope link  metric 1006
169.254.0.0/16 dev bond0  scope link  metric 1014
169.254.0.0/16 dev bond0.1  scope link  metric 1015
169.254.0.0/16 dev bond0.2  scope link  metric 1016
default via 10.36.22.1 dev bond0.1
$ ip route show table 7
default via 192.168.168.93 dev bond0.2  src 192.168.168.84
$



* Re: SNMP mangling anybody?
  2017-12-04 14:36       ` FAIR, ED
@ 2017-12-11 18:40         ` Robert White
  2017-12-11 21:07           ` FAIR, ED
  0 siblings, 1 reply; 11+ messages in thread
From: Robert White @ 2017-12-11 18:40 UTC (permalink / raw)
  To: FAIR, ED, netfilter

First, if you are using VLAN 2 as an _internal_ network, why are you
giving it a default route at all? Simply using the 192.168.0.X addresses
will put you on the right link. That link doesn't leave. "Default"
routes are only for finding a way to non-local segments, e.g. the
larger internet.

I'm confused as to how/why you have 192.168.0.0/24 as a descriptor when
you are variously using 192.168.168.84/24 and such. The /24 locks all
the bits of the first three octets as the network part, so 192.168.168.84
is _not_ accessible from 192.168.0.0/24 except via a router.

So it appears that you've drastically over-thought your addressing and
made a mess. Or you've made a typo.

I also have no understanding of your desire to _bond_ the interfaces.

So here's the deal...

This looks like some of the ATCA stuff I've worked with.

You seem to be bonding two interfaces (eth0 and eth1).

You talk about VLAN1 and VLAN2, so I am assuming you have some sort of
switch in the chassis that is letting one VLAN out into the world (e.g.
RTR) and keeping the other private.

So if you strip out all the nonsense and just put the addresses on
bond0.1 and bond0.2, then only generate traffic to 192.168.168.anything
addresses when you want to use bond0.2, everything is finished.

The main routing table entries that control access to the locally
attached segments are inviolate, because things don't work at all if the
user can break finding 192.168.168.X/24 via the wire directly attached to
the 192.168.168.Y/24 adapter.

On 12/04/2017 02:36 PM, FAIR, ED wrote:

So rp_filter is upchucking because (first line of the main table)
> 192.168.168.0/24 \
>   dev bond0.2 \
>   proto kernel \
>   scope link \
>   src 192.168.168.84

is just basic plumbing. It's unavoidable. And it has nothing to do with
the default routes at all.

So to say it in different words, "default" in "default route" is the
entry to use for "not otherwise listed" address ranges. But the
192.168.168.0/24 range is explicitly listed.

In the end you've outsmarted yourself with a few classic misunderstandings.

To achieve what you want:

(1) don't mess with the routing tables at all. The default route that is
added, with the via naming the address of RTR, plus all the local rules,
is all you need, and it only applies for larger internet-world destinations.

(2) you really _ought_ to use the clientaddr config, but you don't really
have to do that, as the src stanza of the above citation has you covered.

(3) you don't need policy when your problem set is already constrained
by addressing. So you don't need to do _anything_ with "ip route", "ip
rule", iptables, or nft to get what you want to happen.

When you use a 192.168.168.something target address it _will_ go out on
the correct vlan using the correct address, as that's what those
unavoidable routes _do_ for you. This will affect any application using
any address in that matching subnet.

If this _isn't_ working for you then go back to the switch config and
that line where you said vlan 2 was for the 192.168.0.0/24 range of
addresses. If your switch is actively filtering that vlan for that
address range then you need to reconcile that by changing your
addresses, changing the filter in the switch, or using some other vlan
number.



But in general, for locally attached things, and in the absence of bad
actors, just let the magic happen. 8-)


* RE: SNMP mangling anybody?
  2017-12-11 18:40         ` Robert White
@ 2017-12-11 21:07           ` FAIR, ED
  2017-12-12  8:44             ` André Paulsberg-Csibi (IBM Consultant)
  0 siblings, 1 reply; 11+ messages in thread
From: FAIR, ED @ 2017-12-11 21:07 UTC (permalink / raw)
  To: Robert White, netfilter

Robert, thanks again for your thoughtful reply.  I am only using the net-snmp command-line utilities as a lingua franca.  In reality, my SNMP application does not use the net-snmp libraries and thus cannot use clientaddr (and it has no similar functionality to net-snmp's clientaddr).

I'm still looking in the code to find where/why rp_filter is deciding to drop the SNMP replies by default.  Any insight how I might trace this?

I did make a typo in my ASCII drawing - in the line 3 comment - my profuse apologies.   I incorrectly showed the prefix as 192.168.0.0/24, I should've shown as 192.168.168.0 - corrected below.  I also updated the comment in line 6 to be more descriptive.

I am indeed using VLAN 2 as an internal network within the local chassis only; the idea is that one blade in the chassis (nat1) functions as a NAT translator; if any other blade in the chassis (hostA, B, or C... ) wants its outbound traffic translated it can use nat1 for its next hop.  And these two points are most important: 1) I don't want any outbound traffic from hostA, B, C translated by nat1 except their outbound SNMP requests; 2) I do not want any inbound traffic to hostA, B, C other than SNMP responses to traverse the NAT.

I'm bonding eth0 and eth1 because I require hardware redundancy on VLAN 1. It's not really germane to the conversation about netfilter, policy routing, or SNMP though, so let's not worry about that (unless you feel it necessary).

As you correctly conclude, my chassis includes not just one but two independent internal ethernet switches, switch0 and switch1; each switch has two VLANs configured: VLAN #1 and VLAN #2.  In each switch VLAN1 is connected to and routed "to the rest of the world" via RTR (which in reality is an HSRP pair); VLAN2 is not connected beyond the chassis and is not routed.  Each blade in the chassis has a single ethernet link to each internal switch (eth0 to switch0, eth1 to switch1).  And the switches are not really filtering anything (other than malformed frames).

>>> So if you strip out all the nonsense and just put the addresses on bond0.1 and bond0.2, then only generate traffic to 192.168.168.anything addresses when you want to use bond0.2, everything is finished. <<<

The SNMP agents I'm polling are all in 10.0.0.0/8.  I'm not seeing how I can avoid a second routing table, using ip rule, iptables, or nft, while steering just the outbound SNMP polling traffic to the rest of the world via the NAT blade.  It seems as clear a case for "policy-based routing" as one can draw:  "if locally-generated traffic is non-SNMP, use RTR for the outbound next-hop; if locally-generated traffic is SNMP, use nat1 for the outbound next-hop".

If I only configure the IP addresses as you suggest in hostA,B,C... with no mention of nat1's IP address, the kernel on the host will never select bond0.2 for outbound polls to 10.0.0.0/8.
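
(For completeness, since I mentioned nft: a rough, untested sketch of the equivalent mark rule in nftables would be something like:

	nft add table ip mangle
	nft add chain ip mangle output '{ type route hook output priority -150; }'
	nft add rule ip mangle output udp dport 161 meta mark set 256
)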

----------8<-------------------

hostA-C and nat1 are all contained within a single chassis; RTR is not within the chassis.

   +-------------- INTERNAL network, VLAN 2, 192.168.168.0/24, bond0.2 on all hosts.  This VLAN is internal to the chassis and is not routable outside the chassis.
   |
   |
   |             +----- EXTERNAL network, VLAN 1, 10.0.0.0/8 network, bond0.1 on all hosts.  This VLAN has external connectivity via RTR
   |             |
   |             |
   |             |
   V             V

   ~             ~
   |  +-------+  |
   |  |       |  |
   +--| hostA |--+
   |  |       |  |
   |  +-------+  |
   |             |
   |  +-------+  |
   |  |       |  |
   +--| hostB |--+   Notes: 
   |  |       |  |      
   |  +-------+  |      hostA-C are not forwarding/routing.
   |             |      other network interfaces omitted for clarity (SAN, DR, etc.).
   |  +-------+  |      
   |  |       |  |
   +--| hostC |--+
   |  |       |  |
   |  +-------+  |
   |             |             ~
   |      .      |   +------+  |  (((((((((((((()))))))))))))) 
   |      .      |   |      |  |  (                          )
   |      .      +---| RTR  |--+--( the rest of the network  )
   |             |   |      |  |  (                          )
   |  +-------+  |   +------+  |  (((((((((((((())))))))))))))
   |  |       |  |             ~
   +--| nat1  |--+
   |  |       |  |
   |  +-------+  |
   ~             ~


* RE: SNMP mangling anybody?
  2017-12-11 21:07           ` FAIR, ED
@ 2017-12-12  8:44             ` André Paulsberg-Csibi (IBM Consultant)
  2017-12-12 17:57               ` FAIR, ED
  0 siblings, 1 reply; 11+ messages in thread
From: André Paulsberg-Csibi (IBM Consultant) @ 2017-12-12  8:44 UTC (permalink / raw)
  To: netfilter; +Cc: 'FAIR, ED'

Just a few questions, as this makes me unsure as to why you want this setup.

1. What is the "BOX" nat1 - just a plain VM (virtual machine)?

2. Why is your internal chassis network not just a plain BRIDGE - is there any reason for using a VLAN?

And the most obvious question:
3. Why do you not want the SNMP request with the external source IP to leave the hosts on the internal network and then loop around via nat1?
     (if it is going out externally anyway, why not just let it go the normal shortest/fastest way)


Now to some things that I would suggest could cause the issue, though I am not sure since I do not have the complete picture.
Normally a host chooses the outgoing interface based on your local routes, so if "hostA" sends an SNMP request to a target on the inside,
it uses the IP on that same interface and sends to whatever "router" you have in the routing table on that same interface.

Now for your scenario, as I understand it (and there are some assumptions here, I apologize) - see the sketch below the steps:

STEP 1: Your client software gets a request you made, with a destination IP it will send outside your own 2 networks.
STEP 2: Since a packet needs to be generated from this part of the client software, it "somehow" determines what interface it should use based on the routing of table "main"
               (as it is not aware of the iptables MANGLE rules, it cannot choose otherwise).
STEP 3: The packet is created and sent through the IP stack and reaches the MANGLE table, already using the SOURCE IP of the outgoing interface the client chose based on table "main".
STEP 4: The packet is marked so that later on the routing will change for this packet, but this does not MANGLE the SOURCE IP.
STEP 5: The packet is now routed out the opposite interface with a spoofed SOURCE IP, not matching the subnet it is coming from; even if routing allows for the destination, with RP_FILTER your SOURCE IP is only allowed on the originating interface (which is the external side of your host[ABC]).
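
A quick way to see this selection on host[ABC] is "ip route get" (a sketch; 10.1.2.3 stands in for any external target):

	ip route get 10.1.2.3              # route and source chosen from table "main", no mark
	ip route get 10.1.2.3 mark 256     # route and source chosen once the fwmark applies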


IF my understanding of host[ABC] routing is correct, this is the "better" way of doing things (again, apologies if I have made some incorrect understanding and/or assumptions).

If you have 2 interfaces, and you have ROUTING disabled, you should be able to do 2 things (huge assumption, not tested):
1. You should be able to have 2 default routes, one in each interface subnet
2. Your host[ABC] should route packets based on the origin interface

IF my assumption is incorrect, it would require 2 routing instances - but then the effect should still be the same as above, if my understanding is correct.


Best regards
André Paulsberg-Csibi
Senior Network Engineer 
IBM Services AS


* RE: SNMP mangling anybody?
  2017-12-12  8:44             ` André Paulsberg-Csibi (IBM Consultant)
@ 2017-12-12 17:57               ` FAIR, ED
  2017-12-13 11:50                 ` André Paulsberg-Csibi (IBM Consultant)
  0 siblings, 1 reply; 11+ messages in thread
From: FAIR, ED @ 2017-12-12 17:57 UTC (permalink / raw)
  To: André Paulsberg-Csibi (IBM Consultant), netfilter

>>> 1. What is the "BOX" nat1 - just a plain VM (virtual machine)? <<<

Just a plain Linux machine, 2.6.32 or later kernel, same OS as host[ABC], but with ip_forward=1 and with iptables entries for NAT:

	iptables -t nat -A POSTROUTING -o bond0.1 -j MASQUERADE
	iptables -A FORWARD -i bond0.1 -o bond0.2 -m state --state RELATED,ESTABLISHED -j ACCEPT
	iptables -A FORWARD -i bond0.2 -o bond0.1 -j ACCEPT
	# plus some filters to drop non-SNMP traffic
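
(For illustration only - these are not from the actual box - such filters might take a shape like this, in place of the broad ACCEPT above:

	iptables -A FORWARD -i bond0.2 -o bond0.1 -p udp --dport 161 -j ACCEPT   # SNMP queries only
	iptables -A FORWARD -i bond0.2 -o bond0.1 -j DROP                        # everything else
)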

>>> 2. Why is your internal chassis network not just a plain BRIDGE - is there any reason for using a VLAN? <<<

It is "plain bridge" with 802.1q VLAN.  I have never configured NAT translation using bridged interfaces, always routed.  Is this even possible?

>>> 3. Why do you not want the SNMP request with the external source IP to leave the hosts on the internal network and then loop around via nat1?     (if it is going out externally anyway, why not just let it go the normal shortest/fastest way) <<<

If I understand your question, there are several advantages:

- To reduce the number of entries in SNMP ACLs in the "rest of the network".  In this example, reducing ACL entries from 3 to 1.
- To allow insertion of additional SNMP managers without editing SNMP ACLs in the "rest of the network".  In this example, adding two new managers host[DE] is transparent to the "rest of the network".
- To allow relocation of SNMP managers without updating SNMP ACLs in the "rest of the network".  Example: failover of all SNMP managers (host[ABC]) from one city to another due to a disaster.

In the case of just three managers (host[ABC]) the advantages are not so great, but in the case of, say, 26 managers (host[A-Z]) the advantage becomes significant.  In reality, the scale will be 5-20 managers per NAT, perhaps greater if the conntrack performance is acceptable.




* RE: SNMP mangling anybody?
  2017-12-12 17:57               ` FAIR, ED
@ 2017-12-13 11:50                 ` André Paulsberg-Csibi (IBM Consultant)
  0 siblings, 0 replies; 11+ messages in thread
From: André Paulsberg-Csibi (IBM Consultant) @ 2017-12-13 11:50 UTC (permalink / raw)
  To: 'FAIR, ED', netfilter

> If I understand your question, there are several advantages:
>
> - To reduce the number of entries in SNMP ACLs in the "rest of the network".  In this example, reducing ACL entries from 3 to 1.
> - To allow insertion of additional SNMP managers without editing SNMP ACLs in the "rest of the network".  In this example, adding two new managers host[DE] is transparent to the "rest of the network".
> - To allow relocation of SNMP managers without updating SNMP ACLs in the "rest of the network".  Example: failover of all SNMP managers (host[ABC]) from one city to another due to a disaster.

What you describe here suggests that what you actually want is ANYCAST (in reverse), having all host[A-Z] use the same VIP.
https://en.wikipedia.org/wiki/Anycast

An ANYCAST setup as described would allow SNMP traps to be sent to ONLY one IP from wherever you want and end up at the closest available host[A-Z],
and all SNMP requests from any of the host[A-Z] would then use the same IP as their source when sending out (this takes additional setup for ANYCAST to allow reverse traffic).

However, a basic implementation of this setup would normally require some routing protocols (I do not know if this is a showstopper).
IF not, I would suggest doing NAT on the RTR unit, or moving nat1 to be either a parallel router with RTR or a "router on a stick", simplifying the host[A-Z] setup.
(Again, not knowing your setup, I just suggest what would make for a simpler and more basic design *with less plumbing*.)

SIDENOTE: about your "bridge" setup - I have just never seen that with "bond*", as that would typically refer to a set of physical interfaces.
A bridge would typically be linked to either a BOND or a VLAN on that said BRIDGE; not that this means your setup is wrong (just that I do not understand it).

What we typically see on all systems set up by "us" or delivered to "us" is one of the following 2 variations:
the CHASSIS has its own HOST OS; this OS has PHYSICAL interfaces in a BOND not visible to the underlying VMs.
If the chassis has 2 switches with 2 ports for each SERVERBLADE (4 ports in total), they might be set up as 2 BONDS.
The ONE or TWO bonds appear as ONE or TWO SINGLE interfaces inside the VM (and can be bonded again if you want/need),
and this comes as either a trunk or an access port (as mentioned, with the VLAN connected as yet another slave on the bridge).

However, in your drawing the external interface is the only "bridge" having data going out of the chassis, where a BOND could make sense.
So it is confusing why there is a bond also on the internal interface, on the same bond no less.
Normally there would be an internal bridge on the CHASSIS HOST OS, which would be a separate bridge with no access to physical interfaces
(unless you planned for it to go to another CHASSIS).

Again, your design might be part of a "greater plan" and I do not have the full picture 😊


Best regards
André Paulsberg-Csibi
Senior Network Engineer 
IBM Services AS


end of thread

Thread overview: 11+ messages
2017-11-29 16:27 SNMP mangling anybody? FAIR, ED
2017-11-30 11:02 ` Robert White
2017-11-30 14:12   ` FAIR, ED
2017-12-01 17:58   ` FAIR, ED
2017-12-02  2:03     ` Robert White
2017-12-04 14:36       ` FAIR, ED
2017-12-11 18:40         ` Robert White
2017-12-11 21:07           ` FAIR, ED
2017-12-12  8:44             ` André Paulsberg-Csibi (IBM Consultant)
2017-12-12 17:57               ` FAIR, ED
2017-12-13 11:50                 ` André Paulsberg-Csibi (IBM Consultant)
