netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Routing BUG with ppp over l2tp
@ 2014-10-20 16:39 Alan Stern
  2014-10-20 17:19 ` James Carlson
  0 siblings, 1 reply; 5+ messages in thread
From: Alan Stern @ 2014-10-20 16:39 UTC (permalink / raw)
  To: James Chapman, Michal Ostrowski; +Cc: linux-ppp, netdev

James and Michal:

I'm having problem setting up a VPN connection that uses ppp over l2tp
over ipsec (this shows up under both 3.16 and 3.17-rc7).  The ipsec
part is working fine, and xl2tpd sets up its connection okay.  The
problem arises when ppp starts up.

As far as I can tell, the problem is caused by bad routing.  The kernel
gets confused because the IP address assigned by the VPN server to the
server's end of the ppp tunnel is the _same_ as the server's actual IP
address.

Here are some details.  My local address is 192.168.0.203 (behind a 
NAT-ing wireless router).  The VPN server is 140.247.233.37, as you can 
see from this entry in the system log:

Oct 13 17:10:27 saphir NetworkManager: xl2tpd[2616]: Connecting to host 140.247.233.37, port 1701

The addresses of the ppp tunnel endpoints are given later in the log:

Oct 13 17:10:30 saphir pppd[2618]: local  IP address 10.170.30.1
Oct 13 17:10:30 saphir pppd[2618]: remote IP address 140.247.233.37

The overall status from NetworkManager shows up in the log like this:

Oct 13 17:10:30 saphir NetworkManager: ** Message: L2TP service (IP Config Get) reply received.
Oct 13 17:10:30 saphir NetworkManager[439]: <info> VPN connection 'Rowland' (IP4 Config Get) reply received from old-style plugin.
Oct 13 17:10:30 saphir NetworkManager[439]: <info> VPN Gateway: 140.247.233.37
Oct 13 17:10:30 saphir NetworkManager[439]: <info> Tunnel Device: ppp0
Oct 13 17:10:30 saphir NetworkManager[439]: <info> IPv4 configuration:
Oct 13 17:10:30 saphir NetworkManager[439]: <info>   Internal Address: 10.170.30.1
Oct 13 17:10:30 saphir NetworkManager[439]: <info>   Internal Prefix: 32
Oct 13 17:10:30 saphir NetworkManager[439]: <info>   Internal Point-to-Point Address: 140.247.233.37
Oct 13 17:10:30 saphir NetworkManager[439]: <info>   Maximum Segment Size (MSS): 0
Oct 13 17:10:30 saphir NetworkManager[439]: <info>   Forbid Default Route: no
Oct 13 17:10:30 saphir NetworkManager[439]: <info>   Internal DNS: 10.160.0.2
Oct 13 17:10:30 saphir NetworkManager[439]: <info>   Internal DNS: 10.160.0.3
Oct 13 17:10:30 saphir NetworkManager[439]: <info>   DNS Domain: '(none)'
Oct 13 17:10:30 saphir NetworkManager[439]: <info> No IPv6 configuration
Oct 13 17:10:30 saphir NetworkManager[439]: <info> VPN connection 'Rowland' (IP Config Get) complete.

Once the ppp tunnel was set up, xl2tpd started getting errors:

Oct 13 17:11:31 saphir NetworkManager: xl2tpd[2616]: network_thread: select timeout
Oct 13 17:11:32 saphir NetworkManager: xl2tpd[2616]: network_thread: select timeout
Oct 13 17:11:32 saphir NetworkManager: xl2tpd[2616]: Maximum retries exceeded for tunnel 33716.  Closing.
Oct 13 17:11:32 saphir NetworkManager: xl2tpd[2616]: Connection 147 closed to 140.247.233.37, port 1701 (Timeout)

Packet-level debugging showed that once I reached this stage, the
control messages sent by xl2tpd were not received by the server.  I 
believe this is because they were not routed correctly.

Unfortunately, at the moment I don't have a copy of the routing table.  
Nevertheless, it definitely appears that the packets xl2tpd wanted to
send directly to the VPN server were instead routed back through the
ppp tunnel!  Presumably this was because the routing table contained
two entries with their destinations both set to 140.247.233.37/32 (one
for the l2tp connection and one for the ppp tunnel), and the kernel
used the wrong entry.

In fact, on several occasions during testing, the system deadlocked.  I 
was able to get a stack dump:

[ 2214.970639] BUG: soft lockup - CPU#1 stuck for 22s! [pppd:9423]
[ 2214.970648] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox ppp_generic slhc authenc cmac rmd160 crypto_null ip_vti ip_tunnel af_key ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm6_tunnel tunnel6 xfrm_ipcomp salsa20_i586 camellia_generic cast6_generic cast5_generic cast_common deflate cts gcm ccm serpent_sse2_i586 serpent_generic glue_helper blowfish_generic blowfish_common twofish_generic twofish_i586 twofish_common xcbc sha512_generic des_generic geode_aes tpm_rng tpm timeriomem_rng virtio_rng uas usb_storage fuse ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 ip6table_filter xt_conntrack ip6_tables nf_con
 ntrack vfat
[ 2214.970769]  fat snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel arc4 iwldvm snd_hda_controller snd_hda_codec uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev mac80211 snd_hwdep coretemp kvm_intel kvm media snd_seq snd_seq_device iTCO_wdt iTCO_vendor_support snd_pcm snd_timer snd joydev iwlwifi microcode serio_raw cfg80211 asus_laptop lpc_ich atl1c soundcore sparse_keymap rfkill input_polldev acpi_cpufreq binfmt_misc i915 i2c_algo_bit drm_kms_helper drm i2c_core video
[ 2214.970854] CPU: 1 PID: 9423 Comm: pppd Tainted: G        W     3.16.3-200.fc20.i686 #1
[ 2214.970860] Hardware name: ASUSTeK Computer Inc.         UL20A               /UL20A     , BIOS 207     11/02/2009
[ 2214.970866] task: f0706a00 ti: e359c000 task.ti: e359c000
[ 2214.970873] EIP: 0060:[<c0a077b8>] EFLAGS: 00200287 CPU: 1
[ 2214.970885] EIP is at _raw_spin_lock_bh+0x28/0x40
[ 2214.970890] EAX: e5ff02a4 EBX: e5ff02a4 ECX: 00000060 EDX: 0000005f
[ 2214.970895] ESI: e5ff02b0 EDI: e3470d40 EBP: e359dc34 ESP: e359dc34
[ 2214.970900]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 2214.970906] CR0: 8005003b CR2: b72eb000 CR3: 25f28000 CR4: 000407d0
[ 2214.970910] Stack:
[ 2214.970914]  e359dc9c f94efe2a f5140000 e359dc50 c045aba6 f5140000 00200286 e359dc78
[ 2214.970929]  c045c553 00000001 f5140000 001f9076 00200286 628a17c6 f075acc0 f075acc0
[ 2214.970942]  00000000 00200246 f4464848 00200246 00200246 e359dc9c e3470d84 f075acc0
[ 2214.970957] Call Trace:
[ 2214.970973]  [<f94efe2a>] ppp_push+0x32a/0x550 [ppp_generic]
[ 2214.970986]  [<c045aba6>] ? internal_add_timer+0x26/0x60
[ 2214.970994]  [<c045c553>] ? mod_timer_pending+0x63/0x130
[ 2214.971005]  [<f94f288d>] ppp_xmit_process+0x3cd/0x5e0 [ppp_generic]
[ 2214.971007]  [<c0914ae1>] ? harmonize_features+0x31/0x1d0
[ 2214.971007]  [<f94f2c78>] ppp_start_xmit+0x108/0x180 [ppp_generic]
[ 2214.971007]  [<c0915024>] dev_hard_start_xmit+0x2c4/0x540
[ 2214.971007]  [<c093244f>] sch_direct_xmit+0x9f/0x170
[ 2214.971007]  [<c091546a>] __dev_queue_xmit+0x1ca/0x430
[ 2214.971007]  [<c094c9b0>] ? ip_fragment+0x930/0x930
[ 2214.971007]  [<c09156df>] dev_queue_xmit+0xf/0x20
[ 2214.971007]  [<c091bacf>] neigh_direct_output+0xf/0x20
[ 2214.971007]  [<c094cb5a>] ip_finish_output+0x1aa/0x850
[ 2214.971007]  [<c094c9b0>] ? ip_fragment+0x930/0x930
[ 2214.971007]  [<c094dbbf>] ip_output+0x8f/0xe0
[ 2214.971007]  [<c094c9b0>] ? ip_fragment+0x930/0x930
[ 2214.971007]  [<c09a4f52>] xfrm_output_resume+0x342/0x3a0
[ 2214.971007]  [<c09a5013>] xfrm_output+0x43/0xf0
[ 2214.971007]  [<c0998f4d>] xfrm4_output_finish+0x3d/0x40
[ 2214.971007]  [<c0998e25>] __xfrm4_output+0x25/0x40
[ 2214.971007]  [<c0998f7f>] xfrm4_output+0x2f/0x70
[ 2214.971007]  [<c0998e00>] ? xfrm4_udp_encap_rcv+0x1b0/0x1b0
[ 2214.971007]  [<c094d2e7>] ip_local_out_sk+0x27/0x30
[ 2214.971007]  [<c094d5f4>] ip_queue_xmit+0x124/0x3f0
[ 2214.971007]  [<c0999f04>] ? xfrm_bundle_ok+0x64/0x170
[ 2214.971007]  [<c099a0ab>] ? xfrm_dst_check+0x1b/0x30
[ 2214.971007]  [<f94fd618>] l2tp_xmit_skb+0x298/0x4b0 [l2tp_core]
[ 2214.971007]  [<f950cd04>] pppol2tp_xmit+0x124/0x1d0 [l2tp_ppp]
[ 2214.971007]  [<f94f2adb>] ppp_channel_push+0x3b/0xb0 [ppp_generic]
[ 2214.971007]  [<f94f2d77>] ppp_write+0x87/0xc8 [ppp_generic]
[ 2214.971007]  [<f94f2cf0>] ? ppp_start_xmit+0x180/0x180 [ppp_generic]
[ 2214.971007]  [<c057723d>] vfs_write+0x9d/0x1d0
[ 2214.971007]  [<c0577951>] SyS_write+0x51/0xb0
[ 2214.971007]  [<c0a07b9f>] sysenter_do_call+0x12/0x12
[ 2214.971007] Code: 00 00 00 55 89 e5 66 66 66 66 90 64 81 05 90 b6 dc c0 00 02 00 00 ba 00 01 00 00 f0 66 0f c1 10 0f b6 ce 38 d1 75 04 5d c3 f3 90 <0f> b6 10 38 ca 75 f7 5d c3 90 90 90 90 90 90 90 90 90 90 90 90

The deadlock occurs because ppp_channel_push() (near the end of the
stack listing) holds the pch->downl spinlock while calling
pch->chan->ops->start_xmit().  The dump shows this call filtering down
through the routing layer and into ppp_push() (near the top of the
listing), which tries to acquire the same spinlock.

It sure looks like a ppp data packet was put into an l2tp wrapper 
and then sent back to the ppp layer for transmission, rather than 
getting sent out through the wlan0 interface.

Unfortunately, I can't work around this problem by reconfiguring the
VPN server -- there's no way to tell it to use a different IP address
for its end of the VPN tunnel.  Furthermore, the server works just fine
with clients running Windows or OS-X.

So it looks like the problem has to be fixed either in the kernel or in 
the way pppd sets up its routing entry.  Can you guys help?

Thanks,

Alan Stern


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Routing BUG with ppp over l2tp
  2014-10-20 16:39 Routing BUG with ppp over l2tp Alan Stern
@ 2014-10-20 17:19 ` James Carlson
  2014-10-20 19:45   ` Alan Stern
  0 siblings, 1 reply; 5+ messages in thread
From: James Carlson @ 2014-10-20 17:19 UTC (permalink / raw)
  To: Alan Stern, James Chapman, Michal Ostrowski; +Cc: linux-ppp, netdev

On 10/20/14 12:39, Alan Stern wrote:
> As far as I can tell, the problem is caused by bad routing.  The kernel
> gets confused because the IP address assigned by the VPN server to the
> server's end of the ppp tunnel is the _same_ as the server's actual IP
> address.

Indeed!  That's pretty darned lame behavior by that peer.  It would
probably be workable if you had a virtual router instance and were able
to put the L2TP connection in one routing instance and the PPP
connection in another routing instance, but that's likely not at all
simple to achieve.

> Unfortunately, I can't work around this problem by reconfiguring the
> VPN server -- there's no way to tell it to use a different IP address
> for its end of the VPN tunnel.  Furthermore, the server works just fine
> with clients running Windows or OS-X.

Really?  That seems ... improbable.

> So it looks like the problem has to be fixed either in the kernel or in 
> the way pppd sets up its routing entry.  Can you guys help?

I think the easiest solution is to configure pppd to lie to the kernel
about the remote address.  Who cares what the remote address is on a
point-to-point link anyway?

There's currently no option to do this, but the code change in ipcp_up()
in pppd/ipcp.c would be rather simple.  Just make the "noremoteip" code
run all the time:

/* Deliberately falsify the remote address.  We don't care. */
ho->hisaddr = htonl(0x0aa00002);

As long as you don't need to contact that specific remote server using
the badly-assigned "internal" VPN address and can live with the fact
that you'll either go through the regular Internet to that address or be
forced to use some other address configured on that server, you should
be good.

(The address I used above is 10.160.0.2.  That was one of the internal
DNS server addresses provided in the log you posted.  It's not necessary
that the address used here is exactly that, but it may well be helpful.)

If you can't do that for some reason, then I suppose it would be
possible to use IP Chains (or whatever the packet-modification tool du
jure is used in your Linux distribution) to nail up an exception so that
the outside packets go to the outside interface and the inside ones go
to the PPP interface.  Doing that likely requires selecting on (at
least!) source address, so it's messy and ugly and possibly error-prone,
but it might be doable.

Otherwise, contact the maintainer of that VPN server.  It's just plain
old broken, and life's too short for broken software.

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Routing BUG with ppp over l2tp
  2014-10-20 17:19 ` James Carlson
@ 2014-10-20 19:45   ` Alan Stern
  2014-10-20 20:22     ` James Carlson
  0 siblings, 1 reply; 5+ messages in thread
From: Alan Stern @ 2014-10-20 19:45 UTC (permalink / raw)
  To: James Carlson; +Cc: James Chapman, linux-ppp, netdev

On Mon, 20 Oct 2014, James Carlson wrote:

> On 10/20/14 12:39, Alan Stern wrote:
> > As far as I can tell, the problem is caused by bad routing.  The kernel
> > gets confused because the IP address assigned by the VPN server to the
> > server's end of the ppp tunnel is the _same_ as the server's actual IP
> > address.
> 
> Indeed!  That's pretty darned lame behavior by that peer.  It would
> probably be workable if you had a virtual router instance and were able
> to put the L2TP connection in one routing instance and the PPP
> connection in another routing instance, but that's likely not at all
> simple to achieve.

I'd like to find the simplest solution.  Ideally it should "just work", 
like the Windows and OS-X clients do.

> > Unfortunately, I can't work around this problem by reconfiguring the
> > VPN server -- there's no way to tell it to use a different IP address
> > for its end of the VPN tunnel.  Furthermore, the server works just fine
> > with clients running Windows or OS-X.
> 
> Really?  That seems ... improbable.

I guess that depends on how you judge probabilities.  :-)

As evidence to convince you, here's a log of a session on a rather old
Mac Powerbook G4 running OS 10.4.11.  The situation isn't exactly the
same as with my Linux system, because for this test the client and the
VPN server are on the same subnet -- I don't think that should make any
difference.  The client's IP address is 140.247.233.41, the server's is
.37, and the router to the outside world is .33.  The client's ppp IP
address (assigned by the server) is 10.170.30.1.

The following commands were carried out while the VPN was connected:


------------------------------------------------------------------------
michael-burns-powerbook-g4:~ stern$ netstat -rn -f inet
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            140.247.233.37     UGSc        2        4   ppp0
10                 ppp0               USc         1        0   ppp0
127                127.0.0.1          UCS         0        0    lo0
127.0.0.1          127.0.0.1          UH         12     2278    lo0
140.247.233.32/27  link#4             UCS         2        0    en0
140.247.233.33     0:8:e3:ff:fc:b8    UHLW        0        0    en0   1198
140.247.233.37     0:1e:f7:15:53:a8   UHLW        3       10    en0   1153
140.247.233.37/32  link#4             UCS         1        0    en0
140.247.233.41     127.0.0.1          UHS         0        0    lo0
169.254            link#4             UCS         0        0    en0

michael-burns-powerbook-g4:~ stern$ ping -c1 -n 10.160.0.2
PING 10.160.0.2 (10.160.0.2): 56 data bytes
64 bytes from 10.160.0.2: icmp_seq=0 ttl=64 time=1.368 ms

--- 10.160.0.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 1.368/1.368/1.368/nan ms

michael-burns-powerbook-g4:~ stern$ ifconfig en0
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet 140.247.233.41 netmask 0xffffffe0 broadcast 140.247.233.63
        ether 00:03:93:12:da:48 
        media: autoselect (100baseTX <full-duplex>) status: active
        supported media: none autoselect 10baseT/UTP <half-duplex> 10baseT/UTP <full-duplex> 10baseT/UTP <full-duplex,hw-loopback> 100baseTX <half-duplex> 100baseTX <full-duplex> 100baseTX <full-duplex,hw-loopback>

michael-burns-powerbook-g4:~ stern$ ifconfig ppp0
ppp0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
        inet 10.170.30.1 --> 140.247.233.37 netmask 0xff000000 
------------------------------------------------------------------------


I don't understand (and can't be bothered to look up) those arcane
symbols in the netstat output.  The IP address used for the ping test
(10.160.0.2) is a system on the VPN's private network.

Here's comparable output for a connection from a computer running 
Windows 7 (same IP addresses as before):


------------------------------------------------------------------------
C:\Users\stern>netstat -rn
===========================================================================
Interface List
 20...........................Rowland VPN
 10...00 1a 6b 57 30 02 ......Intel(R) 82566DM Gigabit Network Connection
  1...........................Software Loopback Interface 1
 11...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter
 12...00 00 00 00 00 00 00 e0 Teredo Tunneling Pseudo-Interface
 19...00 00 00 00 00 00 00 e0 Microsoft 6to4 Adapter
 21...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter #2
===========================================================================

IPv4 Route Table
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
          0.0.0.0          0.0.0.0   140.247.233.33   140.247.233.41   4491
          0.0.0.0          0.0.0.0         On-link       10.170.30.1     11
      10.170.30.1  255.255.255.255         On-link       10.170.30.1    266
        127.0.0.0        255.0.0.0         On-link         127.0.0.1   4531
        127.0.0.1  255.255.255.255         On-link         127.0.0.1   4531
  127.255.255.255  255.255.255.255         On-link         127.0.0.1   4531
   140.247.233.32  255.255.255.224         On-link    140.247.233.41   4491
   140.247.233.37  255.255.255.255         On-link    140.247.233.41   4236
   140.247.233.41  255.255.255.255         On-link    140.247.233.41   4491
   140.247.233.63  255.255.255.255         On-link    140.247.233.41   4491
        224.0.0.0        240.0.0.0         On-link         127.0.0.1   4531
        224.0.0.0        240.0.0.0         On-link    140.247.233.41   4492
        224.0.0.0        240.0.0.0         On-link       10.170.30.1     11
  255.255.255.255  255.255.255.255         On-link         127.0.0.1   4531
  255.255.255.255  255.255.255.255         On-link    140.247.233.41   4491
  255.255.255.255  255.255.255.255         On-link       10.170.30.1    266
===========================================================================
Persistent Routes:
  Network Address          Netmask  Gateway Address  Metric
          0.0.0.0          0.0.0.0   140.247.233.33  Default
===========================================================================

C:\Users\stern>ping -n 1 10.160.0.2

Pinging 10.160.0.2 with 32 bytes of data:
Reply from 10.160.0.2: bytes=32 time<1ms TTL=64

Ping statistics for 10.160.0.2:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms

C:\Users\stern>ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : Windows-test
   Primary Dns Suffix  . . . . . . . : rowland.org
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : rowland.org

PPP adapter Rowland VPN:

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Rowland VPN
   Physical Address. . . . . . . . . :
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   IPv4 Address. . . . . . . . . . . : 10.170.30.1(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.255
   Default Gateway . . . . . . . . . : 0.0.0.0
   DNS Servers . . . . . . . . . . . : 10.160.0.2
                                       10.160.0.3
   Primary WINS Server . . . . . . . : 10.160.0.2
   NetBIOS over Tcpip. . . . . . . . : Disabled

Ethernet adapter Local Area Connection:

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Intel(R) 82566DM Gigabit Network Connection
   Physical Address. . . . . . . . . : 00-1A-6B-57-30-02
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::1426:1891:bf83:3982%10(Preferred)
   IPv4 Address. . . . . . . . . . . : 140.247.233.41(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.224
   Default Gateway . . . . . . . . . : 140.247.233.33
   DHCPv6 IAID . . . . . . . . . . . : 234887787
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-1B-A2-3E-B1-00-1A-6B-57-30-02

   DNS Servers . . . . . . . . . . . : 8.8.8.8
   NetBIOS over Tcpip. . . . . . . . : Enabled
------------------------------------------------------------------------


Although the Windows ipconfig output doesn't show the IP address of the
server side of the ppp tunnel, it does show up in a Details window
under the Network control panel, and it is indeed set to
140.247.233.37.

> > So it looks like the problem has to be fixed either in the kernel or in 
> > the way pppd sets up its routing entry.  Can you guys help?
> 
> I think the easiest solution is to configure pppd to lie to the kernel
> about the remote address.  Who cares what the remote address is on a
> point-to-point link anyway?
> 
> There's currently no option to do this, but the code change in ipcp_up()
> in pppd/ipcp.c would be rather simple.  Just make the "noremoteip" code
> run all the time:
> 
> /* Deliberately falsify the remote address.  We don't care. */
> ho->hisaddr = htonl(0x0aa00002);
> 
> As long as you don't need to contact that specific remote server using
> the badly-assigned "internal" VPN address and can live with the fact
> that you'll either go through the regular Internet to that address or be
> forced to use some other address configured on that server, you should
> be good.
> 
> (The address I used above is 10.160.0.2.  That was one of the internal
> DNS server addresses provided in the log you posted.  It's not necessary
> that the address used here is exactly that, but it may well be helpful.)

That might work.  But using a nonstandard version of pppd would be
awkward, and I would prefer to avoid it.

> If you can't do that for some reason, then I suppose it would be
> possible to use IP Chains (or whatever the packet-modification tool du
> jure is used in your Linux distribution) to nail up an exception so that
> the outside packets go to the outside interface and the inside ones go
> to the PPP interface.  Doing that likely requires selecting on (at
> least!) source address, so it's messy and ugly and possibly error-prone,
> but it might be doable.

That sounds like a fairly easy thing to try.  But it would still 
require manual intervention instead of just working.  Fixing the kernel 
would be preferable, IMO.

> Otherwise, contact the maintainer of that VPN server.  It's just plain
> old broken, and life's too short for broken software.

It is an old Cisco security appliance, no doubt well past End-Of-Life.  
I'm starting to think it might be preferable to throw the thing away 
and start up a VPN server on the department's firewall (which is a 
Linux box) instead.

Alan Stern


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Routing BUG with ppp over l2tp
  2014-10-20 19:45   ` Alan Stern
@ 2014-10-20 20:22     ` James Carlson
  2014-10-21 14:15       ` Alan Stern
  0 siblings, 1 reply; 5+ messages in thread
From: James Carlson @ 2014-10-20 20:22 UTC (permalink / raw)
  To: Alan Stern; +Cc: James Chapman, linux-ppp, netdev

On 10/20/14 15:45, Alan Stern wrote:
> On Mon, 20 Oct 2014, James Carlson wrote:
>> Indeed!  That's pretty darned lame behavior by that peer.  It would
>> probably be workable if you had a virtual router instance and were able
>> to put the L2TP connection in one routing instance and the PPP
>> connection in another routing instance, but that's likely not at all
>> simple to achieve.
> 
> I'd like to find the simplest solution.  Ideally it should "just work", 
> like the Windows and OS-X clients do.

I'm not an expert on Windows networking internals.  I assume OS X is BSD
+ whatever the folks in Cupertino have done to it.  :-/

At a guess, it's living on the edge.  It works because the L2TP
connection establishment caches a pointer to the output forwarding table
entry ("route") and just keeps living with it no matter what actually
happens down the line.

On Linux (and likely many other systems), the output computation is a
bit more dynamic, and the establishment of a direct point-to-point link
to a given IP address (as the PPP link represents) causes existing
cached pointers to get flushed away.  Future packets to that destination
(IP _always_ forwards based on destination, not source) go down the most
direct path.  Point-to-point is as direct as you can get.

It may be possible to modify the L2TP code to use flags to avoid the PPP
link (MSG_DONTROUTE?), but I suspect that's probably a bad rather than a
good thing to do.

>>> Unfortunately, I can't work around this problem by reconfiguring the
>>> VPN server -- there's no way to tell it to use a different IP address
>>> for its end of the VPN tunnel.  Furthermore, the server works just fine
>>> with clients running Windows or OS-X.
>>
>> Really?  That seems ... improbable.
> 
> I guess that depends on how you judge probabilities.  :-)

:-/

> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            140.247.233.37     UGSc        2        4   ppp0
> 10                 ppp0               USc         1        0   ppp0

That's *quite* interesting!  The PPP link doesn't have an interface
route as you'd find on most other systems.  Instead, it has what appears
to be an effectively unnumbered link.  Note the "ppp0" there instead of
an actual output address and the happy use of "10" for the local address
+ mask.

For what it's worth, the forced IP address option I've suggested is
morally equivalent to what's being done here on OS X, so that's a fair
reason to recommend it.

I checked out the pppd on Mac OS X (Darwin 13.4.0; Mavericks), and it
looks to be a variant of the SAMBA/ANU/CMU pppd, but I'm not sure what's
different with it, and I know of no contributions from them.  And the
BSD support is long gone from the main source base ...

> I don't understand (and can't be bothered to look up) those arcane
> symbols in the netstat output.  The IP address used for the ping test
> (10.160.0.2) is a system on the VPN's private network.

The flags aren't all that interesting.  "Up" "Gateway" "Static"
"cloning" are all expected in this context.

>> As long as you don't need to contact that specific remote server using
>> the badly-assigned "internal" VPN address and can live with the fact
>> that you'll either go through the regular Internet to that address or be
>> forced to use some other address configured on that server, you should
>> be good.
>>
>> (The address I used above is 10.160.0.2.  That was one of the internal
>> DNS server addresses provided in the log you posted.  It's not necessary
>> that the address used here is exactly that, but it may well be helpful.)
> 
> That might work.  But using a nonstandard version of pppd would be
> awkward, and I would prefer to avoid it.

What's "non-standard?"

Having the ability to force a given remote IP address looks to me like a
perfectly reasonable thing to do.  We allow the remote IP address to be
set arbitrarily when the peer (for whatever reason) refuses to divulge
its address, and this is just an extension of that idea.

>> If you can't do that for some reason, then I suppose it would be
>> possible to use IP Chains (or whatever the packet-modification tool du
>> jure is used in your Linux distribution) to nail up an exception so that
>> the outside packets go to the outside interface and the inside ones go
>> to the PPP interface.  Doing that likely requires selecting on (at
>> least!) source address, so it's messy and ugly and possibly error-prone,
>> but it might be doable.
> 
> That sounds like a fairly easy thing to try.  But it would still 
> require manual intervention instead of just working.  Fixing the kernel 
> would be preferable, IMO.

I don't quite agree that it's necessarily "broken."

I do agree that it's bad to crash due to this misconfiguration.  That's
certainly a bug of some sort.  But making the kernel "naturally" accept
that the same unicast remote IP address refers to different outputs
depending on phase-of-moon in order to make this weird server happy
sounds like adding a bug rather than fixing one.

Routing based on destination is a good thing.

>> Otherwise, contact the maintainer of that VPN server.  It's just plain
>> old broken, and life's too short for broken software.
> 
> It is an old Cisco security appliance, no doubt well past End-Of-Life.  
> I'm starting to think it might be preferable to throw the thing away 
> and start up a VPN server on the department's firewall (which is a 
> Linux box) instead.

That sounds like a good (and easier to support) solution.

-- 
James Carlson         42.703N 71.076W         <carlsonj@workingcode.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Routing BUG with ppp over l2tp
  2014-10-20 20:22     ` James Carlson
@ 2014-10-21 14:15       ` Alan Stern
  0 siblings, 0 replies; 5+ messages in thread
From: Alan Stern @ 2014-10-21 14:15 UTC (permalink / raw)
  To: James Carlson; +Cc: James Chapman, linux-ppp, netdev

On Mon, 20 Oct 2014, James Carlson wrote:

> >> Otherwise, contact the maintainer of that VPN server.  It's just plain
> >> old broken, and life's too short for broken software.
> > 
> > It is an old Cisco security appliance, no doubt well past End-Of-Life.  
> > I'm starting to think it might be preferable to throw the thing away 
> > and start up a VPN server on the department's firewall (which is a 
> > Linux box) instead.
> 
> That sounds like a good (and easier to support) solution.

Okay.  I looked into iptables, but it doesn't seem to provide any way 
to prevent a packet from being routed through a particular interface. 
:-(

On the other hand, I tried writing a short /etc/ppp/ip-up.local script
that changes the destination address of the ppp interface and adds a
default route to the new, correct address.  It worked!  It's not a
perfect solution, because there's still a short window in which the
interface is up with the wrong address.  A few packets get lost and a
deadlock could occur.  But at least it's simple and non-invasive, and
it definitely proves the address conflict was indeed the cause of the
problem.

Changing pppd would be more foolproof.  But then I'd also have to 
change the programs that call it (xl2tpd and then NetworkManager), and 
doing all that doesn't seem worthwhile.

In the end, I think the best solution will be to replace the VPN 
server.

Thanks for your help,

Alan Stern


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-10-21 14:15 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-20 16:39 Routing BUG with ppp over l2tp Alan Stern
2014-10-20 17:19 ` James Carlson
2014-10-20 19:45   ` Alan Stern
2014-10-20 20:22     ` James Carlson
2014-10-21 14:15       ` Alan Stern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).