L2TPNS Manual
- Overview
- Installation
- Requirements
- Compile
- Install
- Running
- Configuration
- startup-config
- users
- ip_pool
- build-garden
- Controlling the Process
- Command-Line Interface
- nsctl
- Signals
- Throttling
- Interception
- Authentication
- Plugins
- Walled Garden
- Filtering
- Clustering
- Routing
- Avoiding Fragmentation
- Performance
Overview
l2tpns is half of a complete L2TP implementation. It supports only the
LNS side of the connection.
L2TP (Layer 2 Tunneling Protocol) is designed to allow any layer 2
protocol (e.g. Ethernet, PPP) to be tunneled over an IP connection. l2tpns
implements PPP over L2TP only.
There are a couple of other L2TP implementations, of which l2tpd is probably the
most popular. l2tpd can also act as either end of a tunnel, and
is a lot more configurable than l2tpns. However, due to the way it works,
it is nowhere near as scalable.
l2tpns uses the TUN/TAP interface provided by the Linux kernel to receive
and send packets. Using some packet manipulation it doesn't require a
single interface per connection, as l2tpd does.
This allows it to scale extremely well to very high loads and very high
numbers of connections.
It also has a plugin architecture which allows custom code to be run
during processing. An example of this is in the walled garden module
included.
Documentation is not my best skill. If you find any problems
with this document, or if you wish to contribute, please email the mailing list.
Installation
Requirements
- Linux kernel version 2.4 or above, with the Tun/Tap interface either
compiled in, or as a module.
- libcli 1.8.0 or greater.
You can get this from http://sourceforge.net/projects/libcli
Compile
You can generally get away with just running make from the source
directory. This will compile the daemon, associated tools and any modules
shipped with the distribution.
Install
After you have successfully compiled everything, run make
install to install it. By default, the binaries are installed into
/usr/sbin, the configuration into /etc/l2tpns, and the
modules into /usr/lib/l2tpns.
You will definitely need to edit the configuration files before you
start. See the Configuration section for
more information.
Running
You only need to run /usr/sbin/l2tpns as root to start it. It does
not detach to a daemon process, so you should perhaps run it from init.
By default there is no log destination set, so all log messages will go to
stdout.
Configuration
All configuration of the software is done from the files installed into
/etc/l2tpns.
startup-config
This is the main configuration file for l2tpns. The format of the file is a
list of commands that can be run through the command-line interface. This
file can also be written directly by the l2tpns process if a user runs the
write memory command, in which case any comments will be lost. If you
never write the config from the program, feel free to comment
the file with a # or ! at the beginning of the line.
A list of the possible configuration directives follows. Each of these
should be set by a line like:
set configstring "value"
set ipaddress 192.168.1.1
set boolean true
- debug (int)
Sets the level of messages that will be written to the log file. The value
should be between 0 and 5, with 0 being no debugging, and 5 being the
highest. A rough description of the levels is:
- 0: Critical Errors - Things are probably broken
- 1: Errors - Things might have gone wrong, but probably will recover
- 2: Warnings - Just in case you care what is not quite perfect
- 3: Information - Parameters of control packets
- 4: Calls - For tracing the execution of the code
- 5: Packets - Everything, including a hex dump of all packets processed... probably twice
Note that the higher you set the debugging level, the slower the program
will run. Also, at level 5 a LOT of information will be logged. This should
only ever be used for working out why it doesn't work at all.
- log_file (string)
This will be where all logging and debugging information is written
to. This may be either a filename, such as /var/log/l2tpns, or
the special magic string syslog:facility, where facility
is any one of the syslog logging facilities, such as local5.
- pid_file (string)
If set, the process id will be written to the specified file. The
value must be an absolute path.
- l2tp_secret (string)
The secret used by l2tpns for authenticating tunnel requests. Must be
the same as the LAC's, or authentication will fail. It is only actually
used if the LAC requests authentication.
- ppp_restart_time (int)
- ppp_max_configure (int)
- ppp_max_failure (int)
PPP counter and timer values, as described in §4.6 (Counters and Timers) of
RFC 1661.
- primary_dns (ip address)
- secondary_dns (ip address)
Whenever a PPP connection is established, DNS servers will be sent to the
user, both a primary and a secondary. If either is set to 0.0.0.0, then that
one will not be sent.
- primary_radius (ip address)
- secondary_radius (ip address)
Sets the RADIUS servers used for both authentication and accounting.
If the primary server does not respond, then the secondary RADIUS
server will be tried.
Note: in addition to the source IP address and
identifier, the RADIUS server must include the source
port when detecting duplicates to suppress (in order to cope with a
large number of sessions coming on-line simultaneously, l2tpns uses a
set of UDP sockets, each with a separate identifier).
- primary_radius_port (short)
- secondary_radius_port (short)
Sets the authentication ports for the primary and secondary RADIUS
servers. The accounting port is one more than the authentication
port. If no RADIUS ports are given, the authentication port defaults
to 1645, and the accounting port to 1646.
- radius_accounting (boolean)
If set to true, then RADIUS accounting packets will be sent. This
means that a Start record will be sent when the session is
successfully authenticated, and a Stop record will be sent when the
session is closed.
- radius_secret (string)
This secret will be used in all RADIUS queries. If this is not set then
RADIUS queries will fail.
- radius_authtypes (string)
A comma separated list of supported RADIUS authentication methods
(pap or chap), in order of preference (default pap).
- radius_dae_port (short)
Port for DAE RADIUS (Packet of Death/Disconnect, Change of Authorization)
requests (default: 3799).
- allow_duplicate_users (boolean)
Allow multiple logins with the same username. If false (the default),
any prior session with the same username will be dropped when a new
session is established.
- bind_address (ip address)
When the tun interface is created, it is assigned the address
specified here. If no address is given, 1.1.1.1 is used. Packets
containing user traffic should be routed via this address if given,
otherwise the primary address of the machine.
- peer_address (ip address)
Address to send to clients as the default gateway.
- send_garp (boolean)
Determines whether or not to send a gratuitous ARP for the
bind_address when the server is ready to handle traffic (default:
true).
This value is ignored if BGP is configured.
- throttle_speed (int)
Sets the default speed (in kbits/s) which sessions will be limited to.
If this is set to 0, then throttling will not be used at all. Note:
You can set this by the CLI, but changes will not affect currently
connected users.
- throttle_buckets (int)
Number of token buckets to allocate for throttling. Each throttled
session requires two buckets (in and out).
- accounting_dir (string)
If set to a directory, then every 5 minutes the current usage for
every connected user will be dumped to a file in this directory. Each
file dumped begins with a header, where each line is prefixed by #.
Following the header is a single line for every connected user, fields
separated by a space.
The fields are username, ip, qos,
uptxoctets, downrxoctets. The qos field is 1 if a standard user, and
2 if the user is throttled.
- setuid (int)
After starting up and binding the interface, change UID to this. This
doesn't work properly.
- dump_speed (boolean)
If set to true, then the current bandwidth utilization will be logged every
second. Even if this is disabled, you can see this information by running
the uptime command on the CLI.
- multi_read_count (int)
Number of packets to read off each of the UDP and TUN fds when
returned as readable by select (default: 10). Avoids incurring the
unnecessary system call overhead of select on busy servers.
- scheduler_fifo (boolean)
Sets the scheduling policy for the l2tpns process to SCHED_FIFO. This
causes the kernel to immediately preempt any currently running SCHED_OTHER
(normal) process in favour of l2tpns when it becomes runnable.
Ignored on uniprocessor systems.
- lock_pages (boolean)
Keep all pages mapped by the l2tpns process in memory.
- icmp_rate (int)
Maximum number of host unreachable ICMP packets to send per second.
- packet_limit (int)
Maximum number of packets of downstream traffic to be handled each
tenth of a second per session. If zero, no limit is applied (default:
0). Intended as a DoS prevention mechanism and not a general
throttling control (packets are dropped, not queued).
- cluster_address (ip address)
Multicast cluster address (default: 239.192.13.13). See the section
on Clustering for more information.
- cluster_interface (string)
Interface for cluster packets (default: eth0).
- cluster_hb_interval (int)
Interval in tenths of a second between cluster heartbeat/pings.
- cluster_hb_timeout (int)
Cluster heartbeat timeout in tenths of a second. A new master will be
elected when this interval has been passed without seeing a heartbeat
from the master.
- cluster_master_min_adv (int)
Determines the minimum number of up-to-date slaves required before the
master will drop routes (default: 1).
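As an aside on the accounting_dir directive above: the dump format it describes (header lines starting with #, then one space-separated line per connected user) is simple enough to read with a short script. The following is a minimal sketch in Python, not part of l2tpns; the record type, function name, and sample data are hypothetical, and it assumes usernames contain no spaces.

```python
from typing import Iterable, List, NamedTuple


class UsageRecord(NamedTuple):
    username: str
    ip: str
    qos: int            # per the manual: 1 = standard user, 2 = throttled
    uptxoctets: int
    downrxoctets: int


def parse_accounting_dump(lines: Iterable[str]) -> List[UsageRecord]:
    """Parse one accounting dump, skipping the '#'-prefixed header lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Assumption: exactly five whitespace-separated fields per user line.
        username, ip, qos, up, down = line.split()
        records.append(UsageRecord(username, ip, int(qos), int(up), int(down)))
    return records


# Hypothetical sample data; a real header will look different.
SAMPLE = """\
# hypothetical header line
joe.user 192.168.100.6 1 1024 4096
jane.user 192.168.100.7 2 10 20
"""

if __name__ == "__main__":
    for record in parse_accounting_dump(SAMPLE.splitlines()):
        print(record)
```

Keeping the fields in a named tuple makes it easy to sum octets per user across the five-minute dumps.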
BGP routing configuration is entered by the command:
- router bgp as
where as specifies the local AS number.
Subsequent lines prefixed with
- neighbour peer
define the attributes of BGP neighbours. Valid commands are:
- neighbour peer remote-as as
- neighbour peer timers keepalive hold
where peer specifies the BGP neighbour as either a hostname or
IP address, as is the remote AS number, and keepalive and
hold are the timer values in seconds.
Named access-lists are configured using one of the commands:
- ip access-list standard name
- ip access-list extended name
Subsequent lines prefixed with permit or deny
define the body of the access-list. Standard access-list syntax:
- {permit|deny}
{host|source source-wildcard|any}
[{host|destination destination-wildcard|any}]
Extended access-lists:
{permit|deny} ip
{host|source source-wildcard|any}
{host|destination destination-wildcard|any} [fragments]
{permit|deny} udp
{host|source source-wildcard|any}
[{eq|neq|gt|lt} port|range from to]
{host|destination destination-wildcard|any}
[{eq|neq|gt|lt} port|range from to]
[fragments]
{permit|deny} tcp
{host|source source-wildcard|any}
[{eq|neq|gt|lt} port|range from to]
{host|destination destination-wildcard|any}
[{eq|neq|gt|lt} port|range from to]
[{established|{match-any|match-all}
{+|-}{fin|syn|rst|psh|ack|urg}
...|fragments]
users
Usernames and passwords for the command-line interface are stored in
this file. The format is username:password where
password may either be plain text, an MD5 digest (prefixed by
$1salt$) or a DES password, distinguished from
plain text by the prefix {crypt}.
The username enable has a special meaning and is used to set
the enable password.
Note: If this file doesn't exist, then anyone who can get to
port 23 will be allowed access without a username / password.
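The three password forms described above can be told apart by prefix. The sketch below is an illustration in Python, not the check l2tpns itself performs; the function names and sample entries are hypothetical, and it only looks at the leading "$1" and "{crypt}" markers named in this section.

```python
from typing import Tuple


def classify_password(field: str) -> str:
    """Return 'md5', 'des' or 'plain' based on the prefix of the password field."""
    if field.startswith("$1"):
        return "md5"
    if field.startswith("{crypt}"):
        return "des"
    return "plain"


def parse_users_line(line: str) -> Tuple[str, str]:
    """Split a username:password line at the first colon and classify the password."""
    username, _, password = line.rstrip("\n").partition(":")
    return username, classify_password(password)


if __name__ == "__main__":
    # Hypothetical entries, for illustration only.
    for entry in ("admin:secret", "enable:{crypt}abcdefghijklm"):
        print(parse_users_line(entry))
```

Splitting at the first colon only means a plain-text password may itself contain colons.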
ip_pool
This file is used to configure the IP address pool which user
addresses are assigned from. This file should contain either an IP
address or a CIDR network per line. e.g.:
192.168.1.1
192.168.1.2
192.168.1.3
192.168.4.0/24
172.16.0.0/16
10.0.0.0/8
Keep in mind that l2tpns can only handle 65535 connections per
process, so don't put more than 65535 IP addresses in the
configuration file. They will be wasted.
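To check a pool file against the 65535 limit before deploying it, the entries can be counted with Python's standard ipaddress module. This is a rough sketch with a hypothetical function name; it counts every address in a CIDR block, including the network and broadcast addresses, which may not match exactly how l2tpns allocates from the pool.

```python
import ipaddress
from typing import Iterable

MAX_SESSIONS = 65535  # per-process limit stated in this manual


def count_pool_addresses(lines: Iterable[str]) -> int:
    """Count the addresses covered by ip_pool entries (single IPs or CIDR networks)."""
    total = 0
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        # A bare IP address is treated as a /32 network of one address.
        total += ipaddress.ip_network(entry, strict=False).num_addresses
    return total


if __name__ == "__main__":
    pool = ["192.168.1.1", "192.168.1.2", "192.168.4.0/24"]
    total = count_pool_addresses(pool)
    print(total, "addresses;", "over limit" if total > MAX_SESSIONS else "within limit")
```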
build-garden
The garden plugin on startup creates a NAT table called "garden" then
sources the build-garden script to populate that table. All
packets from gardened users will be sent through this table. Example:
iptables -t nat -A garden -p tcp -m tcp --dport 25 -j DNAT --to 192.168.1.1
iptables -t nat -A garden -p udp -m udp --dport 53 -j DNAT --to 192.168.1.1
iptables -t nat -A garden -p tcp -m tcp --dport 53 -j DNAT --to 192.168.1.1
iptables -t nat -A garden -p tcp -m tcp --dport 80 -j DNAT --to 192.168.1.1
iptables -t nat -A garden -p tcp -m tcp --dport 110 -j DNAT --to 192.168.1.1
iptables -t nat -A garden -p tcp -m tcp --dport 443 -j DNAT --to 192.168.1.1
iptables -t nat -A garden -p icmp -m icmp --icmp-type echo-request -j DNAT --to 192.168.1.1
iptables -t nat -A garden -p icmp -j ACCEPT
iptables -t nat -A garden -j DROP
Controlling the Process
A running l2tpns process can be controlled in a number of ways. The primary
method of control is by the Command-Line Interface (CLI).
You can also remotely send commands to modules via the nsctl client
provided.
l2tpns also understands a number of signals and acts on them when it
receives them.
Command-Line Interface
You can access the command line interface by telnet'ing to port 23.
There is no IP address restriction, so it's a good idea to firewall
this port off from anyone who doesn't need access to it. See
users for information on restricting access based
on a username and password.
The CLI gives you real-time control over almost everything in
the process. The interface is designed to look like a Cisco
device, and supports things like command history, line editing and
context sensitive help. This is provided by linking with the
libcli
library. Some general documentation of the interface is
here.
After you have connected to the telnet port (and perhaps logged in), you
will be presented with a hostname> prompt.
Enter help to get a list of possible commands. A brief
overview of the more important commands follows:
- show session
Without specifying a session ID, this will list all sessions currently
connected. If you specify a session ID, you will be given all
information on a single session. Note that the full session list can
be around 185 columns wide, so you should probably use a wide terminal
to see the list properly.
The columns listed in the overview are:
SID | Session ID |
TID | Tunnel ID - Use with show tunnel tid |
Username | The username given in the PPP authentication. If this is *, then LCP authentication has not completed. |
IP | The IP address given to the session. If this is 0.0.0.0, LCP negotiation has not completed. |
I | Intercept - Y or N depending on whether the session is being snooped. See snoop. |
T | Throttled - Y or N if the session is currently throttled. See throttle. |
G | Walled Garden - Y or N if the user is trapped in the walled garden. This field is present even if the garden module is not loaded. |
opened | The number of seconds since the session started |
downloaded | Number of bytes downloaded by the user |
uploaded | Number of bytes uploaded by the user |
idle | The number of seconds since traffic was detected on the session |
LAC | The IP address of the LAC the session is connected to. |
CLI | The Calling-Line-Identification field provided during the session setup. This field is generated by the LAC. |
- show users
With no arguments, display a list of currently connected users. If an
argument is given, the session details for the given username are
displayed.
- show tunnel
This will show all the open tunnels in a summary, or detail on a single
tunnel if you give a tunnel id.
The columns listed in the overview are:
TID | Tunnel ID |
Hostname | The hostname for the tunnel as provided by the LAC. This has no relation to DNS, it is just a text field. |
IP | The IP address of the LAC |
State | Tunnel state - Free, Open, Dieing, Opening |
Sessions | The number of open sessions on the tunnel |
- show pool
Displays the current IP address pool allocation. This will only display
addresses that are in use, or are reserved for re-allocation to a
disconnected user.
If an address is not currently in use, but has been used, then in the User
column the username will be shown in square brackets, followed by the time
since the address was used:
IP Address Used Session User
192.168.100.6 N [joe.user] 1548s
- show radius
Show a summary of the in-use RADIUS sessions. This list should not be very
long, as RADIUS sessions should be cleaned up as soon as they are used. The
columns listed are:
Radius | The ID of the RADIUS request. This is sent in the packet to the RADIUS server for identification. |
State | The state of the request - WAIT, CHAP, AUTH, IPCP, START, STOP, NULL. |
Session | The session ID that this RADIUS request is associated with |
Retry | If no response to the request arrives, it will be retried at this time. This is a unix timestamp. |
Try | Retry count. The RADIUS request is discarded after 3 retries. |
- show running-config
This will list the current running configuration. This is in a format that
can either be pasted into the configuration file, or run directly at the
command line.
- show counters
Internally, counters are kept of key values, such as bytes and packets
transferred, as well as function call counters. This function displays all
these counters, and is probably only useful for debugging.
You can reset these counters by running clear counters.
- show cluster
Show cluster status. Shows the cluster state for this server
(Master/Slave), information about known peers and (for slaves) the
master IP address, last packet seen and up-to-date status.
See Clustering for more information.
- write memory
This will write the current running configuration to the config file
startup-config, which will be run on a restart.
- snoop
You must specify a username, IP address and port. All packets for the
current session for that username will be forwarded to the given
host/port. Specify no snoop username to disable interception
for the session.
If you want interception to be permanent, you will have to modify the RADIUS
response for the user. See Interception.
- throttle
You must specify a username, which will be throttled for the current
session. Specify no throttle username to disable throttling
for the current session.
If you want throttling to be permanent, you will have to modify the
RADIUS response for the user. See Throttling.
- drop session
This will cleanly disconnect a session. You must specify a session id, which
you can get from show session. This will send a disconnect message
to the remote end.
- drop tunnel
This will cleanly disconnect a tunnel, as well as all sessions on that
tunnel. It will send a disconnect message for each session individually, and
after 10 seconds it will send a tunnel disconnect message.
- uptime
This will show how long the l2tpns process has been running, and the current
bandwidth utilization:
17:10:35 up 8 days, 2212 users, load average: 0.21, 0.17, 0.16
Bandwidth: UDP-ETH:6/6 ETH-UDP:13/13 TOTAL:37.6 IN:3033 OUT:2569
The bandwidth line contains 4 sets of values.
UDP-ETH is the current bandwidth going from the LAC to the ethernet
(user uploads), in mbits/sec.
ETH-UDP is the current bandwidth going from ethernet to the LAC (user
downloads).
TOTAL is the total aggregate bandwidth in mbits/s.
IN and OUT are packet rates (packets per second) going between UDP-ETH and ETH-UDP.
These counters are updated every second.
- configure terminal
Enter configuration mode. Use exit or ^Z to exit this mode.
The following commands are valid in this mode:
- load plugin
Load a plugin. You must specify the plugin name, and it will search in
/usr/lib/l2tpns for plugin.so. You can unload a loaded plugin with
remove plugin.
- set
Set a configuration variable. You must specify the variable name, and
the value. If the value contains any spaces, you should quote the
value with double (") or single (') quotes.
You can set any startup-config value in
this way, although some may require a restart to take effect.
nsctl
nsctl allows messages to be passed to plugins.
Arguments are command and optional args. See
nsctl(8) for more details.
Built-in commands are load_plugin, unload_plugin and
help. Any other commands are passed to plugins for processing.
Signals
While the process is running, you can send it a few different signals, using
the kill command.
killall -HUP l2tpns
The signals understood are:
- SIGHUP
Reload the config from disk and re-open log file.
- SIGTERM, SIGINT
Stop process. Tunnels and sessions are not
terminated. This signal should be used to stop l2tpns on a
cluster node where there are other machines to
continue handling traffic.
- SIGQUIT
Shut down tunnels and sessions, exit process when
complete.
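If pid_file is set in startup-config, a script can signal the exact process rather than relying on killall. The sketch below is one way to do that in Python; the helper names and the action labels are hypothetical, and it assumes the pid file holds the process id as decimal text. The demo at the bottom signals the script's own process with signal 0 (an existence check), so it is safe to run anywhere.

```python
import os
import signal
import tempfile

# Signals documented above, keyed by hypothetical action names.
SIGNALS = {
    "reload": signal.SIGHUP,     # reload config, re-open log file
    "stop": signal.SIGTERM,      # stop without terminating tunnels/sessions
    "shutdown": signal.SIGQUIT,  # shut down tunnels and sessions, then exit
}


def read_pid(pid_file: str) -> int:
    """Read a process id from a pid file (assumed to contain decimal text)."""
    with open(pid_file) as fh:
        return int(fh.read().strip())


def send_signal(pid_file: str, sig: int) -> int:
    """Send sig to the process named in pid_file and return its pid."""
    pid = read_pid(pid_file)
    os.kill(pid, sig)
    return pid


if __name__ == "__main__":
    # Demo against our own pid with signal 0, which only checks existence.
    with tempfile.NamedTemporaryFile("w", suffix=".pid", delete=False) as fh:
        fh.write(f"{os.getpid()}\n")
    try:
        print("signalled pid", send_signal(fh.name, 0))
    finally:
        os.unlink(fh.name)
```

On a real server the call would look like send_signal(path, SIGNALS["reload"]), where path is whatever pid_file is set to.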
Throttling
l2tpns contains support for slowing down user sessions to whatever speed you
desire. You must first enable the global setting throttle_speed
before this will be activated.
If you wish a session to be throttled permanently, you should set the
Vendor-Specific RADIUS value Cisco-Avpair="throttle=yes", which
will be handled by the autothrottle module.
Otherwise, you can enable and disable throttling an active session using
the throttle CLI command.
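The throttle_buckets directive mentions token buckets, two per throttled session (in and out). For readers unfamiliar with the technique, here is a generic token bucket sketch in Python. It is not the l2tpns implementation; the class name, the burst parameter, and the choice of 1 kbit = 1000 bits are assumptions made for illustration.

```python
class TokenBucket:
    """Generic token bucket: tokens are bytes, refilled at a fixed rate."""

    def __init__(self, rate_kbits: int, burst_bytes: int, now: float = 0.0):
        self.rate_bytes = rate_kbits * 1000 / 8  # bytes per second (assumes 1 kbit = 1000 bits)
        self.capacity = float(burst_bytes)
        self.tokens = float(burst_bytes)         # start with a full bucket
        self.last = now

    def allow(self, packet_len: int, now: float) -> bool:
        """Refill for the elapsed time, then spend tokens if the packet fits."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_bytes)
        self.last = now
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return True
        return False


if __name__ == "__main__":
    # One bucket per direction, mirroring the "in and out" pair in the manual.
    bucket_in = TokenBucket(rate_kbits=64, burst_bytes=1500)
    bucket_out = TokenBucket(rate_kbits=64, burst_bytes=1500)
    print(bucket_in.allow(1500, now=0.0), bucket_in.allow(1500, now=0.0))
    print(bucket_out.allow(500, now=0.0))
```

Passing the clock in as an argument keeps the sketch deterministic; a real forwarder would read a monotonic clock instead.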
Interception
You may have to deal with legal requirements to be able to intercept a
user's traffic at any time. l2tpns allows you to begin and end interception
on the fly, as well as at authentication time.
When a user is being intercepted, a copy of every packet they send and
receive will be sent wrapped in a UDP packet to the IP address and port set
in the snoop_host and snoop_port configuration
variables.
The UDP packet contains just the raw IP frame, with no extra headers.
To enable interception on a connected user, use the snoop username
and no snoop username CLI commands. These will enable interception
immediately.
If you wish the user to be intercepted whenever they reconnect, you will
need to modify the RADIUS response to include the Vendor-Specific value
Cisco-Avpair="intercept=yes". For this feature to be enabled,
you need to have the autosnoop module loaded.
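Since each intercepted packet arrives as a raw IP frame inside a UDP datagram, the receiving host only needs a UDP socket and an IP header decoder. The following Python sketch shows the idea; the function names are hypothetical, it assumes the intercepted traffic is IPv4, and the listener has to be bound to whatever address and port the snoop target is set to.

```python
import socket
import struct


def describe_ip_packet(payload: bytes) -> dict:
    """Decode the fixed 20-byte IPv4 header at the start of a snooped packet."""
    if len(payload) < 20:
        raise ValueError("payload too short for an IPv4 header")
    (version_ihl, _tos, total_len, _ident, _frag,
     _ttl, protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", payload[:20])
    return {
        "version": version_ihl >> 4,
        "header_len": (version_ihl & 0x0F) * 4,  # IHL is counted in 32-bit words
        "total_len": total_len,
        "protocol": protocol,                     # e.g. 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


def listen(bind_addr: str, port: int) -> None:
    """Print a summary of every intercepted packet received (runs until interrupted)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((bind_addr, port))
        while True:
            payload, sender = sock.recvfrom(65535)
            print(sender, describe_ip_packet(payload))
```

Calling listen("0.0.0.0", 9000) would start the receiver on port 9000, a placeholder port chosen only for this example.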
Authentication
Whenever a session connects, it is not fully set up until authentication is
completed. The remote end must send a PPP CHAP or PPP PAP authentication
request to l2tpns.
This request is sent to the RADIUS server, which will hopefully respond with
Auth-Accept or Auth-Reject.
If Auth-Accept is received, the session is set up and an IP address is
assigned. The RADIUS server can include a Framed-IP-Address field in the
reply, and that address will be assigned to the client. It can also include
specific DNS servers, and a Framed-Route if that is required.
If Auth-Reject is received, then the client is sent a PPP AUTHNAK packet,
at which point they should disconnect. The exception to this is when the
walled garden module is loaded, in which case the user still receives the
PPP AUTHACK, but their session is flagged as being a garden'd user, and they
should not receive any service.
The RADIUS reply can also contain a Vendor-Specific attribute called
Cisco-Avpair. This field is a freeform text field that most Cisco
devices understand to contain configuration instructions for the session. In
the case of l2tpns it is expected to be of the form
key=value,key2=value2,key3=value3,keyn=value
Each key-value pair is separated and passed to any modules loaded. The
autosnoop and autothrottle modules understand the keys
intercept and throttle respectively. For example, to have
a user who is to be throttled and intercepted, the Cisco-Avpair value should
contain:
intercept=yes,throttle=yes
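The key=value,key2=value2 form is easy to generate and check from RADIUS provisioning scripts. Below is a small Python sketch of the split described above; the function name is hypothetical, and the handling of stray whitespace and malformed items is an assumption, since the manual does not say how l2tpns treats them.

```python
from typing import List, Tuple


def parse_avpair(value: str) -> List[Tuple[str, str]]:
    """Split a Cisco-Avpair string into (key, value) pairs, in order."""
    pairs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, val = item.partition("=")
        if not sep:
            continue  # assumption: items without '=' are ignored
        pairs.append((key.strip(), val.strip()))
    return pairs


if __name__ == "__main__":
    print(parse_avpair("intercept=yes,throttle=yes"))
```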
Plugins
So as to make l2tpns as flexible as possible (I know the core code is pretty
difficult to understand), it includes a plugin API, which you can use to
hook into certain events.
There are a few example modules included - autosnoop, autothrottle and
garden.
When an event happens that has a hook, l2tpns looks for a predefined
function name in every loaded module, and runs them in the order the modules
were loaded.
The function should return PLUGIN_RET_OK if it is all OK. If it returns
PLUGIN_RET_STOP, then it is assumed to have worked, but that no further
modules should be run for this event.
A return of PLUGIN_RET_ERROR means that this module failed, and
no further processing should be done for this event. Use this with care.
Every event function called takes a specific structure named
param_event (where event is the name of the event), which varies in
content with each event. The function name for each event follows the
same pattern, plugin_event,
so for the event timer, the function declaration should look like:
int plugin_timer(struct param_timer *data);
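The dispatch rule described above (hooks run in load order; STOP and ERROR both end processing for the event) can be modelled in a few lines. This is a Python model of the behaviour for illustration only, not the C API: modules are represented as dictionaries of callables, and the string values given to the PLUGIN_RET_* names are placeholders, since the real constants are defined in the l2tpns headers.

```python
from typing import Callable, Dict, List

# Placeholder values; the real constants live in the l2tpns C headers.
PLUGIN_RET_OK = "ok"
PLUGIN_RET_STOP = "stop"
PLUGIN_RET_ERROR = "error"

Module = Dict[str, Callable[[dict], str]]


def run_hooks(modules: List[Module], event: str, data: dict) -> bool:
    """Call plugin_<event> in each module, in load order.

    Returns False if a module reported an error, True otherwise.
    """
    for module in modules:
        hook = module.get("plugin_" + event)
        if hook is None:
            continue  # this module does not hook the event
        result = hook(data)
        if result == PLUGIN_RET_STOP:
            return True   # handled; skip the remaining modules
        if result == PLUGIN_RET_ERROR:
            return False  # failed; no further processing for this event
    return True


if __name__ == "__main__":
    logger = {"plugin_timer": lambda data: print("tick", data) or PLUGIN_RET_OK}
    print(run_hooks([logger], "timer", {"time_now": 0}))
```

The model makes the ordering consequence visible: a module loaded earlier can prevent later modules from ever seeing an event.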
A list of the available events follows, with a list of all the fields in the
supplied structure:
Event | Description | Parameters |
pre_auth | This is called after a RADIUS response has been received, but before it has been processed by the code. This will allow you to modify the response in some way. | t: Tunnel; s: Session; username; password; protocol: 0xC023 for PAP, 0xC223 for CHAP; continue_auth: Set to 0 to stop processing authentication modules |
post_auth | This is called after a RADIUS response has been received, and the basic checks have been performed. This is what the garden module uses to force authentication to be accepted. | t: Tunnel; s: Session; username; auth_allowed: This is already set to true or false depending on whether authentication has been allowed so far. You can set this to 1 or 0 to force allow or disallow authentication; protocol: 0xC023 for PAP, 0xC223 for CHAP |
packet_rx | This is called whenever a session receives a packet. Use this sparingly, as this will seriously slow down the system. | t: Tunnel; s: Session; buf: The raw packet data; len: The length of buf |
packet_tx | This is called whenever a session sends a packet. Use this sparingly, as this will seriously slow down the system. | t: Tunnel; s: Session; buf: The raw packet data; len: The length of buf |
timer | This is run every second, no matter what is happening. This is called from a signal handler, so make sure anything you do is reentrant. | time_now: The current unix timestamp |
new_session | This is called after a session is fully set up. The session is now ready to handle traffic. | t: Tunnel; s: Session |
kill_session | This is called when a session is about to be shut down. This may be called multiple times for the same session. | t: Tunnel; s: Session |
radius_response | This is called whenever a RADIUS response includes a Cisco-Avpair value. The value is split up into key=value pairs, and each is processed through all modules. | t: Tunnel; s: Session; key; value |
radius_reset | This is called whenever a RADIUS CoA request is received to reset any options to default values before the new values are applied. | t: Tunnel; s: Session |
control | This is called whenever an nsctl packet is received. This should handle the packet and form a response if required. | iam_master: Cluster master status; argc: The number of arguments; argv: Arguments; response: Return value: NSCTL_RES_OK or NSCTL_RES_ERR; additional: Extended response text |
Walled Garden
Walled Garden is implemented so that you can provide perhaps limited service
to sessions that incorrectly authenticate.
Whenever a session provides incorrect authentication, and the
RADIUS server responds with Auth-Reject, the walled garden module
(if loaded) will force authentication to succeed, but set the flag
garden in the session structure and add an iptables rule to
the garden_users chain to force all packets for the session's IP
address to traverse the garden chain.
This doesn't just work. To set this all up, you will need to
set up the garden nat table with the
build-garden script, with rules to limit
users' traffic. For example, to force all traffic except DNS to be
forwarded to 192.168.1.1, add these entries to your
build-garden:
iptables -t nat -A garden -p tcp --dport ! 53 -j DNAT --to 192.168.1.1
iptables -t nat -A garden -p udp --dport ! 53 -j DNAT --to 192.168.1.1
l2tpns will add entries to the garden_users chain as appropriate.
You can check the amount of traffic being captured using the following
command:
iptables -t nat -L garden -nvx
Filtering
Sessions may be filtered by specifying Filter-Id attributes in
the RADIUS reply. filter.in specifies that the named
access-list filter should be applied to traffic from the
customer, filter.out specifies a list for traffic to the
customer.
Clustering
An l2tpns cluster consists of one* or more servers configured with
the same configuration, notably the multicast cluster_address.
*A stand-alone server is simply a degraded cluster.
Initially servers come up as cluster slaves, and periodically (every
cluster_hb_interval/10 seconds) send out ping packets
containing the start time of the process to the multicast
cluster_address.
A cluster master sends heartbeat rather than ping packets, which
contain those session and tunnel changes since the last heartbeat.
When a slave has not seen a heartbeat within
cluster_hb_timeout/10 seconds it "elects" a new master by
examining the list of peers it has seen pings from and determines
which of these and itself is the "best" candidate to be master.
"Best" in this context means the server with the highest uptime (the
highest IP address is used as a tie-breaker in the case of equal
uptimes).
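The election rule above can be written as a single comparison. The sketch below is a Python illustration, not the l2tpns code: the function name and the (ip, start_time) representation of candidates are hypothetical. Because pings carry the process start time, "highest uptime" is taken here to mean "earliest start time", which assumes the start times are comparable across servers.

```python
import ipaddress
from typing import Iterable, Tuple


def elect_master(candidates: Iterable[Tuple[str, int]]) -> str:
    """Pick the best master from (ip, start_time) pairs, including this server.

    Earliest start time (highest uptime) wins; the highest IP address
    breaks ties.
    """
    def rank(candidate: Tuple[str, int]):
        ip, start_time = candidate
        # Negate the IP so that min() prefers the highest address on a tie.
        return (start_time, -int(ipaddress.ip_address(ip)))

    return min(candidates, key=rank)[0]


if __name__ == "__main__":
    peers = [("10.0.0.1", 100), ("10.0.0.2", 100), ("10.0.0.3", 200)]
    print(elect_master(peers))
```

Every slave evaluating the same list reaches the same answer, which is what lets the election proceed without any further negotiation.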
After discovering a master, and determining that it is up-to-date (has
seen an update for all in-use sessions and tunnels from heartbeat
packets), a slave will raise a route (see Routing) for
the bind_address and for all addresses/networks in
ip_pool. Any packets received by the slave which would alter
the session state, as well as packets for throttled or gardened
sessions, are forwarded to the master for handling. In addition, byte
counters for session traffic are periodically forwarded.
A master, when determining that it has at least one up-to-date slave
will drop all routes (raising them again if all slaves disappear) and
subsequently handle only packets forwarded to it by the slaves.
Routing
If you are running a single instance, you may simply statically route
the IP pools to the bind_address (l2tpns will send a gratuitous
ARP).
For a cluster, configure the members as BGP neighbours on your router
and configure multi-path load-balancing. Cisco uses "maximum-paths
ibgp" for IBGP. If this is not supported by your IOS revision, you
can use "maximum-paths" (which works for EBGP) and set
as_number to a private value such as 64512.
Avoiding Fragmentation
Fragmentation of encapsulated return packets to the LAC may be avoided
for TCP sessions by adding a firewall rule to clamp the MSS on
outgoing SYN packets.
The following is appropriate for interfaces with a typical MTU of
1500:
iptables -A FORWARD -i tun+ -o eth0 \
-p tcp --tcp-flags SYN,RST SYN \
-m tcpmss --mss 1413:1600 \
-j TCPMSS --set-mss 1412
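The numbers in the rule above can be checked with a little arithmetic. The Python sketch below assumes 20-byte IPv4 and TCP headers with no options; the function names are hypothetical. It shows how much room per packet the 1412-byte clamp leaves for encapsulation on a 1500-byte MTU. The manual does not break that budget down by protocol layer, so the sketch does not either.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def unclamped_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one unencapsulated packet of this MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def encapsulation_budget(mtu: int, clamped_mss: int) -> int:
    """Bytes per packet left free for tunnel headers by clamping the MSS."""
    return unclamped_mss(mtu) - clamped_mss


def needs_clamp(advertised_mss: int, clamped_mss: int = 1412) -> bool:
    """Mirror the rule's match: only SYNs advertising a larger MSS are rewritten."""
    return advertised_mss > clamped_mss


if __name__ == "__main__":
    print("unclamped MSS:", unclamped_mss(1500))
    print("encapsulation budget:", encapsulation_budget(1500, 1412))
    print("rewrite MSS 1460?", needs_clamp(1460))
```

If your path MTU differs from 1500, the same subtraction gives a starting point for choosing a different --set-mss value.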
Performance
Performance is great.
I'd like to include some pretty graphs here that show a linear performance
increase, with no impact by number of connected sessions.
That's really what it looks like.
David Parrish
l2tpns-users@lists.sourceforge.net