Edge Router & BNG Optimisation Guide for ISPs

Last updated on 5 October 2021

This guide will be based on MikroTik RouterOS syntax, but it should not be too hard to replicate the same config on other platforms. I will update the contents as required outside the scope of the initial deployment/testing.

APNIC Version (will be less frequently updated than this)

How often this content is updated depends on how many AS operators are willing to cooperate with me and experiment with the config, such as IPv6/BGP optimisation. At the moment this is strictly IPv4. I do have the IPv6 config ready on paper, but I need to test it on a live network. If you’re an AS operator and you’re interested in working something out, you can always reach out to me.

Configuration was tested and deployed on AS135756 with my good friend Mr Varun Singhania (proprietor of the ASN).

Further configuration was extensively tested on AS132559 as I was a downstream customer and was able to test the impacts/config changes directly as an end-user as of 22nd July 2021.

Generally speaking, a lot of ISPs in Asia use MikroTik RouterOS to provide access to their customers via PPPoE (please get on board with DHCP!) and some ISPs use MikroTik for their edge/core routers as well.

We will walk through some of the issues I have come across and the solutions for them.

Keep in mind MikroTik uses RouterOS, which is based on the Linux kernel. However, RouterOS v6 stable/LTS runs on an ancient version of the Linux kernel (hopefully they implement newer frameworks in RouterOS v7) and hence uses the legacy iptables for packet filtering. So, if you want to thoroughly understand the logic flow behind these suggestions/rules, I’d suggest going through the Linux kernel documentation on the web.

I will assume the reader has some basic knowledge of the terminologies and technologies/protocols used in typical BNG/CGNAT configuration and hence I will not elaborate on the definition and on what a protocol or technology does or does not do. This guide is meant for engineers/ISPs and not home users.

From here onwards Core/Edge are synonymous, BNG/Access Router are synonymous for the sake of simplicity.

Let’s get started

General Configuration Changes

I was surprised to find a few networks (ideally there should be none) not implementing basic security measures on their routers: still using Telnet in this day and age, exposing Neighbour Discovery, running outdated RouterOS and outdated firmware, and so on.

  • Upgrade RouterOS to latest long-term release
    • Upgrade the firmware as well after the above
  • MikroTik already has a guide for basic security measures here, I strongly recommend the reader makes use of the measures
  • Make use of Reverse Path Filtering (Slightly different from Unicast Reverse Path Forwarding)
    • It is found in IP>Settings>rp-filter
      • Always use loose mode on edge router (ASN/BGP Sessions)
        • The same applies wherever asymmetric or policy routing takes place
      • Use strict mode on BNG and/or wherever symmetric routing takes place
  • Remember to make use of interface lists inside Interfaces>Interface List on all routers
    • WAN for all public interface/internet bound
    • LAN for all local interfaces/customer-facing side
      • Remember to include dynamic interfaces for LAN on BNG to account for all PPPoE users
      • Also include VLANs, Bonding interfaces etc
Figure-1 (LAN Include Dynamic)
  • Connection Tracking
    • Disable it on the Edge Router
      • /ip firewall connection tracking set enabled=no
    • Enable loose TCP tracking on all routers including BNG
      • /ip firewall connection tracking set loose-tcp-tracking=yes
    • Use the following connection tracking timeout values on all routers
      • The reason behind this: We saw real-time improvements to stability and performance especially for UDP traffic such as VoIP, VoWiFi, gaming, P2P UDP NAT punching etc
      • Upgrade the RAM if you can’t accommodate these values!
Figure-2 (Recommended Connection Tracking Timeout Values)
  • My personal favourites
    • Give the router an accurate system clock
      • /system ntp client set enabled=yes server-dns-names=time.cloudflare.com

MTU

I had to create this section as I have seen way too many ASNs with horrible MTU config.

First, you need to fix MTU across the entire path of your network devices before deploying RFC4638/TCP MSS Clamping algorithm, and of course to allow PMTUD to work correctly. Otherwise, it would either fail or break traffic.

Issues

  • Wrong MTU configuration across the entire path of network devices
    • Switches, routers, hypervisors etc
  • Breaks PMTUD and creates a bunch of problems right at Layer 2/3 on the local network
    • Internet bound traffic would likely fail

Solutions

  • Layer 2 MTU
    • Set it to maximum possible on all ethernet interfaces
      • Such as routers, switches, hypervisors, virtualised instances etc
        • Example: Edge (L2 MTU 10k)>BNG (L2 MTU 10k)>Switch (L2 MTU 10k)>Wireless AP (L2 MTU 2k)>Customer
        • Example: Edge (L2 MTU 10k)>BNG (L2 MTU 10k)>Switch (L2 MTU 10k)>OLT (L2 MTU 10k)>Customer
    • It can vary from vendor/model, but that’s okay
    • By using maximum possible L2 MTU everywhere, we are pumping jumbo frames and eliminating any chances of fragmentation or performance issues down the road for encapsulation
  • Layer 3 MTU
    • Set it to 1600 on all LAN (local) interfaces
      • This applies to L3 VLAN MTU, Bonding interfaces etc
        • If using Stacked VLANs, both S and C VLAN should have equal L3 MTU of 1600
        • Or any other encapsulation protocol that allows the use of L3 MTU on top of the Layer 2 interface such as VPLS
      • This leaves headroom for encapsulation protocols’ overhead and allows RFC4638
    • Set it to 1500 on all WAN (public) interfaces
      • Obviously, we cannot pump jumbo frames to the internet
Figure-3 [Ethernet MTU (Jumbo Frames on L2, L3 = 1500 for WAN and 1600 for LAN)]
Figure-4 [L3 MTU for Bonding interfaces (L3 = 1500 for WAN and 1600 for LAN)]
Figure-5 (VLAN L3 MTU = 1600)
Figure-6 [QinQ (Stacked VLANs) L3 MTU = 1600 on both]

However, keep in mind, some vendors just have “MTU” per interface and this is Layer 3/IP MTU, where the actual Layer 2 MTU would most probably be auto-adjusted by the firmware to accommodate whatever value you set on the L3 MTU. Regardless, whatever is shown above on RouterOS is applicable to any other vendors as well.

For BNG

QoS/Queuing Mechanism

There have been decades-long debates over which algorithm and which method give the best possible QoS mechanism.

In my testing, I observed the following:

  • Capping on a per-customer basis using single simple queue worked best
  • As for the algorithm of choice
    • I pick SFQ due to the observed low jitter/bufferbloat phenomenon on the customer side
  • Bufferbloat tested using this tool
    • Keep in mind, high bufferbloat = bad, low bufferbloat = good

I have not included a screenshot for every algorithm as that’s unnecessary. The test scenario was simple: SFQ was compared against the rest of the algorithms, and SFQ gave the best bufferbloat score in my testing.

Figure-7 (Simple Queue + PFIFO resulted in high bufferbloat)
Figure-8 (Simple Queue + SFQ resulted in low bufferbloat)

PPPoE

Issues

  • Packet fragmentation due to an MTU/MRU below the standard 1500
    • Typically, ISPs use 1492 or 1480 or some other strange MTU size
    • Both BNG device and customer router need to make use of hacks like TCP MSS Clamping to work around this
    • PMTUD is simply unreliable as per RFC 8900
      • Gets worse with CGNAT because remote end-points cannot determine MTU of your PPPoE customer behind it
  • Lack of proper routing for PPPoE Clients (Interfaces or Inter-VLANs)
    • Most assume that using a single profile for different PPPoE Servers running on different interfaces will work fine

Solutions

  • Deploy RFC 4638
    • Keep in mind that in a network, MTU affects the whole path of L2/L3 devices whether physical or virtual. As long as you followed the MTU section above, you should be good
    • Simply set MTU and MRU to 1500 inside PPPoE Server on the BNG
Figure-9 (PPPoE Server MTU/MRU & TCP MSS Clamping config)
  • Disable (and delete!) TCP MSS Clamping rules inside IP>Firewall>Mangle
    • Why set some arbitrary value when you can let the engine determine automatically to ensure optimal performance?
      • MikroTik has long since allowed automatic TCP MSS Clamping
      • Make use of PPP>Profile>Default* to enable TCP MSS Clamping directly on the PPPoE engine. This will do the work for any customer whose MTU/MRU is less than 1500.
    • On the customer side, not all routers can take advantage of RFC4638, such as TP-Link, Tenda etc. For them, MTU will remain capped at 1492.
      • The 1492 limitation on their end won’t cause fragmentation issues: packets are fragmented at the source (their routers) before they exit the interface and hit the BNG, and TCP MSS Clamping on the PPPoE engine takes care of anything coming in from the outside world toward the customer
      • I have observed 1500 MRU when pinging from the outside world, which suggests some of these consumer routers support 1500 MRU
      • If they are using MikroTik, pfSense, VyOS etc, they can take advantage of RFC4638 aka 1500 MTU/MRU for their PPPoE Client
      • Some ONT/ONU devices behave strangely during MTU negotiation; only a few brands like GX, TP-Link, Huawei have been found to be flawless
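
The arithmetic behind the 1492 and 1500 figures can be sketched in a few lines of Python. This is only an illustration, assuming the standard 8-byte PPPoE/PPP overhead and minimum 20-byte IPv4 and TCP headers without options; the function names are made up for this example and are not part of RouterOS.

```python
PPPOE_OVERHEAD = 8   # 6-byte PPPoE header + 2-byte PPP protocol ID
IPV4_HEADER = 20     # minimum IPv4 header, no options (assumption)
TCP_HEADER = 20      # minimum TCP header, no options (assumption)


def pppoe_mtu(ethernet_payload: int) -> int:
    """Largest IP packet a PPPoE session can carry inside this Ethernet payload."""
    return ethernet_payload - PPPOE_OVERHEAD


def tcp_mss(ip_mtu: int) -> int:
    """TCP MSS that fits in one unfragmented IPv4 packet of size ip_mtu."""
    return ip_mtu - IPV4_HEADER - TCP_HEADER


def ethernet_payload_needed(ppp_mtu: int) -> int:
    """Ethernet payload size the path must carry for a given PPPoE MTU."""
    return ppp_mtu + PPPOE_OVERHEAD


print(pppoe_mtu(1500))                # classic PPPoE over a 1500-byte Ethernet payload
print(tcp_mss(1492))                  # MSS a clamped customer ends up with
print(tcp_mss(1500))                  # MSS once RFC 4638 gives a full 1500 MTU
print(ethernet_payload_needed(1500))  # what every hop below the BNG must carry
```

A 1500-byte PPPoE MTU needs 1508 bytes of Ethernet payload on every hop between the BNG and the CPE, which fits within the 1600 L3 MTU recommended for LAN interfaces in the MTU section.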

Extra Note on PPPoE

  • Create a single CGNAT pool per BNG; you can use it for any number of PPPoE servers on any number of interfaces
    /ip pool
    add name=CGNAT_Pool comment="100.64.0.0-9 is reserved for each PPPoE Server Gateway/Profile" ranges=100.64.0.10-100.127.255.255
    • Here we are reserving 100.64.0.0-9 for gateway IPs on per interface/PPPoE server basis
  • Local Address in PPP Profile = Gateway IP address
    • One common mistake is using the router’s public IP from the WAN interface as the local address, which I’ve seen lead to issues like traceroute failures or some strange packet loss. You should be using an address that does not exist in IP>Address
    • Each PPPoE Server needs a unique profile/gateway in order to allow inter-VLAN communication between CPEs (which is needed to allow two customers behind a NATted IP to play a P2P Xbox game with each other on different VLANs) and will also ensure a clean network approach
      • If you have 100 PPPoE Servers, there should be 100 unique PPP Profiles with unique local address for each
    • Something like this for two servers:
      /ppp profile
      add change-tcp-mss=yes local-address=100.64.0.1 name=profile1 remote-address=CGNAT_Pool use-upnp=no
      add change-tcp-mss=yes local-address=100.64.0.2 name=profile2 remote-address=CGNAT_Pool use-upnp=no

      /interface pppoe-server server
      add authentication=pap default-profile=profile1 interface=vlan20 keepalive-timeout=disabled max-mru=1500 max-mtu=1500 one-session-per-host=yes service-name=server1

      add authentication=pap default-profile=profile2 disabled=no interface=vlan21 keepalive-timeout=disabled max-mru=1500 max-mtu=1500 one-session-per-host=yes service-name=server2
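
As a sanity check on the addressing plan above, here is a small Python sketch. It assumes the pool range and the two gateway addresses shown in this section; the helper names are hypothetical. It verifies that every profile's local address is unique, sits inside 100.64.0.0/10, and stays out of the customer pool.

```python
import ipaddress

CGNAT = ipaddress.ip_network("100.64.0.0/10")
POOL_FIRST = ipaddress.ip_address("100.64.0.10")
POOL_LAST = ipaddress.ip_address("100.127.255.255")


def pool_size() -> int:
    """Number of customer addresses in the CGNAT_Pool range."""
    return int(POOL_LAST) - int(POOL_FIRST) + 1


def check_gateways(profiles: dict) -> list:
    """Return a list of problems found; an empty list means the plan is consistent."""
    problems = []
    seen = {}
    for name, addr in profiles.items():
        ip = ipaddress.ip_address(addr)
        if ip not in CGNAT:
            problems.append(f"{name}: {ip} is outside {CGNAT}")
        if POOL_FIRST <= ip <= POOL_LAST:
            problems.append(f"{name}: {ip} overlaps the customer pool")
        if ip in seen:
            problems.append(f"{name}: {ip} is already used by {seen[ip]}")
        seen.setdefault(ip, name)
    return problems


print(pool_size())
print(check_gateways({"profile1": "100.64.0.1", "profile2": "100.64.0.2"}))
```

Reserving 100.64.0.0-9 leaves room for at most ten gateway addresses, so a BNG with more PPPoE servers would need to start the pool range further up.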

CGNAT

Issues

  • The majority of ISPs are using RFC1918 subnets for CGNAT and can clash with subnets on the customer site
  • Breaks P2P traffic
  • Kills the end-to-end principle
  • Requires proper NAT traversal for various protocols including IPsec
  • Routing Loops will occur for any traffic coming from the outside destined towards the public IP pools

Solutions

  • Make use of the 100.64.0.0/10 subnet as it’s meant for CGNAT usage to prevent clashing on the customer site
  • Enable the NAT traversal Helpers on the Router like the following inside IP>Firewall>Service Ports
Figure-10 (NAT Traversal Helpers on RouterOS)
  • Use a simple netmap rule with IPsec passthrough (will allow customers to initiate IPsec out-bound without issues) configured.
  • Use a single NAT rule for all CGNAT customers on a per BNG basis to reduce CPU usage.
    • /ip firewall nat
      add action=netmap chain=srcnat comment="CGNAT rule" dst-address-list=!cgnat_subnets ipsec-policy=out,none out-interface-list=WAN src-address-list=cgnat_subnets to-addresses=public/25
      • Here cgnat_subnets=address list containing CGNAT subnets aka 100.64.0.0/10
      • dst-address-list=!cgnat_subnets is self-explanatory, anything destined towards CGNAT subnets shouldn’t be NATted
        • Customers should be able to talk to each other using their CGNAT IP. Xbox makes use of this and it is mentioned in RFC 7021. This is equivalent to the old school days of everyone having a public IP and hence being reachable
    • Enable port forwarding for entire ranges (netmap algorithm + state tracking will handle what gets mapped where)
      • /ip firewall nat
        add action=netmap chain=dstnat comment="Port Forwarding Solution for CGNAT (TCP)" dst-address=public/25 dst-port=1024-65535 protocol=tcp to-addresses=100.64.0.0/10


        add action=netmap chain=dstnat comment="Port Forwarding Solution for CGNAT (UDP)" dst-address=public/25 dst-port=1024-65535 protocol=udp to-addresses=100.64.0.0/10
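
To see why netmap gives each customer a stable public address, here is a minimal Python sketch of the source-NAT direction. It assumes RouterOS netmap behaves like the Linux NETMAP target (keep the host bits of the original address, replace the network bits with the target prefix); that equivalence is an assumption, not something confirmed in this guide. `203.0.113.0/25` is a documentation prefix standing in for `public/25`, and the `netmap` function is purely illustrative.

```python
import ipaddress


def netmap(address: str, target_prefix: str) -> str:
    """Map address into target_prefix by keeping only its low (host) bits."""
    target = ipaddress.ip_network(target_prefix)
    host_bits = int(ipaddress.ip_address(address)) & int(target.hostmask)
    return str(ipaddress.ip_address(int(target.network_address) | host_bits))


# Documentation prefix used as a placeholder for the real public /25.
PUBLIC = "203.0.113.0/25"

for customer in ("100.64.0.10", "100.64.0.138", "100.64.1.77"):
    print(customer, "->", netmap(customer, PUBLIC))
```

Under this assumption a given CGNAT address always lands on the same public address, while customers whose addresses share the same low 7 bits share one public IP and are told apart by connection tracking.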

Below is what MikroTik support had to say about my port forwarding rules

Figure-11 (MikroTik support suggests my port forwarding rules are correct)
  • Avoid Deterministic NAT, the above configuration allows P2P traffic initiated from the inside to be reachable from the outside with various applications that make use of ephemeral ports/UDP NAT punching/STUN etc
  • We were able to successfully seed the official Ubuntu torrent from behind the CGNAT with the above configuration, which can mean only one thing: in-bound P2P connections get established and work!
Figure-12 (BitTorrent Seeding Behind CGNAT)
  • We tried src-nat as the action on the srcnat chain, but it resulted in the NATted public IP constantly changing on the customer side and breaking things

Below is what MikroTik support had to say about netmap vs src nat as action for src nat chain

Figure-13 (Src nat = breaks P2P traffic | Netmap = static mapping per client IP)
  • Now we fix routing loops
    • We will use DST NAT to account for remaining traffic such as ICMP and NAT it to a loopback interface
      • Remember to add the bridge to LAN interface list & add the /31 to lan_subnets address list as well
      • /interface bridge
        add arp=disabled comment="For Static Loop Protection" mtu=1500 name=loopback_1 protocol-mode=none


        /ip address
        add address=192.168.0.1/31 comment="For Static Loop Protection" interface=loopback_1 network=192.168.0.0


        /ip firewall nat
        add action=dst-nat chain=dstnat comment="Static Loop Protection" dst-address=public/25 to-addresses=192.168.0.1

End Result

Figure-14 (Your NAT Table should look as dead simple as this one)

IPv6

Issues

  • Addressing may not be optimally subnetted/broken down
  • ISP may only have something like a single /48 with 5000 customers downstream which exceeds possible /56s out of the /48
  • Not following the proper guidelines for IPv6 deployment
  • Lack of persistent assignment feature on MikroTik
    • This applies to the majority of ISPs even though they may use Cisco, Juniper etc which supports persistent assignment configuration
  • Not properly ensuring that customer’s WAN side gets a proper single /64
  • Forcing the customer to have only a single /64 on LAN side instead of /56
  • MikroTik IPv6 RADIUS does not work correctly
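
The /48 shortfall mentioned above is simple arithmetic, sketched here in Python. The function names are illustrative, and the sketch only counts LAN-side /56 delegations; the WAN-side /64s need space on top of this.

```python
def prefixes_available(allocation_len: int, delegated_len: int) -> int:
    """How many delegated prefixes fit inside one allocation."""
    return 2 ** (delegated_len - allocation_len)


def allocation_needed(customers: int, delegated_len: int) -> int:
    """Longest allocation prefix length that still holds one delegation per customer."""
    bits = (customers - 1).bit_length()  # ceil(log2(customers)) without floats
    return delegated_len - bits


print(prefixes_available(48, 56))   # /56s in a single /48
print(allocation_needed(5000, 56))  # prefix length needed for 5000 x /56
```

A single /48 holds only 256 /56s, so 5000 customers need at least a /43 for the LAN-side delegations alone.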

Solutions

  • I will not cover IPv6 addressing in this guide, but you could use this
  • Ensure you request an appropriate prefix allocation based on your customer base from your Regional Internet Registry/Local Internet Registry
  • Follow the proper guidelines and BCOPs

Now I will cover a simple configuration use-case where a BNG has exactly 1000 customers. The goal here is to ensure that the WAN side of each customer gets a /64 and the LAN side gets a /56.

  • Disable redirects
    /ipv6 settings set accept-redirects=no
  • Modify the parameters for Neighbour Discovery Protocol (these values ensure quick discovery)
    • /ipv6 nd set [ find default=yes ] ra-interval=30s-1m
  • Modify the parameters for SLAAC as per this
    • /ipv6 nd prefix default set preferred-lifetime=45m valid-lifetime=1h30m
  • Next, create two separate pools, one for the WAN side and one for the LAN side of the customer
    • /ipv6 pool
      add name=Customer-CPE-LAN prefix=2405:a140:8::/46 prefix-length=56
      add name=Customer-CPE-WAN prefix=2405:a140:f:d400::/54 prefix-length=64
      • Here, prefix-length specifies what prefix length the customer gets, which in this case as per standards, we are giving the WAN side a /64 and the LAN side a /56
  • And finally assign the pools to each PPP Profile like below
    /ppp profile
    set *0 dhcpv6-pd-pool=Customer-CPE-LAN remote-ipv6-prefix-pool=Customer-CPE-WAN
    add name=profile2 dhcpv6-pd-pool=Customer-CPE-LAN remote-ipv6-prefix-pool=Customer-CPE-WAN
    • Remote IPv6 prefix is for WAN side of the customer
    • DHCPv6 PD Pool is for the LAN side of the customer
Figure-15 (PPPoE IPv6 configuration)

That’s it, now the customers will dynamically get a routed /64 and routed /56 for WAN and LAN sides respectively.
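
To confirm that the two pools above are large enough for 1000 customers, the following Python sketch counts how many prefixes each pool can hand out and lists the first few. The helper names are hypothetical, and RouterOS may assign prefixes in a different order; the sketch only shows the count and the shape of the prefixes.

```python
import ipaddress
from itertools import islice


def pool_capacity(prefix: str, prefix_length: int) -> int:
    """Number of prefixes of size prefix_length inside the pool prefix."""
    net = ipaddress.ip_network(prefix)
    return 2 ** (prefix_length - net.prefixlen)


def first_prefixes(prefix: str, prefix_length: int, count: int = 3) -> list:
    """First few prefixes of size prefix_length carved out of the pool prefix."""
    net = ipaddress.ip_network(prefix)
    return [str(p) for p in islice(net.subnets(new_prefix=prefix_length), count)]


for name, prefix, length in (
    ("Customer-CPE-LAN", "2405:a140:8::/46", 56),
    ("Customer-CPE-WAN", "2405:a140:f:d400::/54", 64),
):
    print(name, pool_capacity(prefix, length), first_prefixes(prefix, length))
```

Both pools yield 1024 prefixes, which covers the 1000 customers in this use-case with a small margin.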

If everything is followed correctly including the IPv6 firewalling below, the outcome should be a perfect score like this.

Firewall/Security

Issues

  • Blocks inbound ports based on the false logic of “protecting” the customer
    • Port blocking does nothing to improve security; it only breaks legitimate traffic such as apps or games that use various methods for VoIP
    • Malware can make use of port 443 and that is the reality of modern-day malware anyway
  • Net Neutrality Violations
    • Such as blocking TCP/UDP traffic destined towards Cloudflare or Google Anycast DNS
  • Lacks basic DDoS protection
  • Lacks simple bogon filtering
  • Lacks basic rules such as dropping invalid traffic on the input chain
  • Lacks FastTracking for traffic destined towards your NATted pools

Solutions

  • Remove most “port blocking” rules
    • Customer Site security should be handled on the customer site such as having proper basic firewalling on their Edge Routers
    • I’ve dropped some ports on RAW table directly
  • Avoid Net Neutrality Violation unless otherwise enforced by your local state or central government
  • I’ve shared the rule for FastTracking NATted pools

Below are the generic firewall rules that should be deployed on the BNG to cover basic security grounds.

IPv4 Firewall

#First we take care of address lists#
/ip firewall address-list

#Enter all local subnets/public subnets applicable to your ASN, use the full CIDR notation of the public IPv4 block assigned to you to avoid missing anything out, please avoid something like /30#

add address=example_public/24 comment="Public Pools" list=lan_subnets
add address=example_local_private/24 comment="Local interfaces" list=lan_subnets
add address=100.64.0.0/10 comment="CGNAT Pool" list=lan_subnets

#Create an address list containing all CGNAT subnets for use as dst-address-list in the drop !dst-NATted rule on the forward chain, aggregate the prefix (in this case /10)#
add address=100.64.0.0/10 comment="CGNAT subnets" list=cgnat_subnets

###Required for DDoS protection rules###
add list=ddos-attackers
add list=ddos-targets

###Bogon filtering addresses for each of the rules in RAW/Filter###
add address=0.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=172.16.0.0/12 comment=RFC6890 list=not_in_internet
add address=192.168.0.0/16 comment=RFC6890 list=not_in_internet
add address=10.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=169.254.0.0/16 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=224.0.0.0/4 comment=Multicast list=not_in_internet
add address=198.18.0.0/15 comment=RFC6890 list=not_in_internet
add address=192.0.0.0/24 comment=RFC6890 list=not_in_internet
add address=192.0.2.0/24 comment=RFC6890 list=not_in_internet
add address=198.51.100.0/24 comment=RFC6890 list=not_in_internet
add address=203.0.113.0/24 comment=RFC6890 list=not_in_internet
add address=100.64.0.0/10 comment=RFC6890 list=not_in_internet
add address=240.0.0.0/4 comment=RFC6890 list=not_in_internet
add address=192.88.99.0/24 comment="6to4 relay Anycast [RFC 3068]" list=not_in_internet
add address=255.255.255.255 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.0.0/24 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.2.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=198.51.100.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=203.0.113.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=240.0.0.0/4 comment="RAW Filtering - RFC6890 reserved" list=bad_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_src_ipv4
add address=255.255.255.255 comment="RAW Filtering - RFC6890" list=bad_src_ipv4
add address=0.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_dst_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_dst_ipv4

/ip firewall raw
add action=drop chain=prerouting comment="Drop DDoS src and dst address list" dst-address-list=ddos-targets src-address-list=ddos-attackers

add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=tcp
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=udp

#Required at least in India to reduce call spam/scam#
add action=drop chain=prerouting comment="Drop outgoing SIP to block call centre scammers" port=5060,5061 protocol=tcp
add action=drop chain=prerouting comment="Drop outgoing SIP to block call centre scammers" port=5060,5061 protocol=udp

add action=accept chain=prerouting comment="Enable this rule for transparent mode" disabled=yes

#If you are using DHCP, change this to accept#
add action=drop chain=prerouting comment="defconf: Drop DHCP discover" dst-address=255.255.255.255 dst-port=67 in-interface-list=LAN protocol=udp src-address=0.0.0.0 src-port=68

add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_src_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_dst_ipv4
add action=drop chain=prerouting comment="defconf: drop non global from WAN" in-interface-list=WAN src-address-list=not_in_internet

#Remember to properly enter all subnets in the lan_subnets list for both your ASN public IPv4 blocks and CGNAT/local subnets#
add action=drop chain=prerouting comment="defconf: drop local if not from default IP range" in-interface-list=LAN src-address-list=!lan_subnets

add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=jump chain=prerouting comment="defconf: jump to ICMP chain" jump-target=icmp protocol=icmp
add action=jump chain=prerouting comment="defconf: jump to TCP chain" jump-target=bad_tcp protocol=tcp
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=LAN
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=drop chain=prerouting comment="defconf: drop the rest"

add action=accept chain=icmp comment="defconf: echo reply" icmp-options=0:0 protocol=icmp
add action=accept chain=icmp comment="defconf: net unreachable" icmp-options=3:0 protocol=icmp
add action=accept chain=icmp comment="defconf: host unreachable" icmp-options=3:1 protocol=icmp
add action=accept chain=icmp comment="defconf: protocol unreachable" icmp-options=3:2 protocol=icmp
add action=accept chain=icmp comment="defconf: port unreachable" icmp-options=3:3 protocol=icmp
add action=accept chain=icmp comment="defconf: host unreachable fragmentation required" icmp-options=3:4 protocol=icmp
add action=accept chain=icmp comment="defconf: echo request" icmp-options=8:0 protocol=icmp
add action=accept chain=icmp comment="defconf: time exceeded " icmp-options=11:0-255 protocol=icmp
add action=accept chain=icmp comment="defconf: allow parameter bad" icmp-options=12:0 protocol=icmp
add action=drop chain=icmp comment="defconf: drop other icmp" protocol=icmp
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=!fin,!syn,!rst,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,syn
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,urg
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=syn,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=rst,urg
add action=drop chain=bad_tcp comment="defconf: TCP port 0 drop" port=0 protocol=tcp

/ip firewall filter
add action=accept chain=input comment="defconf: accept established,related,untracked" connection-state=established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=invalid
add action=accept chain=input comment="defconf: accept ICMP after RAW" protocol=icmp

#Example to allow access to router's ports from all interfaces LAN/WAN#
add action=accept chain=input comment="Accept Winbox TCP" dst-port=65000 protocol=tcp
add action=accept chain=input comment="Accept API TCP" dst-port=8728 protocol=tcp
add action=accept chain=input comment="Accept API UDP" dst-port=8728 protocol=udp
add action=accept chain=input comment="Accept SNMP for internal use" dst-port=161 protocol=udp
add action=accept chain=input comment="Accept RADIUS UDP" dst-port=1700,1812,1813 protocol=udp
add action=accept chain=input comment="Accept RADIUS TCP" dst-port=1700,1812,1813 protocol=tcp

add action=drop chain=input comment="defconf: drop all not coming from LAN's interface list/subnets" in-interface-list=!LAN

#PPPoE Clients are excluded so as not to bypass queues; if using DHCP, exclude the src and dst address lists of the customer pool#
add action=fasttrack-connection chain=forward comment="Rule for NAT Acceleration behaviour (Will reduce CPU usage for NATted traffic)" in-interface=!all-ppp out-interface=!all-ppp

add action=accept chain=forward comment="allow already established connections" connection-state=established,related,untracked

add action=jump chain=forward comment="Jump to DDoS detection" connection-state=new in-interface-list=WAN jump-target=detect-ddos
add action=return chain=detect-ddos dst-limit=50,50,src-and-dst-addresses/10s
add action=return chain=detect-ddos dst-limit=50,50,src-and-dst-addresses/10s protocol=tcp tcp-flags=syn,ack
add action=add-dst-to-address-list address-list=ddos-targets address-list-timeout=10m chain=detect-ddos
add action=add-src-to-address-list address-list=ddos-attackers address-list-timeout=10m chain=detect-ddos

add action=drop chain=forward comment="Drop tries to reach not public addresses from LAN" dst-address-list=not_in_internet in-interface-list=LAN out-interface-list=WAN

add action=drop chain=forward comment="defconf: drop all from WAN not DSTNATed" connection-nat-state=!dstnat connection-state=new dst-address-list=cgnat_subnets in-interface-list=WAN
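
The detect-ddos chain above can be pictured as a token bucket per source/destination pair. The Python sketch below is a simplified model, not the actual RouterOS implementation. It assumes `dst-limit=50,50,src-and-dst-addresses/10s` means 50 new connections per second with a burst of 50 per pair, with idle entries forgotten after 10 seconds; the class and method names are invented for this illustration. Connections within the limit correspond to the `return` rules, and anything over the limit falls through to the rules that add the addresses to ddos-targets and ddos-attackers.

```python
class PairLimiter:
    """Simplified per-(src, dst) token bucket modelled on the dst-limit rule."""

    def __init__(self, rate: float = 50.0, burst: int = 50, expire: float = 10.0):
        self.rate = rate      # tokens refilled per second
        self.burst = burst    # bucket size
        self.expire = expire  # idle seconds after which a pair is forgotten
        self._buckets = {}    # (src, dst) -> (tokens, last_seen)

    def allow(self, src: str, dst: str, now: float) -> bool:
        """True if within the limit (chain returns), False if over it (gets listed)."""
        key = (src, dst)
        tokens, last = self._buckets.get(key, (float(self.burst), now))
        if now - last > self.expire:
            tokens, last = float(self.burst), now  # entry expired, start fresh
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[key] = (tokens - 1, now)
            return True
        self._buckets[key] = (tokens, now)
        return False


limiter = PairLimiter()
results = [limiter.allow("198.51.100.7", "100.64.0.10", now=0.0) for _ in range(51)]
print(results.count(True), results.count(False))
```

In this model the first 50 new connections from one source to one destination in the same instant pass, and the 51st is the one that would land both addresses in the address lists for 10 minutes.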

IPv6 Firewall

/ipv6 firewall address-list

#Enter all the public prefixes that you've routed to this particular BNG#
#We will use this to block spoofed IPv6 coming from customers#

#example#
add address=2405:a140:8::/46 comment="CPE-LAN-Pool" list=lan_subnets
add address=2405:a140:c::/54 comment="CPE-WAN-Pool" list=lan_subnets

#Copy Paste all the following#
add address=fd12:672e:6f65:8899::/64 list=allowed
add address=fe80::/16 list=allowed
add address=ff02::/16 comment="multicast" list=allowed
add address=fe80::/10 comment="defconf: RFC6890 Linked-Scoped Unicast" list=no_forward_ipv6
add address=ff00::/8 comment="defconf: multicast" list=no_forward_ipv6
add address=::1/128 comment="defconf: lo" list=bad_ipv6
add address=fec0::/10 comment="defconf: site-local" list=bad_ipv6
add address=::ffff:0:0/96 comment="defconf: ipv4-mapped" list=bad_ipv6
add address=::/96 comment="defconf: ipv4 compat" list=bad_ipv6
add address=100::/64 comment="defconf: discard only " list=bad_ipv6
add address=2001:db8::/32 comment="defconf: documentation" list=bad_ipv6
add address=2001:10::/28 comment="defconf: ORCHID" list=bad_ipv6
add address=3ffe::/16 comment="defconf: 6bone" list=bad_ipv6
add address=2001::/23 comment="defconf: RFC6890" list=bad_ipv6
add address=100::/64 comment="RAW Filtering - RFC6890 Discard-only" list=not_global_ipv6
add address=2001::/32 comment="RAW Filtering - RFC6890 TEREDO" list=not_global_ipv6
add address=2001:2::/48 comment="RAW Filtering - RFC6890 Benchmark" list=not_global_ipv6
add address=fc00::/7 comment="RAW Filtering - RFC6890 Unique-Local" list=not_global_ipv6
add address=::/128 comment="defconf: unspecified" list=bad_dst_ipv6
add address=::/128 comment="RAW Filtering" list=bad_src_ipv6
add address=ff00::/8 comment="RAW Filtering" list=bad_src_ipv6

/ipv6 firewall raw
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=tcp
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=udp

add action=accept chain=prerouting comment="defconf: enable for transparent firewall" disabled=yes
add action=drop chain=prerouting comment="defconf: drop bogon IP's" src-address-list=bad_ipv6
add action=drop chain=prerouting comment="defconf: drop bogon IP's" dst-address-list=bad_ipv6
add action=drop chain=prerouting comment="defconf: drop packets with bad src ipv6" src-address-list=bad_src_ipv6
add action=drop chain=prerouting comment="defconf: drop packets with bad dst ipv6" dst-address-list=bad_dst_ipv6
add action=drop chain=prerouting comment="defconf: drop non global from WAN" src-address-list=not_global_ipv6 in-interface-list=WAN
add action=accept chain=prerouting comment="defconf: accept local multicast scope" dst-address=ff02::/16
add action=drop chain=prerouting comment="defconf: drop other multicast destinations" dst-address=ff00::/8
add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=jump chain=prerouting comment="defconf: jump to TCP chain" jump-target=bad_tcp protocol=tcp
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=LAN
add action=drop chain=prerouting comment="defconf: drop the rest"

add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=!fin,!syn,!rst,!ack
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,syn
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,rst
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,!ack
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,urg
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=syn,rst
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=rst,urg
add action=drop chain=bad_tcp comment="defconf: TCP port 0 drop" port=0 protocol=tcp

/ipv6 firewall filter
add action=accept chain=input comment="defconf: accept established,related,untracked" connection-state=established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=invalid
add action=accept chain=input comment="defconf: accept ICMPv6" protocol=icmpv6
add action=accept chain=input comment="defconf: accept UDP traceroute" port=33434-33534 protocol=udp
add action=accept chain=input comment="defconf: accept DHCPv6-Client prefix delegation." dst-port=546 protocol=udp src-address=fe80::/10

#Example to allow access to router's ports from all interfaces LAN/WAN#
add action=accept chain=input comment="Accept Winbox TCP" dst-port=65000 protocol=tcp
add action=accept chain=input comment="Accept API TCP" dst-port=8728 protocol=tcp
add action=accept chain=input comment="Accept API UDP" dst-port=8728 protocol=udp
add action=accept chain=input comment="Accept SNMP for internal use" dst-port=161 protocol=udp
add action=accept chain=input comment="Accept RADIUS UDP" dst-port=1700,1812,1813 protocol=udp
add action=accept chain=input comment="Accept RADIUS TCP" dst-port=1700,1812,1813 protocol=tcp

add action=accept chain=input comment="allow allowed addresses" src-address-list=allowed
add action=drop chain=input comment="defconf: drop everything else not coming from LAN" in-interface-list=!LAN

add action=accept chain=forward comment="defconf: accept established,related,untracked" connection-state=established,related,untracked
add action=drop chain=forward comment="defconf: drop bad forward IPs" src-address-list=no_forward_ipv6
add action=drop chain=forward comment="defconf: drop bad forward IPs" dst-address-list=no_forward_ipv6
add action=drop chain=forward comment="defconf: rfc4890 drop hop-limit=1" hop-limit=equal:1 protocol=icmpv6
add action=drop chain=forward comment="drop private prefixes from trying to reach customers" in-interface-list=WAN src-address-list=allowed
add action=drop chain=forward comment="Drop spoofed traffic from LAN going to WAN" in-interface-list=LAN out-interface-list=WAN src-address-list=!lan_subnets

For Edge Router

The purpose of the Edge router is to route as fast as possible. So, with that in mind, along with the basic general changes I’ve mentioned at the beginning of this article such as:

  1. No NAT
  2. No connection tracking
  3. No fancy “features” (like Hotspot, PPPoE)
  4. Use your BNG routers for any customer delegation that is required

We only need to do two things:

  1. Make use of BGP Route Filtering to discard certain things
    • Such as dropping own prefixes from the outside
    • Dropping default routes from all peers etc
  2. Use the RAW table to drop remaining bogon/garbage traffic similar to the one used on the BNG
    • CPU usage stays minimal when using the RAW table
    • Absolutely nothing on the filter table
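
As a rough sketch of point 1, assuming RouterOS v6 routing filter syntax (v7 uses a different filter language): the chain name upstream-in and the peer name upstream1 are hypothetical, and example_public/24 is the same placeholder used for your own prefix in the address lists below. Adapt the chain to each peer and test it before relying on it.

/routing filter
#Discard the default route if a peer sends it#
add action=discard chain=upstream-in comment="discard default route" prefix=0.0.0.0/0
#Discard our own prefix and anything more specific when it is learned from outside#
add action=discard chain=upstream-in comment="discard own prefix from outside" prefix=example_public/24 prefix-length=24-32

/routing bgp peer
#Attach the chain as the inbound filter of the (hypothetical) peer#
set [find name=upstream1] in-filter=upstream-in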

IPv4 Firewall

/ip firewall address-list
#Enter all local and public subnets applicable to your ASN. Use the full CIDR block of each public IPv4 allocation so nothing is missed, and avoid listing small fragments such as a /30#

add address=example_public/24 comment="LAN subnets" list=lan_subnets
add address=example_local_private/24 comment="LAN subnets" list=lan_subnets

add address=0.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=172.16.0.0/12 comment=RFC6890 list=not_in_internet
add address=192.168.0.0/16 comment=RFC6890 list=not_in_internet
add address=10.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=169.254.0.0/16 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=224.0.0.0/4 comment=Multicast list=not_in_internet
add address=198.18.0.0/15 comment=RFC6890 list=not_in_internet
add address=192.0.0.0/24 comment=RFC6890 list=not_in_internet
add address=192.0.2.0/24 comment=RFC6890 list=not_in_internet
add address=198.51.100.0/24 comment=RFC6890 list=not_in_internet
add address=203.0.113.0/24 comment=RFC6890 list=not_in_internet
add address=100.64.0.0/10 comment=RFC6890 list=not_in_internet
add address=240.0.0.0/4 comment=RFC6890 list=not_in_internet
add address=192.88.99.0/24 comment="6to4 relay Anycast [RFC 3068]" list=not_in_internet
add address=255.255.255.255 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.0.0/24 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.2.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=198.51.100.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=203.0.113.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=240.0.0.0/4 comment="RAW Filtering - RFC6890 reserved" list=bad_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_src_ipv4
add address=255.255.255.255 comment="RAW Filtering - RFC6890" list=bad_src_ipv4
add address=0.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_dst_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_dst_ipv4

/ip firewall raw
add action=accept chain=prerouting comment="Enable this rule for transparent mode" disabled=yes

#If you are using DHCP, change this to accept#
add action=drop chain=prerouting comment="defconf: Drop DHCP discover on LAN" dst-address=255.255.255.255 dst-port=67 in-interface-list=LAN protocol=udp src-address=0.0.0.0 src-port=68

add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_src_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_dst_ipv4
add action=drop chain=prerouting comment="defconf: drop non global from WAN" in-interface-list=WAN src-address-list=not_in_internet
add action=drop chain=prerouting comment="defconf: drop local if not from default IP range" in-interface-list=LAN src-address-list=!lan_subnets
add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=jump chain=prerouting comment="defconf: jump to ICMP chain" jump-target=icmp protocol=icmp
add action=jump chain=prerouting comment="defconf: jump to TCP chain" jump-target=bad_tcp protocol=tcp
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=LAN
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=drop chain=prerouting comment="defconf: drop the rest"

add action=accept chain=icmp comment="defconf: echo reply" icmp-options=0:0 protocol=icmp
add action=accept chain=icmp comment="defconf: net unreachable" icmp-options=3:0 protocol=icmp
add action=accept chain=icmp comment="defconf: host unreachable" icmp-options=3:1 protocol=icmp
add action=accept chain=icmp comment="defconf: protocol unreachable" icmp-options=3:2 protocol=icmp
add action=accept chain=icmp comment="defconf: port unreachable" icmp-options=3:3 protocol=icmp
add action=accept chain=icmp comment="defconf: host unreachable fragmentation required" icmp-options=3:4 protocol=icmp
add action=accept chain=icmp comment="defconf: echo request" icmp-options=8:0 protocol=icmp
add action=accept chain=icmp comment="defconf: time exceeded" icmp-options=11:0-255 protocol=icmp
add action=accept chain=icmp comment="defconf: allow parameter bad" icmp-options=12:0 protocol=icmp
add action=drop chain=icmp comment="defconf: drop other icmp" protocol=icmp
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=!fin,!syn,!rst,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,syn
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,urg
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=syn,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=rst,urg
add action=drop chain=bad_tcp comment="defconf: TCP port 0 drop" port=0 protocol=tcp

IPv6 Firewall

/ipv6 firewall address-list

#Enter the aggregated prefixes assigned to your ASN that you use for LAN#

#example#
add address=2405:a140::/32 comment="ASN Prefix" list=lan_subnets

#Copy Paste all the following#
add address=fd12:672e:6f65:8899::/64 list=allowed
add address=fe80::/16 list=allowed
add address=ff02::/16 comment=multicast list=allowed
add address=fe80::/10 comment="defconf: RFC6890 Linked-Scoped Unicast" list=no_forward_ipv6
add address=ff00::/8 comment="defconf: multicast" list=no_forward_ipv6
add address=::1/128 comment="defconf: lo" list=bad_ipv6
add address=fec0::/10 comment="defconf: site-local" list=bad_ipv6
add address=::ffff:0:0/96 comment="defconf: ipv4-mapped" list=bad_ipv6
add address=::/96 comment="defconf: ipv4 compat" list=bad_ipv6
add address=100::/64 comment="defconf: discard only" list=bad_ipv6
add address=2001:db8::/32 comment="defconf: documentation" list=bad_ipv6
add address=2001:10::/28 comment="defconf: ORCHID" list=bad_ipv6
add address=3ffe::/16 comment="defconf: 6bone" list=bad_ipv6
add address=2001::/23 comment="defconf: RFC6890" list=bad_ipv6
add address=100::/64 comment="RAW Filtering - RFC6890 Discard-only" list=not_global_ipv6
add address=2001::/32 comment="RAW Filtering - RFC6890 TEREDO" list=not_global_ipv6
add address=2001:2::/48 comment="RAW Filtering - RFC6890 Benchmark" list=not_global_ipv6
add address=fc00::/7 comment="RAW Filtering - RFC6890 Unique-Local" list=not_global_ipv6
add address=::/128 comment="defconf: unspecified" list=bad_dst_ipv6
add address=::/128 comment="RAW Filtering" list=bad_src_ipv6
add address=ff00::/8 comment="RAW Filtering" list=bad_src_ipv6

/ipv6 firewall raw
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=tcp
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=udp

add action=accept chain=prerouting comment="defconf: enable for transparent firewall" disabled=yes
add action=drop chain=prerouting comment="defconf: drop bogon IP's" src-address-list=bad_ipv6
add action=drop chain=prerouting comment="defconf: drop bogon IP's" dst-address-list=bad_ipv6
add action=drop chain=prerouting comment="defconf: drop packets with bad src ipv6" src-address-list=bad_src_ipv6
add action=drop chain=prerouting comment="defconf: drop packets with bad dst ipv6" dst-address-list=bad_dst_ipv6
add action=drop chain=prerouting comment="defconf: drop non global from WAN" src-address-list=not_global_ipv6 in-interface-list=WAN
add action=accept chain=prerouting comment="defconf: accept local multicast scope" dst-address=ff02::/16
add action=drop chain=prerouting comment="defconf: drop other multicast destinations" dst-address=ff00::/8
add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=jump chain=prerouting comment="defconf: jump to TCP chain" jump-target=bad_tcp protocol=tcp
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=LAN
add action=drop chain=prerouting comment="defconf: drop the rest"

add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=!fin,!syn,!rst,!ack
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,syn
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,rst
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,!ack
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=fin,urg
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=syn,rst
add action=drop chain=bad_tcp comment=defconf protocol=tcp tcp-flags=rst,urg
add action=drop chain=bad_tcp comment="defconf: TCP port 0 drop" port=0 protocol=tcp

Firewall Explanation

I will keep this concise. As stated earlier, I suggest you study how iptables works in general and study the packet flow to understand which rule does what. With that being said, I will break it down into simpler points:

  • I used this and this as source for building the base for the firewall
    • MikroTik has made sure these rules conform to various RFCs and has taken care not to break any legitimate protocol/traffic
  • IPv6 firewall rules are trickier and more complex, but rest assured that the rules in this article do not break any protocol/standard, nor do they impact customers’ end-to-end reachability
  • We are dropping spoofed traffic
    • The RAW rules drop anything coming from WAN that’s spoofed (RFC 6890 addresses)
    • The RAW rules drop anything coming from LAN that does not match your public prefixes/internal subnets (aka the lan_subnets address list), so spoofed traffic is dropped before it can exit your network
    • Here’s an APNIC blog post detailing more on this subject
  • Next, we are dropping bad traffic such as TCP/UDP port 0 or bad TCP flags
  • The filter rules are pretty self-explanatory
  • We aren’t using filter rules on the Edge Router because some of them require connection tracking, which should be disabled on an edge router in the first place
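
To check that the RAW rules are the ones doing the work, the per-rule counters can be inspected. A minimal sketch, assuming RouterOS v6 console commands; nothing here changes the configuration.

#Packet and byte counters for each RAW rule#
/ip firewall raw print stats
/ipv6 firewall raw print stats
#On the Edge Router, confirm that connection tracking is disabled#
/ip firewall connection tracking print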

Strange Anomalies

These are some strange behaviours that I could not explain. If you have further information, please reach out to me.

  1. NAT Leak
    • For example, let’s say we CGNAT 100.64.0.0/24 for customers behind 1.0.0.0/26. Common sense says anything WAN-bound should carry a source IP from the /26 once it has passed through the NAT. That isn’t always the case. What I have observed is that sometimes (which means all the time once you have thousands of customers) a packet leaves with a source IP from the CGNAT subnet and a public destination IP, so it “escapes” the NAT engine
    • This behaviour is NOT exclusive to MikroTik. I have observed the same thing on Ubuntu 20.04 and other Debian-based distros, where packets with a source IP from the NAT subnet escape out of the WAN interface towards real, live public destination IPs
      • Solution: use the Edge Router to drop anything coming from the BNG that does not have a public source IP. My configuration above already takes care of this; you just need to follow the instructions
    • I have been unable to find documentation or bug reports on this behaviour
  2. Netmap vs Src Nat
    • Publicly available documentation suggests simple definitions for both
      Src NAT = 1:Many binding
      Netmap = 1:1 binding
      • But for whatever reason, when src NAT is the action for a public prefix, the “NATted” public IP keeps changing, and with it the customer’s source IP on the WAN. This breaks traffic or triggers DDoS protection on Cloudflare-protected sites and the like
      • And for whatever reason, even though Netmap is meant for 1:1, it works for 1:Many bindings and it does not result in the constant changing of source IP for the customers
    • I have not found any technical information on why these behaviours occur or why netmap even works in the first place for 1:Many bindings
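
If you want to see the NAT leak for yourself before dropping it, a log rule can be added to the Edge Router’s RAW table. This is only a sketch: it assumes the BNG-facing interfaces are in the LAN interface list as in the configuration above, 100.64.0.0/10 stands in for whatever CGNAT range you use, and the comment and log prefix are arbitrary. The rule only matches if it sits above the drop rules, so move it to the top after adding it, and disable it once you have seen enough, since logging on a busy router is not free.

/ip firewall raw
#Log packets arriving from the BNG that still carry a CGNAT source address#
add action=log chain=prerouting comment="log NAT leak from BNG" in-interface-list=LAN log-prefix="nat-leak" src-address=100.64.0.0/10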

11 Comments

  1. Rupam Kumar Sharma

    Such a detailed treatment of issues that are so important during implementation…

    Thanks

  2. Hi Daryll, well done! This was the key to fixing a problem I had been trying to solve for a customer (an ISP).
    I would like you to read about my problem: https://forum.mikrotik.com/viewtopic.php?f=2&t=176378
    I repeat, that problem is fixed now, thanks to you!

    I used the following command from your article (with a few modifications):
    /ip firewall nat
    add action=netmap chain=srcnat comment="NETMAP PPPoE" out-interface=sfp1-Internet src-address-list=Clientes_NAT to-addresses=PUBLIC/32

    I don’t understand the difference between using “srcnat action masquerade” (which wasn’t working) and using “Netmap” (which worked perfectly from the moment I applied it). I want to learn and understand why this approach works.

    Thanks a lot.
    Regards from Argentina

    • I hope this helps you.

      Just note that you are missing parameters in your rule, re-check from my article again.

    • For the NAT Leak issue, it mostly looks like speculation on that thread in my opinion. No factual/documented information yet, but I’ll keep an eye on it.

  3. Stefan Müller

    Unfortunately, that is the case. When the conclusions are a bit more solid I will email support anyway; maybe they can resolve the mystery

  4. Good morning,
    could you share what was updated?
    thx 🙂

    • Well, it depends on when you last visited the site? 🙂

      Added:
      QoS/Bandwidth management suggestion, IPv6 for BNGs with PPPoE, IPv6 tweaks, IPv6 firewalling for both Edge and BNGs, slightly tweaked the IPv4 firewall rules for both, MTU section is finalised, CGNAT section is finalised. That’s about it I think.

  5. That is true :), it was the end of July.
    I received a notification from WordPress yesterday that a new post was added.
    As there wasn’t one, I guessed it was due to the update of the blog.
    I don’t know whether notifications are also sent when the blog is updated
