
Edge Router & BNG Optimisation Guide for ISPs

Last updated on 17 May 2022

Introduction

This guide is based on MikroTik RouterOS syntax, but it should not be too hard to replicate the same config on other platforms. The content of this post/document will likely be updated regularly for the foreseeable future as I come across various use-cases, new technology, more efficient and simpler configurations etc.

Generally speaking, a lot of ISPs in APAC use MikroTik RouterOS to provide access to their customers via PPPoE (please get on board with DHCP!) via BNGs and some ISPs use MikroTik for their edge/core routers as well. We will walk through some of the issues I have come across and the solutions for them.

This article was also published in the APNIC Blog, but it is not frequently updated there and hence I would advise folks to stick with the source here.

How often this content is updated is proportional to the number of ASes that are willing to cooperate with me and play around with the config, such as IPv6/BGP optimisation etc. At the moment this is strictly IPv4. I do have the IPv6 config ready on paper, but I need to test it out on a live network. If you’re an AS operator and you’re interested in working something out, you can always reach out to me. I now have my own AS149794 and directly conduct research and experiments on it.

  • The configuration was originally tested and deployed on AS135756 with my good friend Mr Varun Singhania (proprietor of the AS)
  • Further configuration was extensively tested on AS132559 as I was a downstream customer and was able to test the impacts/config changes directly as both an end-user and also as a consultant for them as of April 2022
  • As of 2022, I further solidified the configuration by testing it on my own network, especially the firewall rules, which should now cover all bases and work in any environment as long as you follow the comments and step-by-step instructions. I can confirm that they don’t break layer 4 protocols or create problems for the end-user in the last mile.

Keep in mind that MikroTik uses RouterOS, which is based on the Linux kernel. However, RouterOS v6 stable/LTS runs on an ancient version of the Linux kernel and hence uses the legacy iptables for packet filtering. I had hoped they would implement newer frameworks in RouterOS v7, but MikroTik is still using iptables on v7; the good news is that my configuration in this post is 100% compatible with v7. So, if you want to thoroughly understand the logic flow behind these suggestions/rules, I’d suggest going through the Linux kernel documentation on the web.

I will assume the reader has some basic knowledge of the terminologies and technologies/protocols used in typical BNG/CGNAT configuration and hence I will not elaborate on the definition and on what a protocol or technology does or does not do. This guide is meant for engineers/ISPs and not home users.

Also note that this is not a guide on how to architect your network; Kevin Myers from IPArchiTechs has done a great job of that here. This guide focuses more on layer 2/3/4 (and up to layer 7 in some cases) configuration aspects and on following RFC-compliant behaviour and BCOPs.

Basic terminology and overview of router types

  • An edge or border router is an eBGP router used for transit, peering, IXP, PNIs
    • It should always be stateless in nature (no firewall filter rules, NAT)
      • If you make it stateful, it will die on a DDoS attack and will also have slow routing performance
    • Never use this router for customer delegation as then it becomes stateful
  • A core router is usually not present in a modern network that follows the collapsed core topology
  • BNGs are the routers one should use for customer delegation (PPPoE, DHCP, CGNAT etc) and are stateful in nature. Some folks will call this a BRAS or NAS as well; all three terms, in my opinion, are just synonyms and are correct.
    • Can also be called access layer routers

General Configuration Changes

I was surprised to find a few networks (ideally it should be zero) not implementing basic security measures on their routers: still using Telnet in this day and age, exposing Neighbour Discovery, running outdated RouterOS and outdated firmware etc.
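As a rough sketch of the two items called out above, the following disables Telnet and stops the router from announcing itself via neighbour discovery. Using `none` as the discovery interface list is an assumption that suits an edge router; on a BNG you may prefer a management-only interface list instead.

```
# Disable the Telnet service (manage the router over SSH instead)
/ip service disable telnet
# Stop neighbour discovery announcements on all interfaces
/ip neighbor discovery-settings set discover-interface-list=none
```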

  • Upgrade RouterOS to the latest long-term release
    • Upgrade the firmware as well after the above, always ensure RouterOS and firmware are running on matching versions to avoid bugs and instability issues
      • /system routerboard settings set auto-upgrade=yes
      • Remember to reboot two times for the firmware update to take effect!
  • MikroTik already has a guide for basic security measures here, I strongly recommend the reader makes use of the measures
  • Make use of Reverse Path Filtering (Slightly different from Unicast Reverse Path Forwarding)
    • It is found in IP>Settings>rp-filter
      • Always use loose mode on the edge router
        • The same on wherever asymmetric or policy routing takes place
      • Use strict mode on BNG and/or wherever symmetric routing takes place
  • Remember to make use of interface lists inside Interfaces>Interface List on all routers
    • WAN for all public interface/internet bound
    • LAN for all local interfaces/customer-facing side
      • Remember to include dynamic interfaces for LAN on BNG to account for all PPPoE users
      • Also include Layer 3 subinterfaces such as VLANs, Bonding interfaces etc
Figure-1 (LAN Include Dynamic)
  • Connection Tracking
    • Disable it on the Edge Router
      • /ip firewall connection tracking set enabled=no
    • Enable loose TCP tracking on all routers including Edge
      • /ip firewall connection tracking set loose-tcp-tracking=yes
    • Use the following connection tracking timeout values on all routers
      • The reason behind this: We saw real-time improvements to stability and performance, especially for UDP traffic such as VoIP, VoWiFi, gaming, P2P UDP NAT punching etc
      • Upgrade the RAM if you can’t accommodate these values!
Figure-2 (Recommended Connection Tracking Timeout Values)
  • My personal favourites
    • Give the router some accurate system clock
      • /system ntp client set enabled=yes server-dns-names=time.cloudflare.com
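The rp-filter and interface list items above translate to the CLI roughly as follows. This is a minimal sketch: the interface names (ether1, vlan20) are placeholders and not part of the original configuration, and only one of the two rp-filter lines applies to any given router.

```
# Edge router (or anywhere with asymmetric/policy routing): loose mode
/ip settings set rp-filter=loose
# BNG (or anywhere with symmetric routing): strict mode
/ip settings set rp-filter=strict

# Interface lists; include=dynamic pulls dynamic interfaces (PPPoE sessions) into LAN
/interface list
add name=WAN
add name=LAN include=dynamic
/interface list member
add list=WAN interface=ether1
add list=LAN interface=vlan20
```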

MTU

I had to create this section as I have seen way too many ASes with horrible MTU config.

First, you need to fix MTU across the entire path of your network devices before deploying RFC4638/TCP MSS Clamping algorithm, and of course to allow PMTUD to work correctly. Otherwise, it would either fail or break traffic.

Issues

  • Wrong MTU configuration across the entire path of network devices
    • Switches, routers, hypervisors etc
  • Breaks PMTUD and creates a bunch of problems right at Layer 2/3 on the local network
    • Internet-bound traffic would likely fail

Solutions

  • Layer 2 MTU
    • Set it to the maximum supported value on all ethernet interfaces
      • Such as routers, switches, hypervisors, virtualised instances etc
        • Example: Edge (L2 MTU 10k)>BNG (L2 MTU 10k)>Switch (L2 MTU 10k)>Wireless AP (L2 MTU 2k)>Customer
        • Example: Edge (L2 MTU 10k)>BNG (L2 MTU 10k)>Switch (L2 MTU 10k)>OLT (L2 MTU 10k)>Customer
    • It can vary from vendor to vendor and model to model, but that’s okay
    • By using the maximum supported L2 MTU everywhere, we are pumping jumbo frames and eliminating any chances of fragmentation or performance issues down the road for encapsulation
  • Layer 3 MTU
    • Set it to 1600 on all LAN (local) interfaces
      • This applies to L3 VLAN MTU, Bonding interfaces etc
        • If using Stacked VLANs, both S and C VLANs should have equal L3 MTU of 1600
        • Or any other encapsulation protocol that allows the use of L3 MTU on top of the Layer 2 interface such as VPLS
      • To allow for proper utilisation of encapsulation protocols’ overhead and allow RFC4638
    • Set it to 1500 on all WAN (public) interfaces
      • Obviously, we cannot pump jumbo frames to the internet
Figure-3 [Ethernet MTU (Jumbo Frames on L2, L3 = 1500 for WAN and 1600 for LAN)]
Figure-4 [L3 MTU for Bonding interfaces (L3 = 1500 for WAN and 1600 for LAN)]
Figure-5 (VLAN L3 MTU = 1600)
Figure-6 [QinQ (Stacked VLANs) L3 MTU = 1600 on both]

However, keep in mind, that some vendors just have “MTU” per interface and this is Layer 3/IP MTU, where the actual Layer 2 MTU would most probably be auto-adjusted by the firmware to accommodate whatever value you set on the L3 MTU. Regardless, whatever is shown above on RouterOS is applicable to any other vendors as well.
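On RouterOS, the L2/L3 MTU values described in this section could be set along these lines. This is a sketch under assumptions: ether1 (WAN), ether2 (LAN trunk), vlan20 and bond1 are placeholder names, and l2mtu=9000 is only an example value, since the maximum supported L2 MTU depends on the hardware.

```
# Placeholders: ether1 = WAN port, ether2 = LAN trunk,
# vlan20 = customer VLAN on ether2, bond1 = LAN-side bonding interface.
# l2mtu=9000 is an example; use the maximum your device supports.
/interface ethernet
set [find name=ether1] l2mtu=9000 mtu=1500
set [find name=ether2] l2mtu=9000 mtu=1600
/interface vlan
set [find name=vlan20] mtu=1600
/interface bonding
set [find name=bond1] mtu=1600
```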

IPv4

I have noticed a lot of operators talking about how short they are on IPv4 addresses – Yet for unknown reasons they like to waste 2 extra addresses for every PTP or inter-router link by using a /30. Please, stop doing that and start using /31s for PTP links as per RFC3021.

Example below:
Prefix: 1.0.0.0/31

#RouterOS v6 has a bug with the CIDR notation for IPv4 /31s, we need to manually specify the network and interface address without CIDR notation#
#Router A#
/ip address
add address=1.0.0.0 interface=ether1 network=1.0.0.1 comment="/31 Example"
#Router B#
/ip address
add address=1.0.0.1 interface=ether1 network=1.0.0.0 comment="/31 Example"
#In RouterOS v7, it is as you'd expect, just use CIDR notation#
#Router A#
/ip address
add address=1.0.0.0/31 interface=ether1 comment="/31 Example"
#Router B#
/ip address
add address=1.0.0.1/31 interface=ether1 comment="/31 Example"

IPv6

As per RFC6164, it is advised to use /127s on PTP links to avoid various forms of network attacks described in the RFC.

However, for ease of management and subnetting, I would advise not to subnet longer (smaller) than a /64. Say you have a /48 that you’d like to use for backbone/core/PTP, subnet it directly to /64s out of which, you can use a /127 from each /64 per PTP link i.e. a /64 is reserved for only one PTP link – This ensures there’s room for growth in the future in case your link/network grows and a /127 is no longer sufficient.

Note that on MikroTik, /127s do not work with BGP for unknown reasons and hence the longest prefix size we can use would be a /126.

Example below:
Prefix: 2400:7060::/126

#Advertise=no because we aren't using SLAAC#
/ipv6 address
add address=2400:7060::1/126 advertise=no comment="Peering with Transit" interface=ether1

However, if you look closely, you might’ve noticed that I avoided using the all-zeroes interface ID “2400:7060::/126” and instead used “2400:7060::1/126”. The reason for this is that on some routers, using the “::” (all-zeroes) interface ID as the address on a link could cause strange behaviours.
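To illustrate the one-/64-per-link reservation described above, here is a minimal sketch covering both ends of one link and the start of a second link. It assumes, purely for illustration, that 2400:7060::/48 is the backbone block being carved into /64s; the second /64, the interface names and the comments are hypothetical.

```
# Link 1: reserved /64 is 2400:7060:0:0::/64, addressed from 2400:7060::/126
# Router A
/ipv6 address
add address=2400:7060::1/126 advertise=no interface=ether1 comment="PTP link 1"
# Router B
/ipv6 address
add address=2400:7060::2/126 advertise=no interface=ether1 comment="PTP link 1"

# Link 2: reserved /64 is 2400:7060:0:1::/64 (hypothetical second link)
# Router A
/ipv6 address
add address=2400:7060:0:1::1/126 advertise=no interface=ether2 comment="PTP link 2"
```

Keeping each link inside its own /64 means the link can later be renumbered to a shorter prefix without touching the neighbouring links.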

Routing loops with RFC6890 space

In most of the networks I have observed, including my own personal home lab (AS149794), I find a lot of traffic where source IP = my end hosts or CPE WAN IP (either a CGNAT IP or a public IP), but destination IP = unused RFC6890 blocks. This is why I (and MikroTik themselves) created a forward rule to drop RFC6890 traffic from escaping to WAN.

Now let us step back and think about this: the majority of ISPs do not implement these filter rules, which means that traffic from customers where dst-IP=RFC6890 is forwarded from their CPE to the BNGs, and from there the underlying L3/L2 paths carry it all the way to the edge router, where it goes on towards your transit or peers if there is a default route. If there is no default route or more specific route for any given dst-IP matching RFC6890 blocks, it would simply loop back and forth until the TTL expires, which means wasted resources, CPU and bandwidth when your network is at scale and you have thousands of customers. So in order to solve this with a quick fix, I derived a simple yet effective solution: route RFC6890 blocks to blackhole.

We route all RFC6890 space to blackhole directly on the edge routers for, well, edge cases, but we will also do the same on the BNGs directly.

It will not impact your use of the private space for any given interface/servers etc – Because remember, more specific prefixes always win and hence your private /24s etc will always be preferred over the less specific /10 for example and hence will be accessible. Someone on the MikroTik forum has discussed this a bit, in the past.

IPv4

#RouterOS v7#
#Copy and paste these on both Edge and BNG routers#
/ip route
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=0.0.0.0/8
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=172.16.0.0/12
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=192.168.0.0/16
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=10.0.0.0/8
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=169.254.0.0/16
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=127.0.0.0/8
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=224.0.0.0/4
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=198.18.0.0/15
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=192.0.0.0/24
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=192.0.2.0/24
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=198.51.100.0/24
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=203.0.113.0/24
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=100.64.0.0/10
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=240.0.0.0/4
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=192.88.99.0/24
add blackhole comment="Blackhole route for RFC6890 (limited broadcast)" disabled=no dst-address=255.255.255.255/32
#RouterOS v6#
#Copy and paste these on both Edge and BNG routers#
/ip route
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=0.0.0.0/8
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=172.16.0.0/12
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=192.168.0.0/16
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=10.0.0.0/8
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=169.254.0.0/16
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=127.0.0.0/8
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=224.0.0.0/4
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=198.18.0.0/15
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=192.0.0.0/24
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=192.0.2.0/24
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=198.51.100.0/24
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=203.0.113.0/24
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=100.64.0.0/10
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=240.0.0.0/4
add type=blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=192.88.99.0/24
add type=blackhole comment="Blackhole route for RFC6890 (limited broadcast)" disabled=no dst-address=255.255.255.255/32

IPv6

#RouterOS v7#
#Copy and paste these on both Edge and BNG routers#
/ipv6 route
add blackhole comment="Blackhole route for RFC6890" disabled=no dst-address=::1/128
add blackhole comment="Blackhole route for RFC6890" disabled=no dst-address=::/128
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=64:ff9b::/96
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=::ffff:0:0/96
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=100::/64
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=2001::/23
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=2001::/32
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=2001:2::/48
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=2001:db8::/32
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=2001:10::/28
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=2002::/16
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=fc00::/7
add blackhole comment="Blackhole route for RFC6890 (aggregated)" disabled=no dst-address=fe80::/10
#In RouterOS v6, IPv6 blackhole is not supported#

For BNG

QoS/Queuing Mechanism

There have been decades-long debates on which algorithm to use, and which method to implement the best possible QoS mechanism.

In my testing, I observed the following:

  • Capping on a per-customer basis using a single simple queue worked best
  • As for the algorithm of choice
    • I pick SFQ due to the observed low jitter/bufferbloat phenomenon on the customer side
  • Bufferbloat tested using this tool
    • Keep in mind, high bufferbloat = bad, low bufferbloat = good

I have not included a screenshot for every algorithm as that’s unnecessary, but the test scenario was simple, SFQ compared to the rest of the algorithms, and the result was SFQ gave the best possible bufferbloat score in my testing.

Figure-7 (Simple Queue + PFIFO resulted in high bufferbloat)
Figure-8 (Simple Queue + SFQ resulted in low bufferbloat)
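A single per-customer simple queue using SFQ, as described above, might look like the following. This is a sketch only: the queue type name, the customer address and the 50M/50M rate are made-up examples, and SFQ is left at its default parameters.

```
# Queue type using SFQ with default parameters (name is a placeholder)
/queue type
add name=sfq-customer kind=sfq
# One simple queue per customer; target and rate are examples only
# max-limit and queue are given as upload/download
/queue simple
add name=customer-example target=100.64.0.10/32 max-limit=50M/50M queue=sfq-customer/sfq-customer
```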

PPPoE

Issues

  • Packet fragmentation due to a non-standard MTU/MRU (anything other than 1500)
    • Typically, ISPs use 1492 or 1480 or some other strange MTU size
    • Both BNG device and customer router need to make use of hacks like TCP MSS Clamping to work around this
    • PMTUD is simply unreliable as per RFC 8900
      • Gets worse with CGNAT because remote end-points cannot determine the MTU of your PPPoE customer behind it
  • Lack of proper routing for PPPoE Clients (Interfaces or Inter-VLANs)
    • Most assume that using a single profile for different PPPoE Servers running on different interfaces will work fine

Solutions

  • Deploy RFC 4638
    • Keep in mind that in a network, MTU affects the whole path of L2/L3 devices whether physical or virtual, as long as you follow the MTU section above, you should be good
    • Simply set MTU and MRU to 1500 inside PPPoE Server on the BNG
Figure-9 (PPPoE Server MTU/MRU & TCP MSS Clamping config)
  • Disable (and delete!) TCP MSS Clamping rules inside IP>Firewall>Mangle
    • Why set some arbitrary value when you can let the engine determine automatically to ensure optimal performance?
      • MikroTik has long since allowed automatic TCP MSS Clamping
      • Make use of PPP>Profile>Default* to enable TCP MSS Clamping directly on the PPPoE engine. This will do the work for any customer whose MTU/MRU is less than 1500.
    • On the customer side, not all routers can take advantage of RFC4638, such as TP-Link, Tenda etc. For them, MTU will remain capped at 1492.
      • The 1492 limitation on their end won’t cause issues with packet fragmentation: packets fragment at the source (their routers) before they exit the interface and hit the BNG, and TCP MSS Clamping on the PPPoE engine takes care of anything coming in from the outside world toward the customer
      • I have observed 1500 MRU when pinging from the outside world, suggesting that some of these consumer routers support 1500 MRU
      • If they are using MikroTik, pfSense, VyOS etc, they can take advantage of RFC4638 aka 1500 MTU/MRU for their PPPoE Client
      • Some ONT/ONU devices have strange behaviour for MTU negotiation where they simply do not allow RFC4638 to work (even in bridge mode), only a few brands like GX, TP-Link, and Huawei have been found to be flawless in my personal testing.
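For a customer running MikroTik as their CPE, a PPPoE client asking for 1500 MTU/MRU could be sketched as below. The interface name, username and password are placeholders. It assumes the Ethernet path between the CPE and the BNG can carry the extra 8 bytes of PPPoE overhead, which is what the MTU section earlier in this guide is meant to guarantee.

```
# Customer-side sketch; ether1, user and password are placeholders
/interface pppoe-client
add name=pppoe-out1 interface=ether1 user=customer1 password=changeme max-mtu=1500 max-mru=1500 add-default-route=yes disabled=no
```

If the negotiated MTU still comes up as 1492, something along the path (often the ONT/ONU, as noted above) is not passing the larger frames.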

Extra Note on PPPoE

  • Create a single CGNAT pool on a per BNG basis and you can use it for n number of PPPoE Servers on n number of interfaces
    /ip pool
    add name=CGNAT_Pool comment="100.64.0.0-9 is reserved for each PPPoE Server Gateway/Profile" ranges=100.64.0.10-100.127.255.255
    • Here we are reserving 100.64.0.0-9 for gateway IPs on a per-interface/PPPoE server basis, assuming we only have 10 VLANs/Interfaces
      • Reserve as per your local requirements
  • Local Address in PPP Profile = Gateway IP address
    • One common mistake is using the router’s public IP from the WAN interface as the local address, which I’ve seen lead to issues like traceroute failures or some strange packet loss. You should be using an address that does not exist in IP>Address
    • Each PPPoE Server needs a unique profile/gateway in order to allow inter-VLAN communication between CPEs (which is needed to allow two customers behind a NATted IP, on different VLANs, to play a P2P Xbox game with each other). It also ensures a clean network approach
      • If you have 100 PPPoE Servers, there should be 100 unique PPP Profiles with unique local addresses for each
    • Something like this for two servers:
/ppp profile
add change-tcp-mss=yes local-address=100.64.0.1 name=profile1 remote-address=CGNAT_Pool use-upnp=no
add change-tcp-mss=yes local-address=100.64.0.2 name=profile2 remote-address=CGNAT_Pool use-upnp=no
/interface pppoe-server server
add authentication=pap default-profile=profile1 interface=vlan20 keepalive-timeout=disabled max-mru=1500 max-mtu=1500 one-session-per-host=yes service-name=server1
add authentication=pap default-profile=profile2 disabled=no interface=vlan21 keepalive-timeout=disabled max-mru=1500 max-mtu=1500 one-session-per-host=yes service-name=server2

CGNAT

Issues

  • The majority of ISPs are using RFC1918 subnets for CGNAT, which can clash with subnets on the customer site
  • Breaks P2P traffic
  • Kills the end-to-end principle
  • Requires proper NAT traversal for various protocols including IPsec
  • Routing Loops will occur for any traffic coming from the outside destined towards the public IP pools

Solutions

  • Make use of the 100.64.0.0/10 subnet as it’s meant for CGNAT usage to prevent clashing on the customer site
  • Enable the NAT traversal Helpers on the Router like the following inside IP>Firewall>Service Ports
Figure-10 (NAT Traversal Helpers on RouterOS)
  • Use a simple netmap rule with IPsec passthrough (will allow customers to initiate IPsec outbound without issues) configured.
  • Use a single NAT rule for all CGNAT customers on a per BNG basis to reduce CPU usage.
    • /ip firewall nat
      add action=netmap chain=srcnat comment="CGNAT rule" dst-address-list=!not_in_internet ipsec-policy=out,none out-interface-list=WAN src-address-list=cgnat_subnets to-addresses=public/25
      • Here cgnat_subnets=address list containing CGNAT subnets aka 100.64.0.0/10
      • dst-address-list=!not_in_internet is self-explanatory, anything destined towards private subnets shouldn’t be NATted towards WAN
        • Customers should be able to talk to each other using their CGNAT IP, Xbox makes use of this and is mentioned in RFC 7021. This is equivalent (sort of) to old school days of everyone having a public IP and hence is reachable
    • Enable port forwarding for entire ranges (netmap algorithm + state tracking will handle what gets mapped where)
      • /ip firewall nat
        add action=netmap chain=dstnat comment="Port Forwarding Solution for CGNAT (TCP)" dst-address=public/25 dst-port=1024-65535 protocol=tcp to-addresses=100.64.0.0/10
        add action=netmap chain=dstnat comment="Port Forwarding Solution for CGNAT (UDP)" dst-address=public/25 dst-port=1024-65535 protocol=udp to-addresses=100.64.0.0/10

Below is what MikroTik support had to say about my port forwarding rules

Figure-11 (MikroTik support suggests my port forwarding rules are correct)
  • Avoid Deterministic NAT, the above configuration allows P2P traffic initiated from the inside to be reachable from the outside with various applications that make use of ephemeral ports/UDP NAT punching/STUN etc
  • We were able to successfully seed the official Ubuntu Torrent behind the CGNAT with the above configuration, which can mean only one thing: P2P networking from in-bound established works!
Figure-12 (BitTorrent Seeding Behind CGNAT)
  • We tried with src nat as action for src NAT chain but it resulted in the NATted public IP constantly changing on the customer side and breaking things

Below is what MikroTik support had to say about netmap vs src nat as action for src nat chain

Figure-13 (Src nat = breaks P2P traffic | Netmap = static mapping per client IP)
  • Now we fix routing loops
    • We will use DST NAT to account for remaining traffic such as ICMP and NAT it to a loopback interface
      • Remember to add the bridge to LAN interface list & add the /31 to lan_subnets address list as well
/interface bridge
add arp=disabled comment="For Static Loop Protection" mtu=1500 name=loopback_1 protocol-mode=none
/ip address
add address=192.168.0.1/31 comment="For Static Loop Protection" interface=loopback_1 network=192.168.0.0
/ip firewall nat
add action=dst-nat chain=dstnat comment="Static Loop Protection" dst-address=public/25 to-addresses=192.168.0.1
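The CGNAT rules above rely on a few supporting entries that are described in the bullets but not spelled out: the cgnat_subnets address list, the loopback bridge in the LAN interface list, and the loopback /31 in lan_subnets. A minimal sketch of those entries, assuming the list names used elsewhere in this guide (the comments are illustrative):

```
/ip firewall address-list
# Source list matched by the CGNAT netmap rule
add list=cgnat_subnets address=100.64.0.0/10 comment="CGNAT subnets"
# The loopback /31 used for static loop protection
add list=lan_subnets address=192.168.0.0/31 comment="For Static Loop Protection"
/interface list member
# Treat the loopback bridge as a LAN-side interface
add list=LAN interface=loopback_1
```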

Subscription Ratio Recommendation

In my extensive testing and observations, when using the above parameters and steps, I was able to have 200 users behind a /30 without any known complaints from them. BitTorrent worked as expected too. This is likely due to the obvious fact that not all 200 users will max out 65k connections and hence use up all the IP:Port combinations. Where will you find a CPE that can handle 65k NAT entries anyway?

So tl;dr you can use a /30 per 200 users as long as you follow the steps properly and also to be future-proof and safe, ensure you provide IPv6 as well.
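For a feel of the numbers behind that ratio, here is a rough port budget. It assumes all four addresses of the /30 are usable and that ports 1024-65535 (the range used in the port forwarding rules earlier) are available on each. Worked by hand, that is 4 × 64512 = 258048 mappings per protocol, or roughly 1290 per user if every one of the 200 users were active at once. The small RouterOS script below performs the same calculation; the variable names are illustrative.

```
# Rough port budget, assuming ports 1024-65535 are usable on each public address
:local publicAddresses 4
:local portsPerAddress (65535 - 1024 + 1)
:local users 200
# Average number of mappings available per user, per protocol
:put (($publicAddresses * $portsPerAddress) / $users)
```

This is only an average; real usage is bursty and uneven, which is why the ratio holds up in practice.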

End Result

Figure-14 (Your NAT Table should look as dead simple as this one)

IPv6

Issues

  • Addressing may not be optimally subnetted/broken down
  • ISP may only have something like a single /48 with 5000 customers downstream which exceeds possible /56s out of the /48
  • Not following the proper guidelines for IPv6 deployment
  • Lack of persistent assignment feature on MikroTik
    • This applies to the majority of ISPs even though they may use Cisco, Juniper etc which supports persistent assignment configuration
  • Not properly ensuring that the customer’s WAN side gets a proper single /64
  • Forcing the customer to have only a single /64 on the LAN side instead of /56
  • MikroTik IPv6 RADIUS does not work correctly

Solutions

  • I will not cover IPv6 addressing in this guide, but you could use this
  • Ensure you request for appropriate prefix allocation based on your customer base from your Regional Internet registry/Local Internet registry
  • Follow the proper guidelines and BCOPs
  • I came across a solution for the lack of persistent assignment on MikroTik, simply use the following script and schedule it to run every five minutes:
    #Please don't be stupid enough to set owner=Daryll#
    /system script
    add dont-require-permissions=no name=PPPoE-IPv6-Persistent owner=Daryll policy=ftp,reboot,read,write,policy,test,password,sniff,sensitive,romon source=\
    "/ipv6 dhcp-server binding;\r\
    \n:foreach i in=[find server~\"pppoe\"] do={\r\
    \n make-static \$i;\r\
    \n set \$i comment=[get \$i server];\r\
    \n set \$i server=all;\r\
    \n}"

    Use the scheduler for automating it:
    /system scheduler
    add interval=5m name=PPPoE-IPv6-Persistent-AutoUpdate on-event=PPPoE-IPv6-Persistent policy=ftp,reboot,read,write,policy,test,password,sniff,sensitive,romon start-time=startup

Now I will cover a simple configuration use-case where a BNG has exactly 1000 customers. The goal here is to ensure that the WAN side of each customer gets a /64 and the LAN side gets a /56.

  • Disable redirects
    /ipv6 settings set accept-redirects=no
  • Modify the parameters for Neighbour Discovery Protocol (these values ensure quick discovery)
    • /ipv6 nd set [ find default=yes ] ra-interval=30s-1m
  • Modify the parameters for SLAAC as per this
    • /ipv6 nd prefix default set preferred-lifetime=45m valid-lifetime=1h30m
  • Next, we need to create two separate pools, one for the WAN side and one for the LAN side of the customer
    • /ipv6 pool
      add name=Customer-CPE-LAN prefix=2405:a140:8::/46 prefix-length=56
      add name=Customer-CPE-WAN prefix=2405:a140:f:d400::/54 prefix-length=64
      • Here, prefix-length specifies what prefix length the customer gets; in this case, as per standards, we are giving the WAN side a /64 and the LAN side a /56
  • And finally, configure the pools to each PPPoE Profile as below
    /ppp profile
    set *0 dhcpv6-pd-pool=Customer-CPE-LAN remote-ipv6-prefix-pool=Customer-CPE-WAN
    add name=profile2 dhcpv6-pd-pool=Customer-CPE-LAN remote-ipv6-prefix-pool=Customer-CPE-WAN
    • Remote IPv6 prefix is for the WAN side of the customer
    • DHCPv6 PD Pool is for the LAN side of the customer
Figure-15 (PPPoE IPv6 configuration)

That’s it, now the customers will dynamically get a routed /64 and routed /56 for WAN and LAN sides respectively.
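A quick sanity check on the pool sizes: a /46 carved into /56s gives 2^(56-46) = 1024 prefixes, and a /54 carved into /64s gives 2^(64-54) = 1024 prefixes, so both pools cover the 1000 customers in this example with a little headroom. Once customers have reconnected, the assignments can be inspected with the print commands below (a sketch; the exact output columns vary by RouterOS version).

```
# Prefixes currently handed out from the IPv6 pools
/ipv6 pool used print
# DHCPv6-PD bindings (the LAN-side /56s)
/ipv6 dhcp-server binding print
```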

If everything is followed correctly including the IPv6 firewalling below, the outcome should be a perfect score like this.

Routing Loop prevention

If a customer happens to go offline (due to power loss etc), traffic destined for that customer will continue to arrive until the remote flows time out, leading to increased CPU usage. To solve this, we simply route the aggregated customer prefixes to blackhole. Remember that in routing, more specific prefixes always win, so while a customer is online their more specific route is used; should it go offline, the less specific (aggregated) blackhole route takes over, and hence all pending traffic is dropped with immediate effect to give us optimal CPU usage.

#RouterOS v7 example#
/ip route
add blackhole comment="Blackhole route for Customer CGNAT pool" disabled=no dst-address=public/25
add blackhole comment="Blackhole route for Customer public pool" disabled=no dst-address=1.0.0.0/24
/ipv6 route
add blackhole comment="Blackhole route for Customer LAN pool" disabled=no dst-address=2405:a140:8::/46
add blackhole comment="Blackhole route for Customer WAN pool" disabled=no dst-address=2405:a140:f:d400::/54
#RouterOS v6 example#
/ip route
add type=blackhole comment="Blackhole route for Customer CGNAT pool" disabled=no dst-address=public/25
add type=blackhole comment="Blackhole route for Customer public pool" disabled=no dst-address=1.0.0.0/24
#In RouterOS v6, IPv6 blackhole is not supported#

Firewall/Security

Issues

  • Blocks inbound ports based on the false logic of “protecting” the customer
    • Port blocking does nothing to improve security, it only breaks legitimate traffic such as apps or games that use various methods for VoIP
    • Malware can make use of port 443 and that is the reality of modern-day malware anyway
  • Net Neutrality Violations
    • Such as blocking TCP/UDP traffic destined towards Cloudflare or Google Anycast DNS
  • Lacks basic DDoS protection
  • Lacks simple bogon filtering
  • Lacks basic rules such as dropping invalid traffic on the input chain
  • Lacks FastTracking for traffic destined towards your NATted pools
  • Connection tracking of customers having a public IPv4 address makes no sense and wastes CPU cycles

Solutions

  • Remove most “port blocking” rules
    • Customer Site security should be handled on the customer site such as having proper basic firewalling on their Edge Routers
    • I’ve dropped some ports on the RAW table directly
  • Avoid Net Neutrality Violation unless otherwise enforced by your local state or central government
  • I’ve shared the rule for FastTracking NATted pools
  • I’ve shared the rule for reducing connection tracking impact on customers having public IPv4 address

Below are the generic firewall rules that should be deployed on the BNG to cover basic security grounds.

IPv4 Firewall

#First we take care of address lists#
/ip firewall address-list
#Enter all local subnets/public subnets applicable to your AS, use the full CIDR notation of the public IPv4 block assigned to you to avoid missing anything out, please avoid something like /30#
add address=example_public/24 comment="Public Pools" list=lan_subnets
add address=example_local_private/24 comment="Local interfaces" list=lan_subnets
add address=100.64.0.0/10 comment="CGNAT Pool" list=lan_subnets
#Enter only your public Pool prefixes, this will be used for no-tracking to boost performance of customers having public IPv4 addresses and reduce load on the CPU of the BNG#
add address=example_public/24 comment="Public Pool" list=public_subnets
###Required for DDoS protection rules###
add list=ddos-attackers
add list=ddos-targets
###Bogon filtering addresses for each of the rules in RAW/Filter###
add address=0.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=172.16.0.0/12 comment=RFC6890 list=not_in_internet
add address=192.168.0.0/16 comment=RFC6890 list=not_in_internet
add address=10.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=169.254.0.0/16 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=224.0.0.0/4 comment=Multicast list=not_in_internet
add address=198.18.0.0/15 comment=RFC6890 list=not_in_internet
add address=192.0.0.0/24 comment=RFC6890 list=not_in_internet
add address=192.0.2.0/24 comment=RFC6890 list=not_in_internet
add address=198.51.100.0/24 comment=RFC6890 list=not_in_internet
add address=203.0.113.0/24 comment=RFC6890 list=not_in_internet
add address=100.64.0.0/10 comment=RFC6890 list=not_in_internet
add address=240.0.0.0/4 comment=RFC6890 list=not_in_internet
add address=192.88.99.0/24 comment="6to4 relay Anycast [RFC 3068]" list=not_in_internet
add address=255.255.255.255 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.0.0/24 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.2.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=198.51.100.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=203.0.113.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=240.0.0.0/4 comment="RAW Filtering - RFC6890 reserved" list=bad_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_src_ipv4
add address=255.255.255.255 comment="RAW Filtering - RFC6890" list=bad_src_ipv4
add address=0.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_dst_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_dst_ipv4
/ip firewall raw
add action=drop chain=prerouting comment="Drop DDoS src and dst address list" dst-address-list=ddos-targets src-address-list=ddos-attackers
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=tcp
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=udp
#Required at least in India to reduce call spam/scam#
add action=drop chain=prerouting comment="Drop outgoing SIP to block call centre scammers" port=5060,5061 protocol=tcp
add action=drop chain=prerouting comment="Drop outgoing SIP to block call centre scammers" port=5060,5061 protocol=udp
add action=accept chain=prerouting comment="Enable this rule for transparent mode" disabled=yes
#If you are using DHCP, change this to accept#
add action=drop chain=prerouting comment="defconf: Drop DHCP discover" dst-address=255.255.255.255 dst-port=67 in-interface-list=LAN protocol=udp src-address=0.0.0.0 src-port=68
add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_src_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_dst_ipv4
add action=drop chain=prerouting comment="defconf: drop non global from WAN" in-interface-list=WAN src-address-list=not_in_internet
add action=drop chain=prerouting comment="defconf: drop forward to private ranges from WAN" dst-address-list=not_in_internet in-interface-list=WAN
#Remember to properly enter all subnets in the lan_subnets list for both your AS public IPv4 blocks and CGNAT/local subnets#
add action=drop chain=prerouting comment="defconf: drop local if not from default IP range" in-interface-list=LAN src-address-list=!lan_subnets
add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=jump chain=prerouting comment="defconf: jump to TCP chain" jump-target=bad_tcp protocol=tcp
#Rule for reducing connection tracking impact for public IPv4 customers; we no longer exclude RFC6890-bound packets as the route-to-blackhole rules take care of that#
add action=notrack chain=prerouting comment="Reduce load on conn_track" in-interface-list=LAN src-address-list=public_subnets
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=LAN
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=drop chain=prerouting comment="defconf: drop the rest"
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=!fin,!syn,!rst,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,syn
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,urg
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=syn,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=rst,urg
add action=drop chain=bad_tcp comment="defconf: TCP port 0 drop" port=0 protocol=tcp
/ip firewall filter
add action=accept chain=input comment="defconf: accept established,related,untracked" connection-state=established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=invalid
add action=accept chain=input comment="defconf: accept ICMP after RAW" protocol=icmp
#Example to allow access to router's ports from all interfaces LAN/WAN#
add action=accept chain=input comment="Accept Winbox TCP" dst-port=65000 protocol=tcp
add action=accept chain=input comment="Accept API TCP" dst-port=8728 protocol=tcp
add action=accept chain=input comment="Accept API UDP" dst-port=8728 protocol=udp
add action=accept chain=input comment="Accept SNMP for internal use" dst-port=161 protocol=udp
add action=accept chain=input comment="Accept RADIUS UDP" dst-port=1700,1812,1813 protocol=udp
add action=accept chain=input comment="Accept RADIUS TCP" dst-port=1700,1812,1813 protocol=tcp
#End of example#
add action=drop chain=input comment="defconf: drop all not coming from LAN's interface list/subnets" in-interface-list=!LAN
#PPPoE clients are excluded so as not to bypass queues; if using DHCP, exclude the src and dst address list of the customer pool#
add action=fasttrack-connection chain=forward comment="Rule for NAT Acceleration behaviour (Will reduce CPU usage for NATted traffic)" in-interface=!all-ppp out-interface=!all-ppp
add action=accept chain=forward comment="allow already established connections" connection-state=established,related,untracked
add action=jump chain=forward comment="Jump to DDoS detection" connection-state=new in-interface-list=WAN jump-target=detect-ddos
add action=return chain=detect-ddos dst-limit=50,50,src-and-dst-addresses/10s
add action=return chain=detect-ddos dst-limit=50,50,src-and-dst-addresses/10s protocol=tcp tcp-flags=syn,ack
add action=add-dst-to-address-list address-list=ddos-targets address-list-timeout=10m chain=detect-ddos
add action=add-src-to-address-list address-list=ddos-attackers address-list-timeout=10m chain=detect-ddos
#This rule should be redundant as we are now routing RFC6890 to blackhole directly and hence I am commenting it out#
#add action=drop chain=forward comment="Drop tries to reach not public addresses from LAN" dst-address-list=not_in_internet in-interface-list=LAN out-interface-list=WAN#
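
To make the detect-ddos chain above easier to reason about, here is a simplified Python model of the first `dst-limit=50,50,src-and-dst-addresses/10s` rule. It assumes a token bucket per source/destination pair (50 packets per second, burst of 50), which is how the matcher is documented. It is not RouterOS's actual implementation and leaves out the second (SYN-ACK) rule, the 10s flow expiry and the 10m address-list timeout. `FlowLimiter`, `detect_ddos` and the sample addresses are hypothetical.

```python
class FlowLimiter:
    """Per-flow token bucket: `rate` packets per second, up to `burst` stored."""

    def __init__(self, rate=50, burst=50):
        self.rate = rate
        self.burst = burst
        self.buckets = {}  # (src, dst) -> (tokens_left, time_of_last_packet)

    def within_limit(self, src, dst, now):
        tokens, last = self.buckets.get((src, dst), (float(self.burst), now))
        # Refill according to the time elapsed since the last packet of this flow.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.buckets[(src, dst)] = (tokens, now)
        return allowed


def detect_ddos(limiter, src, dst, now, attackers, targets):
    """Mimic the chain: 'return' while under the limit, otherwise list both ends."""
    if limiter.within_limit(src, dst, now):
        return False  # action=return, nothing is listed
    targets.add(dst)    # add-dst-to-address-list ddos-targets
    attackers.add(src)  # add-src-to-address-list ddos-attackers
    return True


limiter = FlowLimiter()
attackers, targets = set(), set()
# 60 new connections from one source to one destination within the same instant.
flagged = [
    detect_ddos(limiter, "198.51.100.7", "1.0.0.10", 0.0, attackers, targets)
    for _ in range(60)
]
```

In this model the first 50 new connections pass and the remaining 10 put both addresses on the lists, after which the RAW rule at the top of the ruleset would drop that source/destination pair.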

IPv6 Firewall

I have now added a rule in the raw table to drop extension header types 0 and 43, as per this. The linked article also suggests dropping header 60, but I decided not to drop it for the reasons stated in the re-tweet here.

I have now also removed the forward rules completely to improve performance and moved them to the raw table.

/ipv6 firewall address-list
#Enter all the public prefixes that you've routed to this particular BNG#
#We will use this to block spoofed IPv6 coming from customers#
#We will also use this for no-tracking to boost performance of customers with public IPv6 addresses and reduce load on the CPU of the BNG#
#example#
add address=2405:a140:8::/46 comment="CPE-LAN-Pool" list=lan_subnets
add address=2405:a140:c::/54 comment="CPE-WAN-Pool" list=lan_subnets
#To prevent breaking link-local#
add address=fe80::/10 comment="Link-local" list=lan_subnets
#Add your BGP peers here, example below#
add address=2400:7000:1::/126 comment="Peering with Transit on VLAN100" list=bgp_peers
#Copy Paste all the following#
add address=::/3 comment="IPv6 invalids" list=not_in_internet
add address=4000::/3 comment="IPv6 invalids" list=not_in_internet
add address=6000::/3 comment="IPv6 invalids" list=not_in_internet
add address=8000::/3 comment="IPv6 invalids" list=not_in_internet
add address=a000::/3 comment="IPv6 invalids" list=not_in_internet
add address=c000::/3 comment="IPv6 invalids" list=not_in_internet
add address=e000::/4 comment="IPv6 invalids" list=not_in_internet
add address=f000::/5 comment="IPv6 invalids" list=not_in_internet
add address=f800::/6 comment="IPv6 invalids" list=not_in_internet
add address=fc00::/7 comment="IPv6 invalids" list=not_in_internet
add address=fe00::/9 comment="IPv6 invalids" list=not_in_internet
add address=fec0::/10 comment="IPv6 invalids" list=not_in_internet
add address=2001::/23 comment="IPv6 invalids" list=not_in_internet
add address=2001:2::/48 comment="IPv6 invalids" list=not_in_internet
add address=2001:10::/28 comment="IPv6 invalids" list=not_in_internet
add address=2001:db8::/32 comment="IPv6 invalids" list=not_in_internet
add address=2002::/16 comment="IPv6 invalids" list=not_in_internet
add address=3ffe::/16 comment="IPv6 invalids" list=not_in_internet
#We will use this to eliminate the need for stateful firewalling on IPv6 to catch spoofed traffic in the raw table instead of forward chain#
add address=2000::/3 list="global_unicast_prefix(es)"
add address=fd12:672e:6f65:8899::/64 list=allowed
add address=fe80::/16 list=allowed
add address=ff02::/16 comment="multicast" list=allowed
add address=fe80::/10 comment="defconf: RFC6890 Linked-Scoped Unicast" list=no_forward_ipv6
add address=ff00::/8 comment="defconf: multicast" list=no_forward_ipv6
add address=::1/128 comment="defconf: lo" list=bad_ipv6
add address=::ffff:0:0/96 comment="defconf: ipv4-mapped" list=bad_ipv6
add address=::/96 comment="defconf: ipv4 compat" list=bad_ipv6
add address=2001:db8::/32 comment="defconf: documentation" list=bad_ipv6
add address=2001:10::/28 comment="defconf: ORCHID" list=bad_ipv6
add address=2001::/23 comment="defconf: RFC6890" list=bad_ipv6
add address=100::/64 comment="RAW Filtering - RFC6890 Discard-only" list=not_global_ipv6
add address=2001::/32 comment="RAW Filtering - RFC6890 TEREDO" list=not_global_ipv6
add address=2001:2::/48 comment="RAW Filtering - RFC6890 Benchmark" list=not_global_ipv6
add address=fc00::/7 comment="RAW Filtering - RFC6890 Unique-Local" list=not_global_ipv6
add address=::/128 comment="defconf: unspecified" list=bad_dst_ipv6
add address=::/128 comment="RAW Filtering" list=bad_src_ipv6
add address=ff00::/8 comment="RAW Filtering" list=bad_src_ipv6
/ipv6 firewall raw
#New rule to drop deprecated header type 0 & 43#
#For unknown reasons, this rule doesn't behave as it should, it ends up "accepting" all traffic types and the packet counter goes haywire, I will disable this until MikroTik fixes it#
add action=drop chain=prerouting comment="Drop packets with extension header types 0, 43" headers=hop,route:contains disabled=yes
add action=accept chain=prerouting comment="defconf: RFC4291, section 2.7.1" dst-address=ff02::1:ff00:0/104 icmp-options=135:0-255 protocol=icmpv6 src-address=::/128
#Migrated this rule from the forward chain to make it more CPU efficient#
add action=drop chain=prerouting comment="defconf: rfc4890 drop hop-limit=1" hop-limit=equal:1 in-interface-list=!LAN protocol=icmpv6
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=tcp
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=udp
#I noticed that some traffic was being dropped, causing BGP to flap or misbehave. This includes packets where src IP = your BGP peer but dst IP = link-local, which were previously dropped by the rules. Hence we are permitting all kinds of traffic from BGP peers to avoid such problems#
add action=accept chain=prerouting comment="Accept eBGP traffic from peers" in-interface-list=WAN src-address-list=bgp_peers
add action=drop chain=prerouting comment="Drop invalids from WAN" dst-address-list="global_unicast_prefix(es)" in-interface-list=WAN src-address-list=not_in_internet
add action=drop chain=prerouting comment="Drop forwarded invalids from WAN" dst-address-list=not_in_internet in-interface-list=WAN src-address-list="global_unicast_prefix(es)"
add action=drop chain=prerouting comment="Drop invalids from LAN" dst-address-list="global_unicast_prefix(es)" in-interface-list=LAN src-address-list=not_in_internet
add action=drop chain=prerouting comment="Drop forwarded invalids from LAN" dst-address-list=not_in_internet in-interface-list=LAN src-address-list=lan_subnets

#This rule replaces the need for forward chain rule for doing the same thing#
add action=drop chain=prerouting comment="Drop spoofed traffic from LAN going towards Global Unicast" dst-address-list="global_unicast_prefix(es)" in-interface-list=LAN src-address-list=!lan_subnets
add action=accept chain=prerouting comment="defconf: enable for transparent firewall" disabled=yes
add action=drop chain=prerouting comment="defconf: drop bogon IP's" src-address-list=bad_ipv6
add action=drop chain=prerouting comment="defconf: drop bogon IP's" dst-address-list=bad_ipv6
add action=drop chain=prerouting comment="defconf: drop packets with bad src ipv6" src-address-list=bad_src_ipv6
add action=drop chain=prerouting comment="defconf: drop packets with bad dst ipv6" dst-address-list=bad_dst_ipv6
#Not 100% sure, but you likely no longer need this rule as the preceding rules for dropping invalids from WAN should do the job, and hence I will comment this out#
#add action=drop chain=prerouting comment="defconf: drop non global from WAN" src-address-list=not_global_ipv6 in-interface-list=WAN#
add action=accept chain=prerouting comment="defconf: accept local multicast scope" dst-address=ff02::/16
add action=drop chain=prerouting comment="defconf: drop other multicast destinations" dst-address=ff00::/8
add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=drop chain=prerouting comment="defconf: drop bad TCP" port=0 protocol=tcp
#Since all filtering for LAN is done in RAW, we do not need to have stateful tracking for LAN, and hence we are notracking all LAN originating/bound traffic after filtering#
add action=notrack chain=output comment="Reduce load on conn_track" out-interface-list=LAN
add action=notrack chain=prerouting comment="Reduce load on conn_track" in-interface-list=LAN
add action=notrack chain=prerouting comment="Reduce load on conn_track" dst-address-list=lan_subnets in-interface-list=WAN
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=LAN
add action=drop chain=prerouting comment="defconf: drop the rest"
/ipv6 firewall filter
add action=accept chain=input comment="defconf: accept established,related,untracked" connection-state=established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=invalid
add action=accept chain=input comment="defconf: accept ICMPv6" protocol=icmpv6
add action=accept chain=input comment="defconf: accept UDP traceroute" port=33434-33534 protocol=udp
add action=accept chain=input comment="defconf: accept DHCPv6-Client prefix delegation." dst-port=546 protocol=udp src-address=fe80::/16
#Example to allow access to router's ports from all interfaces LAN/WAN#
add action=accept chain=input comment="Accept Winbox TCP" dst-port=65000 protocol=tcp
add action=accept chain=input comment="Accept API TCP" dst-port=8728 protocol=tcp
add action=accept chain=input comment="Accept API UDP" dst-port=8728 protocol=udp
add action=accept chain=input comment="Accept SNMP for internal use" dst-port=161 protocol=udp
add action=accept chain=input comment="Accept RADIUS UDP" dst-port=1700,1812,1813 protocol=udp
add action=accept chain=input comment="Accept RADIUS TCP" dst-port=1700,1812,1813 protocol=tcp
#End of example#
add action=accept chain=input comment="allow allowed addresses" src-address-list=allowed
add action=drop chain=input comment="defconf: drop everything else not coming from LAN" in-interface-list=!LAN
#All forward rules have been migrated to the RAW table for BNGs, so better performance and no stateful tracking required for customers#
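
The LAN anti-spoofing rule in the RAW table ("Drop spoofed traffic from LAN going towards Global Unicast") boils down to two membership checks. The sketch below expresses that decision in Python using the example prefixes from the address lists above. The function name and the sample source/destination addresses are hypothetical, and the real matching is of course done by the router, not by a script.

```python
import ipaddress

# Example prefixes copied from the lan_subnets list above.
LAN_SUBNETS = [
    ipaddress.ip_network("2405:a140:8::/46"),  # CPE-LAN-Pool
    ipaddress.ip_network("2405:a140:c::/54"),  # CPE-WAN-Pool
    ipaddress.ip_network("fe80::/10"),         # Link-local
]
GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")


def drop_spoofed_from_lan(src, dst):
    """True if a packet arriving on a LAN interface would hit the drop rule."""
    src_addr = ipaddress.ip_address(src)
    dst_addr = ipaddress.ip_address(dst)
    src_is_ours = any(src_addr in net for net in LAN_SUBNETS)
    # Rule: dst in global unicast AND src NOT in lan_subnets -> drop.
    return dst_addr in GLOBAL_UNICAST and not src_is_ours


legit = drop_spoofed_from_lan("2405:a140:9::1", "2606:4700:4700::1111")
spoofed = drop_spoofed_from_lan("2001:db8::1", "2606:4700:4700::1111")
```

Because this check needs only the packet's addresses and its ingress interface, it can live in RAW and does not depend on connection tracking.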

For Edge Router

The purpose of the Edge router is to route as fast as possible. So, with that in mind, along with the basic general changes I’ve mentioned at the beginning of this article, the following should also be kept in mind:

  1. No NAT
  2. No connection tracking aka stateful firewalling (filter table on the firewall section)
    • If you enable stateful firewalling on the edge, the router will die in case of DDoS attacks or even just heavy traffic in general
  3. No fancy “features” (like Hotspot, PPPoE)
    • Use your BNG routers for any customer delegation that is required

BGP Optimisation

This is a work in progress sub-section and at this point in time, I am writing based on my experience with Indian ISPs, so if you’re in the EU/US or other locations, you’re probably already implementing the following:

  • Always route your aggregated prefixes [say you have a /24 or /22 (IPv4), or a /32 or /36 (IPv6)] to blackhole for both IPv4 and IPv6 to prevent layer 3 loops, and stop disabling synchronisation on RouterOS v6. On RouterOS v7 it is mandatory anyway to either route the prefix to blackhole or have it assigned to an interface
    • This also reduces CPU usage whenever downstream routers/users/switches go offline: traffic from remote hosts/networks that keeps trying to reach them is routed to blackhole, so it is discarded immediately and saves resources.
      • In other words, there’s no sense in doing things that increase CPU usage (not routing to blackhole)
      • And there is no sense in avoiding loop prevention mechanisms
    • Example config on my own network (AS149794) on RouterOS v7
      /ip route
      add blackhole comment="Blackhole route" disabled=no dst-address=103.176.189.0/24


      /ipv6 route
      add blackhole comment="Blackhole Route" disabled=no dst-address=2400:7060::/32
      add blackhole comment="Blackhole Route" disabled=no dst-address=2400:7060::/48
  • If you are multi-homed (multiple transit providers)
    • Always request at least a partial routing table from every upstream provider you’re connected to. If the router can handle full tables from the upstreams, go for it!
      • This will ensure your router has the best paths to choose from
      • Stop going with the strange concept of taking only default routes from the upstreams and creating asymmetric routing conditions where outgoing traffic is going via Transit A and incoming traffic is coming in via Transit B.
    • Always advertise all your IP pools to all transit providers to help minimise asymmetric routing which in turn leads to high latency and possibly packet loss in rare cases
      • If you need traffic engineering, you can consider BGP based load balancing or local preferences
  • If you have a single-homed setup
    • Still request a partial or full table, whichever fits your router’s specs, to future-proof in case you plan to go multi-homed

We only need to do broadly two things for filtering and security:

  1. Implement MANRS throughout your network (and business)
  2. Use the RAW table to drop remaining bogon/garbage traffic similar to the one used on the BNG and you can also use it for ACL if you need that
    • CPU usage stays minimal when using the RAW table
    • Absolutely nothing on the filter table i.e. no stateful firewalling

IPv4 Firewall

/ip firewall address-list
#Enter all local subnets/public subnets applicable to your AS, use the full CIDR notation of the public IPv4 block assigned to you to avoid missing anything out, please avoid something like /30#
add address=example_public/24 comment="LAN subnets" list=lan_subnets
add address=example_local_private/24 comment="LAN subnets" list=lan_subnets
add address=0.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=172.16.0.0/12 comment=RFC6890 list=not_in_internet
add address=192.168.0.0/16 comment=RFC6890 list=not_in_internet
add address=10.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=169.254.0.0/16 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=224.0.0.0/4 comment=Multicast list=not_in_internet
add address=198.18.0.0/15 comment=RFC6890 list=not_in_internet
add address=192.0.0.0/24 comment=RFC6890 list=not_in_internet
add address=192.0.2.0/24 comment=RFC6890 list=not_in_internet
add address=198.51.100.0/24 comment=RFC6890 list=not_in_internet
add address=203.0.113.0/24 comment=RFC6890 list=not_in_internet
add address=100.64.0.0/10 comment=RFC6890 list=not_in_internet
add address=240.0.0.0/4 comment=RFC6890 list=not_in_internet
add address=192.88.99.0/24 comment="6to4 relay Anycast [RFC 3068]" list=not_in_internet
add address=255.255.255.255 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.0.0/24 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.2.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=198.51.100.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=203.0.113.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=240.0.0.0/4 comment="RAW Filtering - RFC6890 reserved" list=bad_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_src_ipv4
add address=255.255.255.255 comment="RAW Filtering - RFC6890" list=bad_src_ipv4
add address=0.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_dst_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_dst_ipv4
/ip firewall raw
add action=accept chain=prerouting comment="Enable this rule for transparent mode" disabled=yes
#If you are using DHCP, change this to accept#
add action=drop chain=prerouting comment="defconf: Drop DHCP discover on LAN" dst-address=255.255.255.255 dst-port=67 in-interface-list=LAN protocol=udp src-address=0.0.0.0 src-port=68
add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_src_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_dst_ipv4
add action=drop chain=prerouting comment="defconf: drop non global from WAN" in-interface-list=WAN src-address-list=not_in_internet
add action=drop chain=prerouting comment="defconf: drop forward to private ranges from WAN" dst-address-list=not_in_internet in-interface-list=WAN
#Remember that lan_subnets here should only include your public ranges not CGNAT#
add action=drop chain=prerouting comment="defconf: drop local if not from default IP range" in-interface-list=LAN src-address-list=!lan_subnets
add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=jump chain=prerouting comment="defconf: jump to TCP chain" jump-target=bad_tcp protocol=tcp
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=LAN
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=drop chain=prerouting comment="defconf: drop the rest"
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=!fin,!syn,!rst,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,syn
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,urg
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=syn,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=rst,urg
add action=drop chain=bad_tcp comment="defconf: TCP port 0 drop" port=0 protocol=tcp
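
For readers less familiar with the `tcp-flags` matcher, the bad_tcp chain used on both the BNG and the edge router can be read as the following Python predicate. This is a sketch of the intended logic only, assuming a comma-separated flag list means "all listed flags set" and `!flag` means "flag not set". The `is_bad_tcp` helper and its arguments are hypothetical.

```python
def is_bad_tcp(flags, src_port, dst_port):
    """Return True if a TCP packet would be dropped by the bad_tcp chain."""
    f = set(flags)
    # tcp-flags=!fin,!syn,!rst,!ack : none of the four flags is set
    if not f & {"fin", "syn", "rst", "ack"}:
        return True
    # Flag pairs that should not appear together on the same packet
    bad_pairs = [
        ("fin", "syn"),
        ("fin", "rst"),
        ("fin", "urg"),
        ("syn", "rst"),
        ("rst", "urg"),
    ]
    if any(a in f and b in f for a, b in bad_pairs):
        return True
    # tcp-flags=fin,!ack : FIN without ACK
    if "fin" in f and "ack" not in f:
        return True
    # port=0 matches when either the source or the destination port is 0
    return src_port == 0 or dst_port == 0


handshake_ok = is_bad_tcp({"syn"}, 51000, 443)
null_scan = is_bad_tcp(set(), 51000, 443)
```

Normal handshake, data and teardown packets (SYN, SYN-ACK, ACK, FIN-ACK, RST) pass, while null scans and contradictory flag combinations are dropped before they reach connection tracking.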

IPv6 Firewall

/ipv6 firewall address-list
#Enter the aggregated prefixes assigned to your AS that you use for LAN along with link-local fe80::/10#
#example#
add address=2405:a140::/32 comment="AS Prefix" list=lan_subnets
add address=fe80::/10 comment="Link-local" list=lan_subnets
#Add your BGP peers here, example below#
add address=2400:7000:1::/126 comment="Peering with Transit on VLAN100" list=bgp_peers
#Copy Paste all the following#
add address=::/3 comment="IPv6 invalids" list=not_in_internet
add address=4000::/3 comment="IPv6 invalids" list=not_in_internet
add address=6000::/3 comment="IPv6 invalids" list=not_in_internet
add address=8000::/3 comment="IPv6 invalids" list=not_in_internet
add address=a000::/3 comment="IPv6 invalids" list=not_in_internet
add address=c000::/3 comment="IPv6 invalids" list=not_in_internet
add address=e000::/4 comment="IPv6 invalids" list=not_in_internet
add address=f000::/5 comment="IPv6 invalids" list=not_in_internet
add address=f800::/6 comment="IPv6 invalids" list=not_in_internet
add address=fc00::/7 comment="IPv6 invalids" list=not_in_internet
add address=fe00::/9 comment="IPv6 invalids" list=not_in_internet
add address=fec0::/10 comment="IPv6 invalids" list=not_in_internet
add address=2001::/23 comment="IPv6 invalids" list=not_in_internet
add address=2001:2::/48 comment="IPv6 invalids" list=not_in_internet
add address=2001:10::/28 comment="IPv6 invalids" list=not_in_internet
add address=2001:db8::/32 comment="IPv6 invalids" list=not_in_internet
add address=2002::/16 comment="IPv6 invalids" list=not_in_internet
add address=3ffe::/16 comment="IPv6 invalids" list=not_in_internet
add address=2000::/3 list="global_unicast_prefix(es)"
add address=fd12:672e:6f65:8899::/64 list=allowed
add address=fe80::/16 list=allowed
add address=ff02::/16 comment="multicast" list=allowed
add address=fe80::/10 comment="defconf: RFC6890 Linked-Scoped Unicast" list=no_forward_ipv6
add address=ff00::/8 comment="defconf: multicast" list=no_forward_ipv6
add address=::1/128 comment="defconf: lo" list=bad_ipv6
add address=::ffff:0:0/96 comment="defconf: ipv4-mapped" list=bad_ipv6
add address=::/96 comment="defconf: ipv4 compat" list=bad_ipv6
add address=2001:db8::/32 comment="defconf: documentation" list=bad_ipv6
add address=2001:10::/28 comment="defconf: ORCHID" list=bad_ipv6
add address=2001::/23 comment="defconf: RFC6890" list=bad_ipv6
add address=100::/64 comment="RAW Filtering - RFC6890 Discard-only" list=not_global_ipv6
add address=2001::/32 comment="RAW Filtering - RFC6890 TEREDO" list=not_global_ipv6
add address=2001:2::/48 comment="RAW Filtering - RFC6890 Benchmark" list=not_global_ipv6
add address=fc00::/7 comment="RAW Filtering - RFC6890 Unique-Local" list=not_global_ipv6
add address=::/128 comment="defconf: unspecified" list=bad_dst_ipv6
add address=::/128 comment="RAW Filtering" list=bad_src_ipv6
add address=ff00::/8 comment="RAW Filtering" list=bad_src_ipv6
/ipv6 firewall raw
#New rule to drop deprecated header type 0 & 43#
#For unknown reasons, this rule doesn't behave as it should, it ends up "accepting" all traffic types and the packet counter goes haywire, I will disable this until MikroTik fixes it#
add action=drop chain=prerouting comment="Drop packets with extension header types 0, 43 at network border" headers=hop,route:contains disabled=yes
add action=accept chain=prerouting comment="defconf: RFC4291, section 2.7.1" dst-address=ff02::1:ff00:0/104 icmp-options=135:0-255 protocol=icmpv6 src-address=::/128
#Migrated this rule from the forward chain in BNG to drop these packets on the network edge#
add action=drop chain=prerouting comment="defconf: rfc4890 drop hop-limit=1" hop-limit=equal:1 in-interface-list=!LAN protocol=icmpv6
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=tcp
add action=drop chain=prerouting comment="drop port 25 to prevent spam" port=25 protocol=udp
#I noticed that some traffic was being dropped, causing BGP to flap or misbehave. This includes packets where src IP = your BGP peer but dst IP = link-local, which were previously dropped by the rules. Hence we are permitting all kinds of traffic from BGP peers to avoid such problems#
add action=accept chain=prerouting comment="Accept eBGP traffic from peers" in-interface-list=WAN src-address-list=bgp_peers
add action=drop chain=prerouting comment="Drop invalids from WAN" dst-address-list="global_unicast_prefix(es)" in-interface-list=WAN src-address-list=not_in_internet
add action=drop chain=prerouting comment="Drop forwarded invalids from WAN" dst-address-list=not_in_internet in-interface-list=WAN src-address-list="global_unicast_prefix(es)"
add action=drop chain=prerouting comment="Drop invalids from LAN" dst-address-list="global_unicast_prefix(es)" in-interface-list=LAN src-address-list=not_in_internet
add action=drop chain=prerouting comment="Drop forwarded invalids from LAN" dst-address-list=not_in_internet in-interface-list=LAN src-address-list=lan_subnets
add action=accept chain=prerouting comment="defconf: enable for transparent firewall" disabled=yes
#Drop anything from your network going towards the public internet if the source address does not match your allocated pools#
add action=drop chain=prerouting comment="Drop spoofed traffic from LAN going towards Global Unicast" dst-address-list="global_unicast_prefix(es)" in-interface-list=LAN src-address-list=!lan_subnets
add action=drop chain=prerouting comment="defconf: drop bogon IP's" src-address-list=bad_ipv6
add action=drop chain=prerouting comment="defconf: drop bogon IP's" dst-address-list=bad_ipv6
add action=drop chain=prerouting comment="defconf: drop packets with bad src ipv6" src-address-list=bad_src_ipv6
add action=drop chain=prerouting comment="defconf: drop packets with bad dst ipv6" dst-address-list=bad_dst_ipv6
#Not 100% sure, but you likely no longer need this rule as the preceding rules for dropping invalids from WAN should do the job, and hence I will comment this out#
#add action=drop chain=prerouting comment="defconf: drop non global from WAN" src-address-list=not_global_ipv6 in-interface-list=WAN#
add action=accept chain=prerouting comment="defconf: accept local multicast scope" dst-address=ff02::/16
add action=drop chain=prerouting comment="defconf: drop other multicast destinations" dst-address=ff00::/8
add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=drop chain=prerouting comment="defconf: drop bad TCP" port=0 protocol=tcp
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=LAN
add action=drop chain=prerouting comment="defconf: drop the rest"

Firewall Explanation

I will keep this concise. As stated earlier, I suggest you study how iptables works in general and study the packet flow, so that you know which rule does what. With that being said, I will break it down into simpler points:

  • I used this and this as the source for building the base for the firewall
    • MikroTik has taken care to conform to the relevant RFCs and made the effort not to break any legitimate protocol/traffic
  • IPv6 firewall rules are trickier and more complex, but rest assured that the rules in this article do not break any protocol/standard, nor do they impact customers’ end-to-end reachability
  • We are dropping spoofed traffic
    • The RAW rules drop anything coming from WAN with a spoofed source (RFC 6890 addresses)
    • The RAW rules drop anything coming from LAN that does not match your public prefixes/internal subnets (aka the lan_subnets address list), meaning any spoofed traffic is dropped before it can exit your network
    • Here’s an APNIC blog post detailing more on this subject
  • Next, we are dropping bad traffic such as TCP/UDP port 0 or bad TCP flags
  • The filter rules are pretty self-explanatory
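
The anti-spoofing decision described in the bullets above can be sketched in a few lines of Python. This is only an illustration of the logic, not a replacement for the RAW rules: the function name `raw_verdict` is hypothetical, the list names mirror the address lists used in the rules, and their contents here (the 2001:db8::/32 documentation prefix and a short excerpt of special-purpose ranges) are made-up placeholders.

```python
import ipaddress

# Hypothetical example values; substitute your own allocation.
LAN_SUBNETS = [
    ipaddress.ip_network("2001:db8::/32"),  # your public prefix (documentation range used here)
    ipaddress.ip_network("fe80::/10"),      # link-local, so it is never treated as spoofed
]
# Short excerpt only; the real not_in_internet list is longer.
NOT_IN_INTERNET = [
    ipaddress.ip_network("fc00::/7"),
    ipaddress.ip_network("::1/128"),
]
# Assumption: "global unicast" is modelled as 2000::/3.
GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")


def in_any(addr, networks):
    """True if addr falls inside at least one of the given networks."""
    return any(addr in net for net in networks)


def raw_verdict(in_list, src, dst):
    """Return 'drop' or 'accept', modelling only the spoofing checks."""
    src = ipaddress.ip_address(src)
    dst = ipaddress.ip_address(dst)
    # From WAN: a special-purpose source aimed at global unicast space is spoofed.
    if in_list == "WAN" and dst in GLOBAL_UNICAST and in_any(src, NOT_IN_INTERNET):
        return "drop"
    # From LAN: anything heading to the internet must be sourced from our own pools.
    if in_list == "LAN" and dst in GLOBAL_UNICAST and not in_any(src, LAN_SUBNETS):
        return "drop"
    return "accept"


print(raw_verdict("LAN", "2001:db8::1", "2606:4700::1111"))   # customer using an allocated address
print(raw_verdict("LAN", "2001:dead::1", "2606:4700::1111"))  # customer spoofing someone else's address
```

The first call is accepted and the second is dropped, because only the first source address belongs to the example lan_subnets list.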

Strange Anomalies

These are some strange behaviours that I could not explain. If you have further information, please reach out to me.

  1. NAT Leak
    • For example, let’s say we CGNAT customers in 100.64.0.0/24 behind 1.0.0.0/26. Common sense says that anything WAN-bound will leave the NAT with a source IP belonging to the /26. But nope, this isn’t always the case. What I have observed is that sometimes (which means all the time if you have thousands of customers) the source IP is still in the CGNAT subnet while the destination IP is public, hence the packet “escapes” from the NAT engine
    • This behaviour is NOT exclusive to MikroTik. I have observed the same thing on Ubuntu 20.04/Debian-based distros, where the source IP is in the NAT subnet and the packet escapes to the WAN interface with a real, live public IP as the destination
      • Solution: We just drop anything coming from the BNG that’s not public on the Edge Router. This is already taken care of in my configuration above; you just need to follow the instructions
    • I have been unable to find documentation or bug reports on this behaviour
  2. Netmap vs Src NAT
    • Publicly available documentation suggests simple definitions for both
      Src NAT = 1:Many binding
      Netmap = 1:1 binding
      • But for whatever reason, when using src NAT as the action for a public prefix, it keeps changing the “NATted” public IP, and hence the source IP on the WAN, for the customers. This breaks traffic or triggers DDoS protection on sites such as those protected by Cloudflare
      • And for whatever reason, even though Netmap is meant for 1:1, it works for 1:Many bindings and does not result in the source IP constantly changing for the customers
    • I have not found any technical information on why these behaviours occur or why netmap even works in the first place for 1:Many bindings
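
One possible explanation for the netmap behaviour, offered as an assumption rather than a confirmed fact about RouterOS: the Linux NETMAP target, which RouterOS netmap is presumed here to be built on, keeps the host bits of the original address and overwrites only the network bits with the target prefix. If that holds, a larger private range folds deterministically onto a smaller public range, so a given customer always lands on the same public IP. The function name `netmap` is hypothetical and the addresses are taken from the CGNAT example above.

```python
import ipaddress


def netmap(original: str, target_net: str) -> str:
    """Model of the assumed NETMAP arithmetic:
    keep the host bits of `original`, take the network bits from `target_net`."""
    addr = int(ipaddress.ip_address(original))
    net = ipaddress.ip_network(target_net)
    host_bits = addr & int(net.hostmask)          # low bits survive unchanged
    mapped = int(net.network_address) | host_bits  # high bits come from the target prefix
    return str(ipaddress.ip_address(mapped))


# Several customers in 100.64.0.0/24 folded onto 1.0.0.0/26.
for customer in ("100.64.0.1", "100.64.0.7", "100.64.0.65", "100.64.0.129"):
    print(customer, "->", netmap(customer, "1.0.0.0/26"))
```

Under this model 100.64.0.1, 100.64.0.65 and 100.64.0.129 all map to 1.0.0.1, every time, which would account for the stable source IP. Customers sharing one public IP would still rely on port translation to tell their flows apart, and that part is not modelled here.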

53 Comments

  1. Rupam Kumar Sharma

    Such a detailed treatment of issues that are so important during implementation…

    Thanks

  2. Hi Daryll, well done! It was the key to fixing a problem I had been trying to fix for a customer (an ISP).
    I would like you to read about my problem: https://forum.mikrotik.com/viewtopic.php?f=2&t=176378
    I repeat, that problem is fixed now, thanks to you!

    I used the following command from your article (with a few modifications):
    /ip firewall nat
    add action=netmap chain=srcnat comment="NETMAP PPPoE" out-interface=sfp1-Internet src-address-list=Clientes_NAT to-addresses=PUBLIC/32

    I don’t understand the difference between using “srcnat action masquerade” (which wasn’t working) and using “Netmap” (which for sure ran perfectly fine from the first moment I put it in). I want to learn/understand why this way works.

    Thanks a lot.
    Regards from Argentina

    • I hope this helps you.

      Just note that you are missing parameters in your rule, re-check from my article again.

    • For the NAT Leak issue, it mostly looks like speculation on that thread in my opinion. No factual/documented information yet, but I’ll keep an eye on it.

  3. Stefan Müller

    Unfortunately, that is the case. When the conclusions are a bit more solid I will email support anyway; maybe they can resolve the mystery.

  4. Good morning,
    could you share what was updated?
    thx 🙂

    • Well, it depends on when you last visited the site? 🙂

      Added:
      QoS/Bandwidth management suggestion, IPv6 for BNGs with PPPoE, IPv6 tweaks, IPv6 firewalling for both Edge and BNGs, slightly tweaked the IPv4 firewall rules for both, MTU section is finalised, CGNAT section is finalised. That’s about it I think.

  5. That is true :), it was the end of July.
    I received a notification from WordPress yesterday that a new post was added.
    As there was not any, I guessed it was due to the update of the blog.
    I don’t know if notifications are sent as well when the blog is updated.

  6. Jeff

    First off, thank you so much for this! I’m currently in the middle of a major network upgrade for our ISP and this post has been absolute gold. I’ve learned a ton.

    Anyways, I have a quick question about the Firewall DDoS protection jump chain.

    add action=jump chain=forward comment="Jump to DDoS detection" connection-state=new in-interface-list=WAN jump-target=detect-ddos

    Why is it only on the forward chain and not the input chain as well? The Mikrotik help page has it this way as well (forward chain only), but I’ve been running it on the input and forward chains for a while now. I haven’t had any issues, but I’m curious whether there is any particular reason why you do not have DDoS on the input chain?

    • Input chain means the router itself:
      1. There’s nothing it will do when the DDoS traffic is hitting the router, the link will still choke. You need to have proper DDoS mitigation from/with your upstream

      2. I’ve tested it for fun on the input chain and it ended up breaking traffic that’s destined towards the router itself such as DNS lookups.

      Hence there’s no point in applying those rules in the input chain. They are in the forward chain in order to protect your downstream users.

  7. Jeff

    Hey Daryll, got another question for you.

    I noticed that when disabling connection tracking on the Edge Router, the Mikrotik puts an auto RAW rule with action=no track for the prerouting and output chains at the very bottom of the RAW rule table.

    But with the prerouting action=no track rule being at the bottom, no packets are hitting that rule. So I assume the firewall is still tracking connections as they pass the RAW rules above it.

    The Mikrotik will not allow me to drag that rule to the top, but I can drag my other RAW rules below it and I can then see packets hitting that prerouting rule. But when I reboot the router, it places those rules back at the bottom again.

    Are you seeing the same thing? Or do your prerouting action=no track rules stay at the top? I’m on long-term v6.48.6, btw.

    • I believe MikroTik disables conn_track by means of no-tracking via the raw table, but more likely than not, what you’re seeing is a bug. As long as the packets are hitting your other raw rules, it isn’t an issue. Check the connection tracking tab; if there’s nothing there, then we know for sure that connection tracking is disabled. And as long as you’re seeing the expected routing performance and throughput, we can also safely assume connection tracking is disabled.

  8. Jeff

    So I just tried deleting all the rules and making sure I disabled connection tracking before entering the RAW rule set again. The Mikrotik still places those no track rules at the end of the rule set after a reboot. Is this a bug or is this a misunderstanding on my part?

    • I just tested it out on my personal router running v7.1.1 as I wrote this, I’m unable to replicate what you saw after rebooting. I’m 100% sure it’s a bug.

      I’d suggest a netinstall once with the latest long-term and ensure /system routerboard firmware is also running the latest long-term (rebooted twice for it to work).

  9. Jeff

    No, I definitely have connections there. The reason I rebooted my router was to see if those connections would disappear after moving the rules. I’ll submit a bug report to Mikrotik support. Thanks for the clarification.

  10. Jeff

    Yea, it’s a 6.48.6 bug. I’m seeing it on 4 of my CCR1036’s and I duplicated it on my home RB4011. I’ll be submitting this to support. Thanks again!

  11. Jeff

    Question: If you disable connection tracking, is there any real difference between a Filter rule and a RAW rule? I get wanting to keep all FW rules to an absolute minimum, but if connection tracking is disabled, then from a performance perspective it would be the same, except that a filter rule gives you more flexibility in terms of being able to block at the input or forward chain, whereas a RAW rule is more generic.

    For example, if I still wanted to keep an ACL whitelist for the input chain to the router for security reasons, my rule would look like this.

    /ip firewall filter
    add action=drop chain=input comment="Drop ALL except from TRUSTED" src-address-list=!TRUSTED

    • First, a caveat: the filter table cannot work without connection tracking. It is by definition stateful and hence needs state tracking.

      Edge/Border routers are not supposed to have connection tracking enabled. They are routers meant to route and forward traffic inter-AS as seamlessly as possible, not to filter or act as a firewall. The most we can do on the edge of a network is drop bogon traffic, as we know for a fact that it should never enter a network to begin with and serves no functional purpose. (I will update the IPv6 raw table to also drop some headers on the edge that, as per 2022 practices, serve no functional purpose)

      If you enable conn_track on the edge, the performance impact will be visible downstream to the customers and your eBGP router will just randomly reboot once customers saturate the conn_track table. On top of that, you’d be creating a butterfly effect of ugly NAT keep-alive or just keep-alive traffic to now choke not only the BNGs but also the eBGP routers and impact performance even further.

      The raw table gives us the ability to still firewall without the performance impact of stateful tracking. In other words, it’s stateless firewalling.

      So if you want to ACL access to the router, you can still use RAW like:
      /ip firewall raw
      add action=drop chain=prerouting comment="Drop ALL except from TRUSTED" src-address-list=!TRUSTED dst-address-list=[Your list containing IPs of the interface/router etc]

      However, although this is the most optimal option available on MikroTik, it is not the currently accepted standard or best practice, as the world moves to eBPF/XDP (while MikroTik has been playing catch-up for the last 10 years):
      https://blog.cloudflare.com/how-to-drop-10-million-packets/

      You can also find in the above article some data that shows no-track (conn_track disabled) outperforms conn_track enabled.

      At the end of the day, if a high-performance Edge/Border router is what a network needs, it’s certainly something MikroTik cannot deliver at this point in time.

  12. JJT

    Under your IPv6 raw rules, is there supposed to be a !lan_subnets drop rule for the Edge Router? I don’t see it.

    • Create it if I missed it on the rules. Drop anything that’s not the public prefixes allocated to your network.

      Edit: fe80::/10 should also be a member of lan_subnets to avoid breaking link-local.

  13. Jeff

    That’s the answer I was looking for. Thank you!

    • I’ve tweaked the !lan raw IPv6 rules. Now it makes more sense and removes the need for forward chain rule on BNG and simplifies it for the edge router.

      However, should IANA ever make changes to the IPv6 blocks, you’d need to update this manually.

      /ipv6 firewall raw
      add action=drop chain=prerouting comment="Drop spoofed traffic from LAN going towards Global Unicast" dst-address=2000::/3 in-interface-list=LAN src-address-list=!lan_subnets

  14. Jeff

    Ahhhh….so to reinforce your point….and regarding that earlier bug we found, I noticed that those FW Raw ‘no track’ rules disappear inside the Raw table when I disabled my lone FW input rule with connection tracking disabled. All makes sense now.

  15. Jeff

    I see you have updates, but it’s difficult to know what they are and where. Would it be possible to give some kind of changelog and/or highlight improvements/changes you have made?

    • I would need a systematic approach or maybe some WP plugin that can do the job. Do you know any? Writing a manual changelog for documentation this big is too much of a tedious task really.

  16. Riktam Basak

    Can you suggest a budget RADIUS server other than Radius Manager?

  17. Jeff

    Yea, I hear ya, just throwing the idea out there. The bold explanations do help quite a bit. The BGP Optimization section is a nice addition. I learn more each time I go through it.

    • Help spread the word and share this article with other network operators and engineers; it benefits the ecosystem if everyone deploys best practices end-to-end.

      If you know somebody who can convert this guide into a Cisco and Juniper equivalent, that’d be great too.

  18. Jeff

    Absolutely! Myself and another poster (who introduced me to this blog) on r/mikrotik on Reddit as well as the Mikrotik forums take every chance we get to share this with others. Keep up the excellent work. It’s very much appreciated!

    BTW, you have some minor typos you might wanna fix when you get a chance:

    The address list “global_unicast_prefix(es)” in your IPv6 raw rule doesn’t paste properly in the terminal

    add action=drop chain=prerouting comment="Drop invalids from WAN" dst-address-list=global_unicast_prefix(es) in-interface-list=WAN src-address-list=not_in_internet

    Had to drop the parentheses inside the address list name to get it to paste into the terminal correctly, like this: “global_unicast_prefixes”

    • MikroTik bug. I think I need to add quotations. I’ll fix it.

  19. Jeff

    And one last thing:

    I see you’re doing away with the IPv4 ICMP raw filtering. Do you no longer see a benefit to filtering by ICMP types? Also, I do not see any further ICMP accept rules. Is that somehow accepted in the implied “accept everything else from LAN/WAN” rules?

    • Yes. The kernel rate-limits ICMP/ICMPv6 by default anyway, and hence those rules are redundant and a waste of CPU. All ICMP/ICMPv6 is accepted; let the kernel handle rate limiting.

      Don’t miss the new RFC6890 section 🙂

  20. Jeff

    Yea I see the RFC6890 blackhole section, I think that part is awesome. I was doing that with my public subnet, but using it for RFC6890 is an excellent idea.

    In regards to ICMP, I get the rate limits, but what about the allowed ICMP types? Aren’t there some deprecated and malicious ICMP types that should not be allowed? Or I guess in this case, you only allow specific ICMP types?

    • Yeah, I will edit the RFC6890 section and say “Inject these rules into any router/L3 switch that has a routing table” – Makes perfect sense if you really think about it.

      There are deprecated ICMP types, yes, but I haven’t seen any hard evidence of them doing any damage if they aren’t blocked, and even if somehow they could, again they are rate limited. So why waste CPU power anyway. As long as everything else is properly configured, the network should be secure.

      You’d need ICMP filtering maybe for DoD/DARPA stuff or something, but eh, not at ISP level in my opinion. I’ve removed all ICMP filtering in my own network and my home routers as well.

  21. Jeff

    Previously, I couldn’t believe how much garbage this rule was collecting.

    #This rule should be redundant as we are now routing RFC6890 to blackhole directly and hence I am commenting it out#
    #add action=drop chain=forward comment="Drop tries to reach not public addresses from LAN" dst-address-list=not_in_internet in-interface-list=LAN out-interface-list=WAN#

    It was on every single router; regardless of the type of network, there were always tons of garbage. And I couldn’t believe there was that much of it, everywhere. My guess is random misconfigurations and/or crap device code.

    I then implemented the blackhole routes and WOW….it’s mostly all gone now.

    And I say “mostly” because there’s one small caveat I noticed. One of my sites has a failover which is a double NAT through another provider. With the failover WAN IP being in the 192.168.x.x subnet, bandit packets are still hitting the above rule. Which makes sense if you think about it. It’s a minor issue, not that big a deal, especially on a site of its kind. But I thought it was worth mentioning.

    • It’s not just misconfiguration. For unknown reasons my personal Windows 11, Debian-based, iOS and macOS devices all originate such packets. I never found an explanation.

      That’s expected behaviour at your specific site:
      A more specific route is always preferred over a less specific one. I’d leave it be; not much harm can happen there.

  22. Anav

    Hi Daryll. In terms of netmap, the way I understand it in layman’s terms is that if one has a subnet of fixed public IPs being netmapped to a larger group of private IPs, what happens is what I call a slice or jump pattern of assignment. Initially I thought: okay, for a block of 256 public IPs, the first 256 private IPs are assigned to the first public IP. Wrong. It’s private IPs 1, 257, 513 etc. that get assigned to that public IP. So it’s fair to say that the same block of private IPs (via slices or jumps) always gets the same public IP. Hope that helps.

    • I already knew that netmap ensures 1:1 mapping, i.e. 1.1.1.1 netmapped to 100.64.0.7 will persistently stay the same until reboot or similar, which is perfect for P2P/STUN/ICE/WebRTC/TURN. But the question is: why does netmap public/24 work with private/8, for example? The Linux man page suggests it shouldn’t.

      Edit: Wait a minute, this is “Anav” from the MikroTik forums, ain’t it? I’m just going to leave this here.

  23. Johan

    Hello
    That’s great work
    I have a question: what is the real purpose of loose TCP tracking?
    Is it a different kind of tracking from the original connection tracking?

    • Loose tracking = yes means don’t pick up already established connections twice (or more). Saving CPU and resources.

  24. Jeff

    Question regarding IPv6:

    The biggest reason I have yet to deploy it is Mikrotik’s limitation in being able to simultaneously queue both IPv4 and IPv6.

    How are you doing it? Is it easier since most ISPs in India use PPPoE?

    I just ran across the below from one of the big Mikrotik consultants using RADIUS via DHCP.

    https://stubarea51.net/2022/03/30/webinar-deploying-ipv6-for-wisps-and-fisps/

    • With PPPoE, it is easier. But I think you would need to give persistent IPv6 PD assignments (which you should be doing anyways), and then Queue on a per prefix basis where a customer is behind each of them.

      But if you’re going the DHCPv6 route – With Tik, there’s a problem. It can only hand out PD, but not addresses. Which means the customer will receive a /56 or shorter prefixes for LAN, but their WAN (Link prefix) will be null, unless you use a /64 on a per interface basis with SLAAC and configure the CPE to pick it up via SLAAC for WAN. But even then you’ll have a problem. SLAAC in Tik is not managed via RADIUS – So you won’t know which customer was assigned which address and so on.

      I’d suggest talking with stubarea51 consulting firm and let me know if you find a solution. I’ll add it to my guide.

      Matter of fact if you’re already doing DHCPv4, let me know the whole procedure (via emails), I’ll add that to my guide too – Like how did you set up DHCP Option 82, MAC binding, security. Did you use VRFs maybe? To repeat RFC1918 for different VLANs etc?

  25. Steven

    Hi Daryll
    Thanks for this article
    I have an idea about the routing loop. What about adding the RFC6890 prefixes to routing rules with a lookup-only table, so that the IP route in that table only sends them to a blackhole?

    • The whole point is to route less specifics to blackhole, which is applicable to both RFC6890 blocks and public pools.

      What purpose is the lookup supposed to serve? I don’t see the need to create (if I understood you correctly) a blackhole-only table.

  26. Steven

    Here is an example

    /routing table
    add disabled=no fib name=Blackhole
    /routing rule
    add action=lookup-only-in-table disabled=no dst-address=192.168.0.0/16 table=Blackhole
    add action=lookup-only-in-table disabled=no dst-address=10.0.0.0/8 table=Blackhole
    add action=lookup-only-in-table disabled=no dst-address=172.16.0.0/12 table=Blackhole
    add action=lookup-only-in-table disabled=no dst-address=255.255.255.255/32 interface=BNG table=Blackhole
    /ip route
    add blackhole comment=Blackhole disabled=no distance=1 dst-address=0.0.0.0/0 gateway="" pref-src="" routing-table=Blackhole scope=30 suppress-hw-offload=no \
    target-scope=10

    • I don’t see any reason to use a separate table. If anything, this would probably increase CPU usage, as the router now has to perform a rule lookup for each subnet.
