Border Gateway Protocol (BGP) is the interdomain routing protocol of the Internet, used for routing between Autonomous Systems (AS). In other words, BGP is an exterior gateway protocol (EGP), whereas OSPF, EIGRP and RIP are interior gateway protocols (IGPs). BGP therefore differs from the IGPs in several respects.
BGP is a “path-vector” routing protocol (RIP is distance-vector, OSPF is link-state). BGP is classless and hence supports VLSM. Compared to OSPF and EIGRP, however, BGP converges slowly.
BGP is used extensively in very large networks and can also be used in multi-vendor environments. The current version of BGP is version 4. An AS can be described as a group of networks under a single administrative control.
Examples of an AS include an Internet Service Provider (ISP) or a large enterprise organization. AS numbers are standardized and assigned under the authority of IANA. The original 2-byte AS number range is 0 to 65535 (4-byte AS numbers extend this range). BGP runs over TCP, on port 179, for reliable delivery of its messages.
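The AS numbering mentioned above can be illustrated with a small classifier. This is only a sketch: the function name `classify_asn` is our own, and the special ranges in it (0 and 65535 reserved, 23456 as AS_TRANS, 64496 to 64511 for documentation, 64512 to 65534 for private use) come from the IANA registry and related RFCs, not from the lab itself.

```python
def classify_asn(asn: int) -> str:
    """Return a coarse usage category for a 2-byte AS number.

    Hypothetical helper for illustration; ranges follow the IANA registry.
    """
    if not 0 <= asn <= 65535:
        raise ValueError("not a 2-byte AS number")
    if asn in (0, 65535):
        return "reserved"
    if asn == 23456:
        return "AS_TRANS"          # placeholder used for 4-byte AS transition
    if 64496 <= asn <= 64511:
        return "documentation"
    if 64512 <= asn <= 65534:
        return "private"
    return "public"


for asn in (3356, 64512, 65535):
    print(asn, classify_asn(asn))
```

By this classification the lab AS 65535 falls in the reserved category, so 64512 to 65534 is the range usually chosen for private lab use.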
In our lab topology, we have implemented BGP within a single AS. All the routers R1, R2, R3 and R4 belong to a single AS (65535). As with the IGPs, BGP routers need to establish neighbor relationships. BGP-enabled routers are known as BGP speakers, and once they establish a neighbor relationship they are called BGP peers. Within an AS, the iBGP speakers must be peered in a full mesh (unless route reflectors or confederations are used), because a route learned from one iBGP peer is not advertised to another iBGP peer.
If BGP speakers that belong to different ASes form a neighbor relationship, they are called “eBGP peers”. In our lab topology, since all the routers lie within the same AS, they are “iBGP peers”. After the peering is established, the BGP peers exchange their full BGP routing tables with each other. After that, only triggered (incremental) updates are sent.
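To see what a full mesh means for our four routers, the required sessions can be enumerated as all unordered router pairs, which gives n(n-1)/2 sessions. The helper name `full_mesh_sessions` is our own and the snippet is only an illustration.

```python
from itertools import combinations


def full_mesh_sessions(routers):
    """Return every unordered pair of routers, i.e. one iBGP session per pair."""
    return list(combinations(routers, 2))


sessions = full_mesh_sessions(["R1", "R2", "R3", "R4"])
for a, b in sessions:
    print(f"{a} <-> {b}")
print(len(sessions))  # 4 * 3 / 2 = 6 sessions
```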
In our lab topology, R1 has peering with R2, R3 and R4; R2 has peering with R1, R3 and R4; R3 has peering with R1, R2 and R4; and R4 has peering with R1, R2 and R3. For this peering to happen, BGP uses a TCP connection on port 179 to exchange a series of messages. The peering session also goes through several states before it is fully established. Let’s consider the peering between R1 (220.127.116.11) and R2 (18.104.22.168) to understand BGP peers and the peer session.
The first state of the BGP peer session is “IDLE”. It moves from “IDLE” to “CONNECT” when the router starts the TCP connection by sending the first “SYN”. If the TCP connection with the remote peer succeeds, the router sends an “OPEN” message; otherwise it moves to the “ACTIVE” state after the “ConnectRetry” timer has expired. A successful TCP connection is characterized by a “3-Way Handshake” (SYN, SYN-ACK, ACK). In our lab topology, R1 initiates the TCP connection by sending a “SYN” to R2 on destination port 179.
R2 responds with a “SYN-ACK”, sent from its port 179 to the source port that R1 used. The TCP connection is established in the third step, when R1 sends an “ACK” for the “SYN-ACK” to R2. In the “ACTIVE” state, BGP again attempts to establish the TCP connection with the remote peer. If it succeeds, an “OPEN” message is sent. If it does not, the session goes back to “CONNECT” after the “ConnectRetry” timer has expired.
In other words, until the TCP connection succeeds, the session keeps flapping between the “CONNECT” and “ACTIVE” states. After the TCP connection is established, the peer sends the “OPEN” message; this is the “OPENSENT” state. In this state, the BGP peer that has sent the “OPEN” message waits for the “OPEN” message from the remote peer. When it receives that message, it sends a “KEEPALIVE” message and moves to “OPENCONFIRM”. In the “OPENCONFIRM” state, the BGP peer waits for a “KEEPALIVE” message from the remote peer. Once that arrives, BGP goes to the “ESTABLISHED” state.
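The state changes described above can be sketched as a small transition table. This is a simplified model of the session as described in this lab, not the complete state machine from the BGP specification; the event names (`start`, `tcp_ok`, `connect_retry_expired`, and so on) are our own labels, and treating any unknown event as "stay in the same state" is an assumption.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPENSENT = "OpenSent"
    OPENCONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state. Event names are hypothetical labels.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_ok"): State.OPENSENT,            # OPEN is sent
    (State.CONNECT, "connect_retry_expired"): State.ACTIVE,
    (State.ACTIVE, "tcp_ok"): State.OPENSENT,             # OPEN is sent
    (State.ACTIVE, "connect_retry_expired"): State.CONNECT,
    (State.OPENSENT, "open_received"): State.OPENCONFIRM,  # KEEPALIVE is sent
    (State.OPENCONFIRM, "keepalive_received"): State.ESTABLISHED,
}


def next_state(state, event):
    """Return the state after an event; a NOTIFICATION resets the session."""
    if event == "notification":
        return State.IDLE
    # Assumption: events that do not apply leave the state unchanged.
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.IDLE):
    """Apply events in order and return the list of states visited."""
    trace = [state]
    for event in events:
        state = next_state(state, event)
        trace.append(state)
    return trace


trace = run(["start", "connect_retry_expired", "connect_retry_expired",
             "tcp_ok", "open_received", "keepalive_received"])
print(" -> ".join(s.value for s in trace))
```

The example run shows one CONNECT/ACTIVE flap before the TCP connection succeeds and the session proceeds to ESTABLISHED.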
“UPDATE” messages, which contain routing information, are exchanged between BGP peers only after the session is ESTABLISHED. These messages can be observed in detail by running the CLI command “debug ip bgp all”. BGP messages include: OPEN, KEEPALIVE, UPDATE and NOTIFICATION. The “OPEN” message is sent between BGP peers to initiate the BGP session. It contains parameters such as BGP Version (4), My AS: 65535, Hold Time: 180 and BGP Identifier: 22.214.171.124 (from Wireshark captures in which the source is R1 and the destination is R2).
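To make the OPEN parameters concrete, the sketch below builds and parses a minimal OPEN message using the standard BGP-4 layout: a 19-byte header (16-byte marker of all ones, 2-byte length, 1-byte type) followed by version, My AS, Hold Time, BGP Identifier and an optional-parameters length. The function names are our own, and the message carries no optional parameters, so it will not match the lab capture byte for byte if the routers advertised capabilities.

```python
import socket
import struct

MARKER = b"\xff" * 16   # header marker, all ones
HEADER_LEN = 19         # marker (16) + length (2) + type (1)
TYPE_OPEN = 1           # message type code for OPEN


def build_open(my_as, hold_time, bgp_id, version=4):
    """Build an OPEN message with no optional parameters."""
    body = struct.pack("!BHH4sB", version, my_as, hold_time,
                       socket.inet_aton(bgp_id), 0)
    header = MARKER + struct.pack("!HB", HEADER_LEN + len(body), TYPE_OPEN)
    return header + body


def parse_open(msg):
    """Parse the fixed part of an OPEN message into a dict."""
    length, msg_type = struct.unpack("!HB", msg[16:19])
    if msg[:16] != MARKER or msg_type != TYPE_OPEN:
        raise ValueError("not a BGP OPEN message")
    version, my_as, hold_time, ident, opt_len = struct.unpack(
        "!BHH4sB", msg[19:29])
    return {
        "length": length,
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,
        "bgp_identifier": socket.inet_ntoa(ident),
        "opt_param_len": opt_len,
    }


# Values taken from the R1 -> R2 capture described above.
msg = build_open(my_as=65535, hold_time=180, bgp_id="22.214.171.124")
print(parse_open(msg))
```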
All of these parameters are the same when the “OPEN” message is sent from R2 to R1, except the BGP Identifier (126.96.36.199). “KEEPALIVE” messages are sent between peers to detect whether a peer is still available or dead. The default interval for “KEEPALIVE” messages is 60 seconds. Before declaring a peer dead, however, a BGP peer waits a certain amount of time for a “KEEPALIVE” message to arrive. If none is received during that wait period, the peer is considered dead. This time interval is known as the “HOLD-TIME” and its default value is 180 seconds.
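The relationship between the two timers can be sketched as follows. The function names are our own. The rules used here (the session uses the smaller of the two hold times offered in the OPEN messages, the keepalive interval is one third of the hold time, and a hold time of 0 disables the timers) reflect the BGP-4 specification and common defaults rather than anything measured in this lab.

```python
def negotiated_hold_time(local, remote):
    """The session uses the smaller of the two hold times offered in OPEN."""
    return min(local, remote)


def keepalive_interval(hold_time):
    """Common default: send KEEPALIVEs at one third of the hold time."""
    return hold_time // 3


def peer_is_dead(last_received, now, hold_time):
    """True if nothing was received from the peer for more than hold_time seconds.

    A hold time of 0 means the timers are disabled, so the peer never expires.
    """
    return hold_time > 0 and (now - last_received) > hold_time


hold = negotiated_hold_time(180, 180)
print(hold, keepalive_interval(hold))
print(peer_is_dead(last_received=0, now=120, hold_time=hold))
print(peer_is_dead(last_received=0, now=181, hold_time=hold))
```

With the defaults from the lab, two KEEPALIVEs can be lost before the third missed one lets the hold timer expire.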
“UPDATE” messages are used to exchange routing information between the BGP peers after the peering has been established. For instance, the Wireshark captures between R1 and R2 show the networks learnt through the “UPDATE” messages. The “UPDATE” messages from R1 to R2 show “Network Layer Reachability Information” (NLRI) of 188.8.131.52/32 and 184.108.40.206/32, whereas the “UPDATE” message from R2 to R1 shows an NLRI of 220.127.116.11/32.
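The NLRI field seen in the captures uses a compact encoding: one byte for the prefix length, followed by only as many bytes of the prefix as that length requires. The sketch below encodes and decodes IPv4 NLRI this way; the function names are our own, and 10.1.0.0/16 is a made-up prefix added only to show a shorter encoding.

```python
import ipaddress


def encode_nlri(prefix):
    """Encode one IPv4 prefix as <length byte><minimal prefix bytes>."""
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def decode_nlri(data):
    """Decode a sequence of encoded IPv4 prefixes back into strings."""
    prefixes = []
    i = 0
    while i < len(data):
        plen = data[i]
        nbytes = (plen + 7) // 8
        # Pad the truncated prefix back to 4 bytes before converting.
        raw = data[i + 1:i + 1 + nbytes] + b"\x00" * (4 - nbytes)
        prefixes.append(f"{ipaddress.IPv4Address(raw)}/{plen}")
        i += 1 + nbytes
    return prefixes


# The two /32 prefixes advertised by R1 in the capture.
nlri = encode_nlri("188.8.131.52/32") + encode_nlri("184.108.40.206/32")
print(len(nlri), decode_nlri(nlri))

# Hypothetical shorter prefix: only two prefix bytes are carried.
print(encode_nlri("10.1.0.0/16").hex())
```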
“NOTIFICATION” messages are sent when an error is detected, and they cause the BGP session to be reset. BGP has an Administrative Distance (AD) of 20 for eBGP routes and 200 for iBGP routes. This can be checked using the CLI command “show ip protocols”. Also, automatic summarization is disabled by default in BGP.
The TTL value for BGP peers within the same AS is 255 (as seen in the Wireshark captures), whereas the default TTL between eBGP peers is only 1. This can be changed if required. Unlike OSPF and RIP, which use cost and hop count respectively, BGP does not have a single metric. Instead, it uses several attributes to determine the optimal path to a destination network.
The attributes fall into four categories: Well-known Mandatory, Well-known Discretionary, Optional Transitive and Optional Non-Transitive. Well-known Mandatory attributes are standard for all BGP implementations and are included in every BGP update that advertises routes. Well-known Discretionary attributes are also standard for all BGP implementations, but including them in an update is optional. Optional Transitive attributes are not necessarily supported by every BGP implementation; a router that does not recognize such an attribute still passes it on when sending updates. Optional Non-Transitive attributes are likewise not necessarily supported by every implementation, but a router that does not recognize such an attribute removes it when sending updates.
The Well-known Mandatory attributes are “ORIGIN”, “AS_PATH” and “NEXT_HOP”. “ORIGIN” (type code 1) identifies how the route originated: from an IGP, from EGP, or from some other source (incomplete). “AS_PATH” (type code 2) lists all the ASes traversed to reach a destination. Each AS that propagates a route to another AS prepends its own AS number, and a router rejects a route whose “AS_PATH” already contains its own AS number, which prevents loops. The “AS_PATH” also gives the path length (number of ASes traversed); by default, the shortest path is preferred.
“NEXT_HOP” (type code 3) carries the address of the next hop toward the destination. A router advertising a route to an eBGP peer sets its own address as the “NEXT_HOP”, and the receiving router uses it to build its routing table.
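The “AS_PATH” behavior described above (prepending on the way out, loop detection on the way in) can be sketched with plain lists. The function names and the AS numbers 65001 to 65003 are made up for illustration; our lab uses a single AS, so no prepending actually takes place there.

```python
def advertise_to_ebgp(as_path, local_as):
    """Return the AS_PATH as sent to an eBGP peer: local AS prepended."""
    return [local_as] + list(as_path)


def accept_route(as_path, local_as):
    """Reject a route whose AS_PATH already contains our own AS (a loop)."""
    return local_as not in as_path


# A route originated in AS 65001 and passed on by AS 65002.
path = advertise_to_ebgp([], 65001)
path = advertise_to_ebgp(path, 65002)
print(path, len(path))

print(accept_route(path, 65003))  # new AS, route is accepted
print(accept_route(path, 65001))  # route came back to its origin, rejected
```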
BGP can select the best route when multiple routes are available for the same destination network. The selection is based on an 11-step process in which BGP compares the attributes of the routes in a specific order. The order is: Weight, Local Preference, Locally Originated, AS Path, Origin Code, Multi-Exit Discriminator (MED), BGP Route Type (eBGP over iBGP), IGP Metric to the Next Hop, Age, Router ID, Peer IP Address.
This lab gave us detailed insight into the concept of an AS, eBGP and iBGP peering, the use of an IGP in iBGP, BGP neighbor relationships, BGP messages, and BGP attributes and their importance in selecting the best route.
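The ordered comparison used in best-route selection can be sketched as a sort key in which earlier fields win over later ones. The sketch covers only the first seven criteria (Weight through eBGP over iBGP); the `Route` class, its field names and its default values (weight 0, local preference 100) are assumptions for illustration, and the AS numbers are made up.

```python
from dataclasses import dataclass

# Lower rank is preferred: IGP, then EGP, then incomplete.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Route:
    peer: str
    weight: int = 0              # higher is preferred
    local_pref: int = 100        # higher is preferred
    locally_originated: bool = False
    as_path: tuple = ()          # shorter is preferred
    origin: str = "igp"
    med: int = 0                 # lower is preferred
    ebgp: bool = False           # eBGP is preferred over iBGP


def preference_key(route):
    """Tuple that sorts the most preferred route first."""
    return (
        -route.weight,
        -route.local_pref,
        not route.locally_originated,
        len(route.as_path),
        ORIGIN_RANK[route.origin],
        route.med,
        not route.ebgp,
    )


def best_path(routes):
    """Pick the route with the smallest key; ties keep the first candidate."""
    return min(routes, key=preference_key)


via_r2 = Route(peer="R2", as_path=(65001, 65002))
via_r3 = Route(peer="R3", as_path=(65003,))
print(best_path([via_r2, via_r3]).peer)   # shorter AS path wins

via_r2.local_pref = 200
print(best_path([via_r2, via_r3]).peer)   # local preference is checked first
```

Because Python compares tuples element by element, a difference in an earlier attribute decides the result before later attributes are even looked at, which mirrors the step-by-step order of the selection process.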