Problem: IPSEC VPN Down
At 2:44am CT the primary 10Mbps IPSEC VPN went down, but the 3Mbps MPLS worked flawlessly after route reconvergence. As the day progressed, the level of traffic between the two sites increased and began causing performance problems for users at Site B.
As we continued to troubleshoot what had happened, we found this syslog entry in Splunk that came from FW A:
Oct 14 02:44:33 fw.fw.fw.21 Oct 14 2010 02:44:33: %ASA-4-106023: Deny protocol 47 src inside:a.a.a.1 dst outside:b.b.b.254 by access-group "inside_access_in"
(Note: IP addresses have been changed here for security reasons.)
Nobody had made any changes at 2:44am. So what changed? After digging some more into our change management system, we found this change to FW A that was made back on 9/23:
Last Month - 9/23/2010 12:00:18 AM
ADDS 0, DELETES 0, CHANGES 1
access-list inside_access_in extended permit gre host a.a.a.1 host b.b.b.254
access-list inside_access_in extended permit gre host a.a.a.254 host b.b.b.254
- Oct 14 02:44:26 fw.fw.fw.21 Oct 14 2010 02:44:26: %ASA-3-713123: Group = [FW B InternetIP], IP = [FW B InternetIP], IKE lost contact with remote peer, deleting connection (keepalive type: DPD)
- Oct 14 02:44:26 fw.fw.fw.21 Oct 14 2010 02:44:26: %ASA-5-713259: Group = [FW B InternetIP], IP = [FW B InternetIP], Session is being torn down. Reason: Lost Service
- Oct 14 02:44:26 fw.fw.fw.21 Oct 14 2010 02:44:26: %ASA-4-113019: Group = [FW B InternetIP], Username = [FW B InternetIP], IP = [FW B InternetIP], Session disconnected. Session Type: IPsec, Duration: 21d 15h:00m:15s, Bytes xmt: 181785169, Bytes rcv: 3049561298, Reason: Lost Service