Saturday, 11 March 2017

Setup and Troubleshooting of IPSec VPN between AWS and Juniper SRX Firewall

Setting up IPSec VPNs in AWS is pretty simple - virtually all the work is done for you and they even provide you with a config template to blow onto your device. There are only a couple of points to remember while doing this to make sure you get a good, working VPN at the end - in this post I'll quickly show the setup and how to troubleshoot some of the more likely snags that you could run into.

Setup - AWS End


To set up an IPSec VPN into an AWS VPC you require 3 main components - the Virtual Private Gateway (VPG, which AWS also abbreviates as VGW), the Customer Gateway (CG) and the actual VPN connection.


The VPG is just a named device, like an Internet Gateway (IGW). Create a VPG and name it.


Attach the VPG to your VPC so that it can be used.

Next we need to create a Customer Gateway (CG) profile:


This defines the parameters of the opposite end of the tunnel (i.e. our SRX firewall), the most important being its IP address. For our simple case we'll just use static routing but BGP is also an option.


Next we create a VPN connection profile:


The VPN connection profile basically ties the other two objects together and defines the IP prefix(es) that will be tunnelled over IPSec to the other end.

Once this is created you can download configuration templates for various device types; in our case we want Juniper SRX:


At this point the AWS VPN configuration is basically complete. Download the configuration template and open it in something which handles UNIX-style end of line markers (e.g. Notepad++, Wordpad) ready to configure the firewall end.

Setup - Juniper SRX End


Assuming some sort of working basebuild, the Juniper SRX configuration is almost a straight copy and paste from the configuration templates. There are a couple of key exceptions:

  • IKE interface binding (lines 54 & 173 at time of writing) - you should override this with the "outside" interface of your firewall. For xDSL this will probably be pp0.0, for Ethernet based devices it could be fe-x/x/x.0 or vlan.x
  • Routing (lines 134 & 253 at time of writing) - the config template does not contain the actual routes you will need, or even a sensible default such as 172.31.0.0/16 to cover the default VPC.
  • It's probably worth un-commenting the traceoptions lines to give some debugging output in the event of tunnel problems.

Once the template is applied you may have the desired connectivity; if not, read on...
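
As a rough sketch, the overrides listed above might look something like the following once filled in. The gateway name gw-vpn-xxxxxxxx-1, the outside interface ge-0/0/0.0 and the 172.31.0.0/16 prefix are placeholders assumed for illustration, and st0.1 is assumed to be the tunnel interface the template creates for the first tunnel. Use the names from your own downloaded template and the prefix(es) of your own VPC, and repeat for the second tunnel with its own gateway name and tunnel interface.

# Bind IKE to the outside interface (gateway name and interface are placeholders)
root@Lab-SRX# set security ike gateway gw-vpn-xxxxxxxx-1 external-interface ge-0/0/0.0
# Route the VPC prefix into the tunnel (prefix and st0 unit are assumptions)
root@Lab-SRX# set routing-options static route 172.31.0.0/16 next-hop st0.1
# Enable IKE debugging output
root@Lab-SRX# set security ike traceoptions file kmd
root@Lab-SRX# set security ike traceoptions flag all
root@Lab-SRX# commit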

Troubleshooting


Firstly, we need to check phase 1 of the VPN (IKE) is up:

root@Lab-SRX> show security ike security-associations
Index   State  Initiator cookie  Responder cookie  Mode           Remote Address
4862528 UP     53a352fbe8fbf11a  26d9edf2e3a2d371  Main           52.56.146.67
4862529 UP     901117dbc101ce98  a1c21584e8cd22e2  Main           52.56.194.28


This shouldn't be a problem as the template basically takes care of all the proposals and whatnot being correct. If there aren't 2 SAs in an UP state then check you put the right IP address into the AWS Customer Gateway configuration.
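
If an SA is missing or never reaches the UP state, a couple of standard operational commands give a bit more to go on. This is only a sketch of where to look; the log file name kmd assumes traceoptions were enabled with that file name.

root@Lab-SRX> show security ike security-associations detail
root@Lab-SRX> show log kmd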


Next, we check IPSec is up:

root@Lab-SRX> show security ipsec security-associations
  Total active tunnels: 2
  ID    Algorithm       SPI      Life:sec/kb  Mon lsys Port  Gateway
  <131073 ESP:aes-cbc-128/sha1 49d38075 3543/ unlim U root 500 52.56.146.67
  >131073 ESP:aes-cbc-128/sha1 b3b5474b 3543/ unlim U root 500 52.56.146.67
  <131074 ESP:aes-cbc-128/sha1 4df0b3b 3543/ unlim U root 500 52.56.194.28
  >131074 ESP:aes-cbc-128/sha1 2e1e40aa 3543/ unlim U root 500 52.56.194.28


This should show both tunnels, each with an SA in each direction (direction denoted by the "<" and ">"). Again, very little is likely to go wrong here as the template should cover everything.

Assuming that's good, we would now check IPSec statistics:

root@Lab-SRX> show security ipsec statistics
ESP Statistics:
  Encrypted bytes:             5472
  Decrypted bytes:             3024
  Encrypted packets:             36
  Decrypted packets:             36
AH Statistics:
  Input bytes:                    0
  Output bytes:                   0
  Input packets:                  0
  Output packets:                 0
Errors:
  AH authentication failures: 0, Replay errors: 0
  ESP authentication failures: 0, ESP decryption failures: 0
  Bad headers: 0, Bad trailers: 0

root@Lab-SRX>



Ideally we want to see both encrypted and decrypted packets - if one way isn't working then probably the (would be) sender is at fault. Verify that the configuration template was fully applied.

Next we check the secure tunnel interface statistics - a good idea is to ping the other end of the tunnel to see if the counters increase:

root@Lab-SRX> show interfaces st0 | match packets
    Input packets : 32
    Output packets: 0
    Input packets : 32
    Output packets: 0

root@Lab-SRX> ping 169.254.66.229
PING 169.254.66.229 (169.254.66.229): 56 data bytes
64 bytes from 169.254.66.229: icmp_seq=0 ttl=254 time=12.627 ms
64 bytes from 169.254.66.229: icmp_seq=1 ttl=254 time=12.342 ms
64 bytes from 169.254.66.229: icmp_seq=2 ttl=254 time=12.169 ms
64 bytes from 169.254.66.229: icmp_seq=3 ttl=254 time=12.314 ms
^C
--- 169.254.66.229 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/stddev = 12.169/12.363/12.627/0.166 ms

root@Lab-SRX> show interfaces st0 | match packets
    Input packets : 36
    Output packets: 4
    Input packets : 32
    Output packets: 0

root@Lab-SRX>

A working ping to the other end with counters incrementing really indicates that the tunnel is formed OK and able to carry traffic. If this works but "real" traffic doesn't then there is most likely some basic configuration missing:


Check Intra-zone Traffic Permitted

By default you can't pass traffic between interfaces of the same zone on the SRX. It's common not to have more than one routed interface in a zone so this is easily overlooked. Just add it as follows:

root@Lab-SRX# set security policies from-zone trust to-zone trust policy allow-all match source-address any
root@Lab-SRX# set security policies from-zone trust to-zone trust policy allow-all match destination-address any
root@Lab-SRX# set security policies from-zone trust to-zone trust policy allow-all match application any
root@Lab-SRX# set security policies from-zone trust to-zone trust policy allow-all then permit
root@Lab-SRX# commit

You should now see your "real" traffic causing the VPN statistics to increment, even if the hosts at each end cannot communicate with one another.
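
To confirm that the intra-zone policy is actually being matched, it can help to look at the policy and the session table while sending traffic. A minimal sketch, assuming the trust zone used above and 172.31.0.0/16 as a stand-in for your own VPC prefix:

root@Lab-SRX> show security policies from-zone trust to-zone trust
root@Lab-SRX> show security flow session destination-prefix 172.31.0.0/16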

Check AWS Routing Table

One thing that is easily forgotten when creating a new VGW is that in order to use it, a route entry must exist for the subnet sending traffic via the VGW. This needs to be created manually:


Simply edit the routing table(s) applied to your network(s) and set the next hop for your tunnelled networks to be the VPG appliance. At this point you may find that traffic from AWS towards the SRX works but in the opposite direction it does not...

Check AWS Security Group


If at this stage you have one-way connectivity then almost certainly all you need to do is to allow the VPN range inbound on your security group(s). Remember that VPC security groups are stateful and all outbound traffic (and its replies) is allowed by default.

If required, simply add rules allowing the appropriate traffic from the IP block that is tunnelled back to the SRX. In this case to keep it simple we just allow open access:


If it still doesn't work, roll back the SRX config, blow away all the elements of the VPN and start again!