Tuesday, December 20, 2011

Site-to-Site VPN between Check Point and Cisco ASA

It's a common occurrence that we have to configure Site-to-Site VPNs between Check Point firewalls and Cisco devices (ASAs and routers).
But configuring a Site-to-Site VPN in Check Point with a 3rd-party device is sometimes a bit tricky. This is because of the automatic summarisation (supernetting) of networks at the Check Point end.
When creating the Site-to-Site VPN we have to follow these basic steps.
  • Enable VPN feature on the Check Point firewall
    • Configure the Encryption Domain
  • Create an Interoperable device for the remote end VPN terminator
    • Configure the Encryption Domain
  • Create a VPN Community and configure the parameters for the VPN (IKE, IPSec parameters)
  • Configure Firewall rules for the communication
After configuring all of this correctly we will (most of the time) run into the famous
"No Valid SA when creating IPSEC tunnel with an interoperable device" problem.

According to sk39419:

"The nature of this problem is due to the ability of the Check Point Security Gateway to dynamically supernet subnets to reduce the amount of SA overhead normally generated by VPN traffic. Most third party vendors are inherently static and therefore do not have the ability to understand this dynamic behaviour."

This article also provides 3 possible solutions for this.

But if you just play around with this you will find an interesting behaviour.

This is what I observed when I played with one of the VPNs I had to troubleshoot.

Our Encryption Domain (behind the Check Point firewall) is a straightforward 10.16.0.0/24 network.
The remote end Encryption Domain (behind the Cisco ASA) had two hosts, 192.168.1.240 and 192.168.1.241. So if I create two host objects for these two IPs, add them to a group object and configure that group object as the Encryption Domain of the Interoperable Device, this is what happens.
IKE Main mode completes correctly.
In the Quick mode negotiation the IDs are sent as (10.16.0.0/24 - 192.168.0.0/16) from Check Point to the ASA.
As the ASA is not configured to accept 192.168.0.0/16 as the ID, it will not establish the IPSec tunnel.
One solution we can provide (also mentioned in sk39419) is to set the "One VPN Tunnel per each subnet pair" option in the VPN Community -> Tunnel Management section.
The problem with this is that it increases the number of phase 2 SAs. This is not desirable, even though it solves the problem.

So luckily we can supernet the IPs used for the two hosts to 192.168.1.240/31 (thus following solution C in sk39419). So we created a Network object (say Net_remote_enc_domain) with IP address 192.168.1.240 and net mask 255.255.255.254 and added it to the encryption domain of the Interoperable device.
It worked!!! :)
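A quick way to double check the mask for a /31 is Python's standard ipaddress module. This is only a small sketch: the helper names describe and is_valid_mask are made up for this post, and the addresses are the ones from the example above.

```python
import ipaddress

def describe(cidr):
    """Return the dotted netmask and the addresses covered by a CIDR block."""
    net = ipaddress.ip_network(cidr)
    return str(net.netmask), [str(addr) for addr in net]

def is_valid_mask(address, mask):
    """True if address/mask forms a valid network (contiguous mask, no host bits set)."""
    try:
        ipaddress.ip_network(f"{address}/{mask}")
        return True
    except ValueError:
        return False

# The /31 covers exactly the two remote hosts.
print(describe("192.168.1.240/31"))
# A mask ending in .254 is a valid /31; one ending in .253 is not a contiguous mask.
print(is_valid_mask("192.168.1.240", "255.255.255.254"))
print(is_valid_mask("192.168.1.240", "255.255.255.253"))
```

The same check is handy before typing a mask into a Network object, since a non-contiguous mask is easy to enter by mistake.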

But later, there was a requirement to add an additional host to the remote end's encryption domain: 192.168.1.218.
If we were to use a single supernet covering the three hosts, we would end up adding 64 addresses (192.168.1.192/26) to the encryption domain. The remote end did not want this.
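To see where the figure of 64 comes from, here is a rough sketch that widens a network one bit at a time until it covers all three hosts. The function name smallest_supernet is my own; only the standard ipaddress module is used.

```python
import ipaddress

def smallest_supernet(hosts):
    """Return the smallest single network that contains every host in the list."""
    addrs = [ipaddress.ip_address(h) for h in hosts]
    net = ipaddress.ip_network(f"{addrs[0]}/32")
    # Widen the prefix by one bit at a time until all addresses fit.
    while not all(addr in net for addr in addrs):
        net = net.supernet()
    return net

net = smallest_supernet(["192.168.1.240", "192.168.1.241", "192.168.1.218"])
print(net, net.num_addresses)
```

For these three addresses the result is a /26, which is why a single supernet drags in far more addresses than the remote end wanted.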
Still, we wanted to keep following solution C of sk39419:
"Change the encryption domain of the Security Gateway to use a specific subnet instead of using hosts or groups. This will stop the Check Point Security Gateway from supernetting hosts since they now are part of a subnet which has been manually defined."

Now I wanted to play around a little bit.
First I created a group object (grp_remote_enc_domain), added the network object Net_remote_enc_domain (which is 192.168.1.240/31) and a host object (say remote_host3) with IP 192.168.1.218, and configured the group as the encryption domain of the Interoperable device (i.e. the remote end VPN terminator).

Note: We were informed that there are two ACLs created at the ASA end, matching 192.168.1.240/31 and 192.168.1.218/32.

I enabled IKE debugging on our Check Point firewall (using vpn debug ikeon) and initiated traffic from our end.
First to 192.168.1.240, then to 192.168.1.241 and finally to 192.168.1.218.
The first two communications were successful, but the final one failed.
When I examined ike.elg using IKEView, I observed the following phase 2 IDs being exchanged:
(10.16.0.0/24 - 192.168.1.240/31) - completing Quick mode
(10.16.0.0/24 - 192.168.0.0/16) - Quick mode failed

So I came to the conclusion that whenever a host object is encountered in the Encryption Domain, Check Point will try to summarise (supernet) its IP address.
So instead of the host object, I created a Network object with a /32 prefix length:
Net_remote_host2 (with IP address 192.168.1.218 and net mask 255.255.255.255).
So now the grp_remote_enc_domain group object has two members:
 - Net_remote_enc_domain
 - Net_remote_host2

I saved the policy and installed it on the gateway.
Now when I initiated the traffic (with IKE debug on) it WORKED!!! for all three remote hosts.
When I examined the logs there were two IDs negotiated (thus 2 SAs created).
(10.16.0.0/24 - 192.168.1.240/31) for 192.168.1.240 and 192.168.1.241 hosts
(10.16.0.0/24 - 192.168.1.218/32) for 192.168.1.218 host.
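The behaviour can be pictured with a very simplified model. It assumes the ASA only accepts a Quick mode remote ID that exactly equals one of its configured ACL entries, which is a simplification on my part and not a description of the real ASA matching logic. The names ASA_ACL_REMOTE and quick_mode_accepted are made up for the sketch.

```python
import ipaddress

# Remote-side networks the ASA was said to have ACLs for (from the note above).
ASA_ACL_REMOTE = [
    ipaddress.ip_network("192.168.1.240/31"),
    ipaddress.ip_network("192.168.1.218/32"),
]

def quick_mode_accepted(proposed_remote_id, acl=ASA_ACL_REMOTE):
    """Simplified model: accept only if the proposed ID equals an ACL entry exactly."""
    return ipaddress.ip_network(proposed_remote_id) in acl

# IDs seen in the debug output, in the order they appear in this post.
for proposal in ("192.168.1.240/31", "192.168.0.0/16", "192.168.1.218/32"):
    print(proposal, quick_mode_accepted(proposal))
```

In this model the supernetted /16 proposal is rejected while the /31 and /32 proposals match, which lines up with what IKEView showed.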

So I didn't have to edit objects_5_0.C and change the variable ike_use_largest_possible_subnets to false
or
configure the "max_subnet_for_range" table in $FWDIR/lib/user.def on the management server (SmartCenter).

We don't like the above two options because they are not visible in the Dashboard, and if they are not well documented, future firewall admins will eventually lose track of them.

Tuesday, August 2, 2011

Sudden drop in traffic through some interfaces in Power-1 (solved)

Recently we encountered a strange behaviour in one of the Power-1 clusters deployed at one of our telco customers. A brief explanation of the problem is as follows.
AAA traffic was entering through the external interface and was destined for a RADIUS server behind another interface. In the traffic graphs they saw a sudden drop in traffic, and all the authentication requests were lost at that instant. After a couple of seconds the traffic was back to normal. This happened not only in the peak hours but at other times too. Even when they switched from one Power-1 device to the other, the problem remained.
After digging into the problem we came up with a solution.
As we suspected, interface drops had been recorded. You can get an idea of the Tx/Rx errors and drops by issuing "ifconfig" followed by the interface name.
When we issued this command for the relevant interfaces, we noticed a huge number of Rx drops on the external interface.
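If you want to pull the drop counters out of that output with a script, something like the following works as a rough sketch. It assumes the older net-tools layout ("RX packets:... errors:... dropped:..."); newer ifconfig versions print a different format. The sample text, interface name and numbers are made up for illustration.

```python
import re

# Made-up sample in the older net-tools ifconfig format (illustrative values only).
SAMPLE_IFCONFIG = """\
eth0      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          RX packets:1000000 errors:0 dropped:52341 overruns:0 frame:0
          TX packets:900000 errors:0 dropped:0 overruns:0 carrier:0
"""

COUNTER_LINE = re.compile(
    r"^\s*(RX|TX) packets:(\d+) errors:(\d+) dropped:(\d+)", re.MULTILINE
)

def parse_counters(ifconfig_text):
    """Return {'RX': {...}, 'TX': {...}} with packets, errors and dropped counts."""
    counters = {}
    for direction, packets, errors, dropped in COUNTER_LINE.findall(ifconfig_text):
        counters[direction] = {
            "packets": int(packets),
            "errors": int(errors),
            "dropped": int(dropped),
        }
    return counters

stats = parse_counters(SAMPLE_IFCONFIG)
print(stats["RX"]["dropped"], stats["TX"]["dropped"])
```

Running it periodically and comparing the Rx dropped value between runs shows whether the drops coincide with the dips in the traffic graphs.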
From this we can conclude that the Rx buffer is not sufficient. To better understand the problem, it helps to understand what the Rx buffer is.
When the NIC receives packets it issues an interrupt to the CPU to handle the packet. For each packet it receives it generates an interrupt. When the CPU is interrupted, it first executes the relevant interrupt procedure and then hands the packet to the relevant software component (in this case the Check Point firewall kernel). But the CPU cannot always handle the packets at the rate the NIC receives them, so the NIC needs some sort of temporary storage location. For this the NIC is allocated some temporary storage (a buffer) from the RAM. The same applies to the Tx buffer.
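The effect of the buffer size can be illustrated with a toy simulation. This is not how the NIC driver actually works; it is only a sketch with invented numbers, where each tick some packets arrive, the CPU drains a fixed number, and anything that does not fit in the buffer is dropped.

```python
def simulate(buffer_size, arrivals, service_per_tick):
    """Count packets dropped when arrivals overflow a fixed-size buffer."""
    queued = 0
    dropped = 0
    for arriving in arrivals:
        free = buffer_size - queued
        accepted = min(arriving, free)
        dropped += arriving - accepted             # no room left: packet is lost
        queued += accepted
        queued -= min(queued, service_per_tick)    # CPU drains part of the buffer
    return dropped

# Invented traffic pattern: steady load with a short burst in the middle.
traffic = [100] * 5 + [600] * 3 + [100] * 5

for size in (256, 1024, 2048):
    print(size, simulate(size, traffic, service_per_tick=200))
```

With the same burst and the same drain rate, the small buffer loses packets while a large enough buffer absorbs the burst completely, which is the intuition behind increasing the Rx buffer.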
You can always view the Tx and Rx buffers allocated to an interface by issuing "ethtool -g" followed by the interface name. The output shows the maximum values as well as the currently allocated values.
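Here is a small sketch for reading those values from a script. The sample text follows the usual "Pre-set maximums" / "Current hardware settings" layout, but the interface name and numbers in it are illustrative, and the exact output can differ between ethtool versions and drivers.

```python
# Illustrative sample of "ethtool -g" output (values are not from the real system).
SAMPLE_ETHTOOL = """\
Ring parameters for eth0:
Pre-set maximums:
RX:             4096
RX Mini:        0
RX Jumbo:       0
TX:             4096
Current hardware settings:
RX:             256
RX Mini:        0
RX Jumbo:       0
TX:             256
"""

def parse_ring_params(text):
    """Split the output into 'max' and 'current' dictionaries of integer values."""
    result = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Pre-set maximums"):
            section = "max"
            result[section] = {}
        elif line.startswith("Current hardware settings"):
            section = "current"
            result[section] = {}
        elif section and ":" in line:
            key, _, value = line.partition(":")
            value = value.strip()
            if value.isdigit():
                result[section][key.strip()] = int(value)
    return result

rings = parse_ring_params(SAMPLE_ETHTOOL)
print(rings["current"]["RX"], "of", rings["max"]["RX"])
```

Comparing the current Rx value with the maximum tells you how much headroom is left before touching anything.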
Now we know what the Rx buffer really is. The buffer fills up when the CPU takes too much time to process traffic. Does Check Point provide a way to increase its performance, i.e. both throughput and connection rate? Well, it does: the SecureXL technology. On SPLAT it is provided by the Performance Pack (another module that is loaded). So we can speed up packet handling if we tune SecureXL.
So we analysed the SecureXL stats as well. You can get the details from "fwaccel stats", or you can view them in "/proc/ppk/statistics". As suspected, the F2F (non-accelerated) traffic was higher than the accelerated traffic. We tried to optimize this by modifying the rule base, and after some effort we got the stats to an acceptable value.
Still the problem remained. Then we moved on to the next step: increasing the buffer memory.
In this case the maximum was 4096 (ethtool reports ring sizes as a number of descriptors rather than kB). We increased the value in increments of 1024 until, as expected, the problem was solved.
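The stepping can be written down as a small sketch that only prints the commands instead of running them. The interface name eth0 and the starting value of 1024 are assumptions for the example; "ethtool -G" with the rx parameter is the standard way to set the Rx ring size, but check what your driver supports first.

```python
def ring_size_steps(current, maximum, step=1024):
    """List the ring sizes to try, going up by `step` and never past `maximum`."""
    sizes = []
    size = current
    while size < maximum:
        size = min(size + step, maximum)
        sizes.append(size)
    return sizes

def ethtool_commands(interface, sizes):
    """Build (not run) the ethtool commands for each candidate Rx ring size."""
    return [f"ethtool -G {interface} rx {size}" for size in sizes]

# Assumed starting point of 1024 on a hypothetical eth0, with the 4096 maximum.
for command in ethtool_commands("eth0", ring_size_steps(1024, 4096)):
    print(command)
```

Applying one step at a time and watching the Rx drop counter in between makes it easier to stop at the smallest value that fixes the problem.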

Tuesday, March 15, 2011

Standalone installation of Eventia Suite

Note that with Check Point R70.30 the Eventia Suite is mandatory. So even if you did not install the Eventia Suite when installing R70, it is installed automatically when you install the R70.30 upgrade package.
Now, for R70 a standalone installation of Eventia Suite is not supported. By standalone we mean the Management Server (SmartCenter Server) and Eventia Suite installed on the same machine. What Check Point means by "not supported" here is that you can install the two components on a single host, but the configuration in the Dashboard is not allowed.
So is it really not supported? Can't you use the Eventia Suite when it is installed along with the Management Server (primary management server)?
The fact is you can. I searched a lot in various forums and even in the Check Point Usercenter, but couldn't find anything regarding this issue.
But the solution is simple.
First of all, what you need is a valid license :). Once you have installed a valid license you can proceed.
Second, you need to make sure the appropriate servers are running. For this you can issue "evconfig" and enable the necessary components. After you have enabled them, restart the Eventia Suite by issuing "evstop;evstart".
Now log in to SmartDashboard and edit the Management Server object.
In the General view of the management server object you can see the Management Software Blades enabled for your management server. But you will notice that all the components related to Eventia products are greyed out, meaning you cannot enable them. Unless you enable them in the management object, you cannot connect to the management server using an Eventia client.
So we are kind of stuck here...
But Check Point allows you to manually manipulate its object database using dbedit. What we can do is use the GUI version of dbedit, which is "guidbedit". This is located in the SmartConsole installation directory.
Launch GuiDbedit, and provide the credentials to login to the management server.
Go to network_objects and locate the management server object.

Friday, February 25, 2011

Reverting back from R71 to R70

Recently I upgraded a Check Point Management Server with R70.30 installed to R71. I downloaded the R71 upgrade package from the Check Point site. The package was for upgrading from the Web UI.
So I used the Web UI to upgrade the management server. During the upgrade process I created a snapshot image (as suggested in the upgrade process window).
According to the Web UI, the upgrade was successful.
So after the management server was upgraded, I upgraded it to R71.20.
After that upgrade I couldn't even log into the Web UI.
When I went through fwm.elg, it showed that the fwm process didn't start.
I will explain the details of this specific issue in another post. In this post I will focus on how to revert, from the R71 environment, to the snapshot image created in the R70.30 environment.
So when I tried to revert to the snapshot image I had created (pre_upgrade_snapshot.tgz), it failed. I tried numerous times but the result was the same. This is because the snapshot image and the currently running version belong to different major releases (R70 and R71).
"revert" is not supported between two major releases.
This is also true if you try to do it during the boot-up process, with snapshot image management.
So in this kind of situation the easiest way to revert to the earlier snapshot is as follows.
  • Uninstall all the new packages related to R71
    - issue "rpm -qa | grep R71"
    - this will give you all the rpms installed for R71
    - using "rpm -e" uninstall all the related packages for R71
  • Then use the "revert" command and select the previous snapshot image
This will restore your management server back to R70 (in my case it was R70.30).
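The package clean-up step can be sketched as a dry run that only builds the "rpm -e" commands from saved "rpm -qa" output, so you can review them before removing anything. The package names in the sample are hypothetical and only show the idea of filtering on "R71"; the real list on your server will differ.

```python
# Hypothetical "rpm -qa" output; the real package names will differ.
SAMPLE_RPM_QA = """\
CPsuite-R70-00
CPsuite-R71-00
CPinfo-10-00
CPrt-R71-00
"""

def r71_packages(rpm_qa_output):
    """Mimic `rpm -qa | grep R71`: keep only lines that mention R71."""
    return [line.strip() for line in rpm_qa_output.splitlines() if "R71" in line]

def removal_commands(packages):
    """Build (not run) one `rpm -e` command per package."""
    return [f"rpm -e {package}" for package in packages]

for command in removal_commands(r71_packages(SAMPLE_RPM_QA)):
    print(command)
```

Printing the commands first keeps the step reviewable, which matters here because removing the wrong package on a management server is hard to undo.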