Understanding AWS VPC Flow Logs

VPC Flow Logs are a useful tool for monitoring the security of your AWS Virtual Private Cloud. But understanding and getting the most from these logs can be a bit tricky.

Why Use VPC Flow Logs?

VPC Flow Logs record inbound and outbound IP traffic for the network interfaces in your Amazon Web Services Virtual Private Cloud.  They capture both traffic that is accepted by Security Groups and Network Access Control Lists and traffic that is rejected.

They are critical for investigating a security incident after the fact, but can also be used to trigger an alert of suspicious activity as it happens.

In this article, we will show you how to set up VPC Flow Logs and then leverage them to enhance your network monitoring and security.

How to Enable VPC Flow Logs

First, go to the VPC section of the AWS Console.  Select your VPC, click the Flow Logs tab, and then click Create Flow Log.

create-vpc-flow-log-1

The next screen is a wizard to help you set up flow logs.  You can choose to collect accepted traffic, rejected traffic, or both.  Some people prefer one log for accepted and another for rejected.  I prefer both types of traffic in the same log.  The next step is to select an IAM role to allow flow logs to be published.  The easiest way to create the role is to click the “Set Up Permissions” link.  Finally, you need to select a Destination Log Group in CloudWatch.  I recommend a name of “FlowLogs.”

create-vpc-flow-log-2

If you clicked “Set Up Permissions,” you will see an IAM wizard as shown below.  Let it create a new IAM role for you.  Give the role a name that will help you remember its purpose such as “FlowLogsRole.”

create-iam-role-flow-logs
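The same setup can also be scripted rather than clicked through.  The sketch below is a minimal example, assuming you use the boto3 library and already have an IAM role that allows flow logs to be published.  The function name `enable_vpc_flow_logs` is hypothetical, and the client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
def enable_vpc_flow_logs(ec2_client, vpc_id, role_arn, log_group="FlowLogs"):
    """Enable flow logs for one VPC and return the new flow log IDs.

    ec2_client is assumed to be a boto3 EC2 client, for example
    boto3.client("ec2"). The log group name defaults to the "FlowLogs"
    name recommended in the article.
    """
    response = ec2_client.create_flow_logs(
        ResourceIds=[vpc_id],
        ResourceType="VPC",
        TrafficType="ALL",  # "ACCEPT", "REJECT", or "ALL" (both in one log)
        LogGroupName=log_group,
        DeliverLogsPermissionArn=role_arn,
    )
    return response.get("FlowLogIds", [])
```

A call might look like `enable_vpc_flow_logs(boto3.client("ec2"), "vpc-0abc1234", "arn:aws:iam::123456789012:role/FlowLogsRole")`, where the VPC ID and role ARN are placeholders for your own values.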

Viewing VPC Flow Logs

To view your flow logs, go to AWS CloudWatch, and then select “Logs” on the left-hand side of the screen.  This will give you a list of your log groups.  Select your FlowLogs group (or whatever group name you provided when you set up VPC Flow Logs).

vpc-flow-logs-cloudwatch

The logs are grouped according to the Elastic Network Interface (ENI) attached to your EC2 instance or Elastic Load Balancer (ELB).  To find your EC2 instance’s ENI, go to EC2, select your instance, then on the description tab, find the network interfaces and click on the link (probably eth0) as shown below.  The interface ID is what you need to find the correct log within your Flow Logs.

how-to-find-ec2-elastic-network-interface

Back in your VPC Flow Logs you can search for the logs related to this network interface to see all accepted and rejected traffic.

Filtering and Understanding VPC Flow Logs

Usually, you are not interested in wading through all of the accepted and rejected traffic for your EC2 instance.  You are likely interested in a particular subset of that traffic.  That may be all rejected traffic, all traffic to or from a specific address, or all traffic using a specific port.  To find that traffic, you can use filtering.

To filter traffic, start by pasting the text below into the filter field.  The text below does not filter anything, but we will see how to filter next.

[version, accountid, interfaceid, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, logstatus]

To begin filtering, simply add =value to one or more of the fields to limit your results to only the records whose fields match those values.  For example, perhaps I want to see all rejected traffic.  I can use

[version, accountid, interfaceid, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action=REJECT, logstatus]
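If you build these filters often, a small helper can generate them so you do not have to retype the field list.  This is only a sketch: the helper name `build_filter` is hypothetical, and it simply appends =value to the named fields in the pattern shown above.

```python
# Field names in the order used by the filter pattern in this article.
FIELDS = [
    "version", "accountid", "interfaceid", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "logstatus",
]


def build_filter(**conditions):
    """Return a filter pattern with =value added to the given fields."""
    unknown = set(conditions) - set(FIELDS)
    if unknown:
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    parts = [
        f"{name}={conditions[name]}" if name in conditions else name
        for name in FIELDS
    ]
    return "[" + ", ".join(parts) + "]"


# Rejected traffic only, matching the filter shown above.
print(build_filter(action="REJECT"))
# Rejected traffic aimed at port 23.
print(build_filter(action="REJECT", dstport=23))
```

Calling `build_filter()` with no arguments returns the unfiltered pattern, which is a handy starting point to paste into the filter field.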

The format of the filter also represents the content of the fields in the VPC Log.  For example, the fourth field in your log is the source IP address of the traffic.  That is followed by the destination IP address and then source and destination ports.
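To make the field order concrete, here is a minimal sketch that splits one log line into named fields.  The sample line is made up for illustration (only the source address and port 23 echo the example discussed in this article), and the names `FlowRecord` and `parse_record` are hypothetical.

```python
from collections import namedtuple

# Same field order as the filter pattern above.
FlowRecord = namedtuple("FlowRecord", [
    "version", "accountid", "interfaceid", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "logstatus",
])


def parse_record(line):
    """Split a space-delimited flow log line into a FlowRecord.

    Values are kept as strings because a field may hold "-" when no
    data was recorded for the capture window.
    """
    parts = line.split()
    if len(parts) != len(FlowRecord._fields):
        raise ValueError(f"expected 14 fields, got {len(parts)}")
    return FlowRecord(*parts)


# Illustrative record, not real log output.
sample = ("2 123456789010 eni-abc123de 78.10.107.73 172.31.16.139 "
          "49761 23 6 1 40 1418530010 1418530070 REJECT OK")
record = parse_record(sample)
print(record.srcaddr, record.dstport, record.action)
```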

When I run the filter above, I see several external systems trying to connect on port 23 (telnet) and port 80 (http).  I’m not using telnet (of course!), and I’m not running a web server, so port 80 is closed at the security group layer.  This traffic is most likely a series of malicious attempts to hack into my EC2 instance.  But I don’t have to worry about it, because it is all being rejected.

security-groups-block-malicious-traffic

In the first row that I expanded, we can see that an attempt was made to connect to port 23, and this attempt originated from IP address 78.10.107.73.  In the second expanded row, we see an attempt to connect to port 80 from IP address 72.21.217.71.  Both attempts were rejected.

For a few more details on the fields in VPC

Exporting VPC Flow Logs

Filtering flow logs is convenient for a quick look at your network traffic.  For example, if you are trying to allow two instances in different security groups to communicate and it is not working, you might be able to quickly see what traffic is allowed and rejected between them by filtering on source and/or destination addresses.

But if you want to do a more detailed analysis of your network traffic, Flow Log filtering is not the way to go.  For this, you need to export your logs and then import them into another tool such as a relational database or other analytical system.

You can export flow logs to S3, stream them to Lambda, or stream them to Elasticsearch.  To do so, go to CloudWatch, click “Logs,” select your log group and click the “Actions” button as shown below…

how-to-export-vpc-flow-logs
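Once the records are out of CloudWatch, the kind of relational analysis mentioned above might look like the sketch below.  It assumes you already have the exported records as plain space-delimited lines, and it uses Python's built-in sqlite3 module as a stand-in for whatever database you prefer.  The function names and sample lines are hypothetical.

```python
import sqlite3

COLUMNS = [
    "version", "accountid", "interfaceid", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "logstatus",
]


def load_records(lines):
    """Load flow log lines into an in-memory SQLite table named flow."""
    conn = sqlite3.connect(":memory:")
    # Column names are quoted because some (such as "end") are SQL keywords.
    column_sql = ", ".join(f'"{name}" TEXT' for name in COLUMNS)
    conn.execute(f"CREATE TABLE flow ({column_sql})")
    rows = [parts for parts in (line.split() for line in lines)
            if len(parts) == len(COLUMNS)]  # skip malformed lines
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.executemany(f"INSERT INTO flow VALUES ({placeholders})", rows)
    return conn


def top_rejected_sources(conn, limit=5):
    """Return (source address, count) pairs for rejected traffic."""
    query = (
        'SELECT srcaddr, COUNT(*) AS hits FROM flow '
        'WHERE "action" = \'REJECT\' '
        'GROUP BY srcaddr ORDER BY hits DESC, srcaddr LIMIT ?'
    )
    return conn.execute(query, (limit,)).fetchall()


# Illustrative lines, not real log output.
lines = [
    "2 123456789010 eni-abc123de 78.10.107.73 172.31.16.139 49761 23 6 1 40 1418530010 1418530070 REJECT OK",
    "2 123456789010 eni-abc123de 78.10.107.73 172.31.16.139 49770 23 6 1 40 1418530070 1418530130 REJECT OK",
    "2 123456789010 eni-abc123de 72.21.217.71 172.31.16.139 51000 80 6 1 40 1418530010 1418530070 REJECT OK",
    "2 123456789010 eni-abc123de 172.31.16.21 172.31.16.139 52000 22 6 5 400 1418530010 1418530070 ACCEPT OK",
]
conn = load_records(lines)
print(top_rejected_sources(conn))
```

The same table can then answer other questions, such as which destination ports attract the most rejected connections, by changing the GROUP BY column.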

Triggering Alerts from VPC Flow Logs

The ability to stream CloudWatch logs to Lambda functions means it is possible to write custom logic such as alerts to notify you of security issues.  One example might be that you want to be alerted of any rejected traffic originating from within your VPC.  Rejected traffic might indicate something such as a compromised web server that is being used to probe the rest of your network.  I would not fire alerts based on rejected traffic from external sources.  Any public IP address will constantly be probed for weaknesses.  Good Security Group settings and a good Web Application Firewall will protect you from those attacks.  Rejected traffic originating from within your network, on the other hand, can be a real cause for alarm.

Lambda offers a built-in template for building a function that processes CloudWatch Logs such as VPC Flow Logs.  Filters can be applied to avoid triggering the Lambda function too often which may go a long way towards reducing your costs.  Writing and configuring this Lambda function is a subject for a future post.
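As a rough preview of that logic, the sketch below shows how a function might pick out rejected traffic that originates inside the VPC.  CloudWatch Logs delivers log events to Lambda as a base64-encoded, gzip-compressed JSON document under `event["awslogs"]["data"]`.  The VPC address range `10.0.0.0/16` and the helper names are assumptions for illustration, and the alert here is just a print statement where a real notification would go.

```python
import base64
import gzip
import ipaddress
import json

# Hypothetical VPC address range; replace with your own CIDR block.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")


def decode_log_events(event):
    """Unpack the log events from a CloudWatch Logs subscription event."""
    payload = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(payload))["logEvents"]


def internal_rejects(event, vpc_cidr=VPC_CIDR):
    """Return the messages for rejected traffic whose source is in the VPC."""
    suspicious = []
    for log_event in decode_log_events(event):
        fields = log_event["message"].split()
        if len(fields) != 14:
            continue  # not a complete flow log record
        srcaddr, action = fields[3], fields[12]
        if action != "REJECT":
            continue
        try:
            source = ipaddress.ip_address(srcaddr)
        except ValueError:
            continue  # for example "-" when no data was recorded
        if source in vpc_cidr:
            suspicious.append(log_event["message"])
    return suspicious


def lambda_handler(event, context):
    hits = internal_rejects(event)
    for message in hits:
        # Placeholder for a real notification.
        print("ALERT internal rejected traffic:", message)
    return {"alerts": len(hits)}
```

Rejected traffic from external addresses is deliberately ignored, in line with the advice above not to alert on routine probing from the internet.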

Should I Buy Reserved EC2 Instances?

AWS Reserved EC2 instances offer compelling cost savings. But if you are not careful, they may lock you in for higher costs than you really need.

Savings Through Reserved Instances

Your effective monthly cost for EC2 instances can be significantly reduced by selecting a reserved instance.  The longer your reservation period and the more you choose to pay up front, the more you save.

reserved-ec2-instance-AWS

From the chart above, you can see the us-east pricing for an m4.large instance as of May of 2016.  By committing to a one-year reserved instance, you can save 31% of your monthly cost.  By paying up front, you can save almost 43% for a one year commitment.  If you are willing to commit to a three-year contract, you can save as much as 63%.  These savings can be compelling!
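To see what these percentages mean in dollars, the sketch below applies them to an on-demand rate.  The hourly rate of $0.10 and the 730-hour month are illustrative assumptions, not figures from the chart.  The second function captures a related point: a reservation is paid for whether or not the instance runs, so it only beats on-demand pricing if the instance runs for more than (1 - discount) of the term.

```python
HOURS_PER_MONTH = 730  # average month, an assumption for illustration


def effective_monthly_cost(on_demand_hourly, discount):
    """Monthly cost of an always-on instance after a reservation discount."""
    return on_demand_hourly * HOURS_PER_MONTH * (1 - discount)


def break_even_utilization(discount):
    """Fraction of the term an instance must run for the reservation to pay off."""
    return 1 - discount


on_demand_hourly = 0.10  # hypothetical rate in dollars
for label, discount in [("1-year", 0.31), ("1-year, paid up front", 0.43), ("3-year", 0.63)]:
    cost = effective_monthly_cost(on_demand_hourly, discount)
    usage = break_even_utilization(discount)
    print(f"{label}: ${cost:.2f}/month, pays off above {usage:.0%} utilization")
```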

Reserving the Wrong Size Instance Can Cost You More

Despite these savings, there are reasons not to buy a reserved instance, especially when you first start using Amazon Web Services.  If you are coming from the non-cloud world, you may be used to over-provisioning your servers.  Servers are a big purchase and they are expected to last at least three years and maybe five.  As a result, when physical servers are purchased, they are often bought in a larger size than needed to allow for growth over time and to guard against possible errors in estimating what size server is necessary.

Immediate Visibility of Performance Trends

If you are coming from an on-premises model, you may not be accustomed to having immediate visibility of performance trends.  Or you may have had to wait for complicated monitoring systems to be installed and configured before you had visibility into this data.

In the cloud, AWS CloudWatch provides immediate visibility into performance trends on the CPU, disk, and network usage of your EC2 instances.  You can also set alarms to notify you if usage rises above a threshold of your choice.

The CloudWatch graphs below show an EC2 instance that is almost certainly oversized.  In the last two weeks, the CPU usage has never exceeded 15% and has only rarely exceeded 5%.

EC2-CloudWatch-Metrics

The owner of this EC2 instance would certainly be using a smaller, less expensive instance, except that they made the exact mistake I am warning against and purchased a reserved instance prior to fully understanding their resource needs.  Because of this, they are locked into a more expensive instance than they need for a year.  Lucky for them, they chose a one year term rather than a three year term.
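The same judgment can be turned into a simple check before you commit to a reservation.  The sketch below assumes you have a list of CPU utilization percentages, for instance datapoints from the CloudWatch CPUUtilization metric.  The thresholds mirror the 15% and 5% figures above, while the function name and the cutoff for "rarely" (5% of samples) are hypothetical choices.

```python
def looks_oversized(cpu_percent_samples, peak_limit=15.0,
                    typical_limit=5.0, rare_fraction=0.05):
    """Return True if CPU usage stays low enough to suggest a smaller instance.

    The instance is flagged when its peak never exceeds peak_limit and
    no more than rare_fraction of samples exceed typical_limit.
    """
    if not cpu_percent_samples:
        raise ValueError("need at least one CPU sample")
    peak = max(cpu_percent_samples)
    busy = sum(1 for sample in cpu_percent_samples if sample > typical_limit)
    return (peak <= peak_limit
            and busy / len(cpu_percent_samples) <= rare_fraction)


# Illustrative data: mostly idle with a few small spikes.
quiet = [2.0] * 97 + [6.0, 9.0, 14.0]
busy_host = [35.0, 60.0, 42.0, 55.0]
print(looks_oversized(quiet), looks_oversized(busy_host))
```

CPU is only one signal. Memory, disk, and network usage deserve the same look before you settle on a smaller size.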

So What Should I Do?

If you are evaluating whether to buy a reserved instance ask yourself these questions:

  1. How certain am I of the load my application will put on this instance?
  2. Have I observed the performance and behavior of my application on a variety of instance sizes to know which one is best?
  3. Am I confident that load will remain consistent for the next one to three years?
  4. If I outgrow my EC2 instance, can I take advantage of elastic load balancing and auto-scaling to distribute the load to additional instances?
  5. Do I have time to perform this analysis, or do I need to simply make a choice and move on?

How certain am I of the load my application will put on this instance?

Are you deploying a new application to the cloud, or migrating an existing application to the cloud?  Do you have metrics of what load the application will put on the EC2 instance?  If not, before you select a reserved instance, you should consider observing the behavior of your application and confirming the right instance size.  Though you will spend more to run on an on-demand instance for the initial testing and monitoring period, you might save significant money over the life of the reservation by ensuring you do not reserve an instance size larger than you need.  Use the “Monitoring Tab” on the EC2 control panel to observe CPU, network and disk usage of the instance.

Have I observed the performance and behavior of my application on a variety of instance sizes to know which one is best?

Before you lock in a one or three year contract for a given instance size, try a few different sizes to see how your application performs.  To resize an EC2 instance, follow these steps:

  • Stop the EC2 instance
    • On the EC2 console, select your instance
    • Choose Actions -> Instance State -> Stop
  • Resize the instance
    • On the EC2 console, select your instance
    • Choose Actions -> Instance Settings -> Change Instance Type
    • Select a new instance type
  • Start the instance
    • On the EC2 console, select your instance
    • Choose Actions -> Instance State -> Start
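If you plan to try several sizes, the steps above can be scripted.  The sketch below is one possible approach, assuming the boto3 library and an EBS-backed instance (instance store-backed instances cannot be stopped).  The function name `resize_instance` is hypothetical, and the client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
def resize_instance(ec2_client, instance_id, new_type):
    """Stop an instance, change its type, and start it again.

    ec2_client is assumed to be a boto3 EC2 client, for example
    boto3.client("ec2").
    """
    # Stop the instance and wait until it is fully stopped.
    ec2_client.stop_instances(InstanceIds=[instance_id])
    ec2_client.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Change the instance type while the instance is stopped.
    ec2_client.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # Start the instance and wait until it is running.
    ec2_client.start_instances(InstanceIds=[instance_id])
    ec2_client.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

A call might look like `resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "t2.medium")`, where the instance ID and target type are placeholders.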

The whole process takes only a minute or two.  If you cannot tolerate a minute or two of downtime, consider putting an Elastic Load Balancer in front of your instance.  This will allow you to add and remove instances from behind the load balancer without downtime.  The load balancer also offers the ability to monitor request latency.  This allows you to see whether a new instance size has caused a slower response time for your application.

If you choose to use the elastic load balancer, you will need to take an image of your EC2 instance and then launch a new instance of a different size.  Add the new instance to the load balancer and remove the old one to shift traffic seamlessly between instances.

Am I confident that load will remain consistent for the next one to three years?

Reserving an instance locks you in to paying to use that instance for the entire term of the reservation.  If you expect that your traffic may increase or your application may change such that you outgrow that instance type before the reservation term is complete, then reserved instances may not be the best option for you.

If I outgrow my EC2 instance, can I take advantage of elastic load balancing and auto-scaling to distribute the load to additional instances?

Leveraging elastic load balancing and auto-scaling can significantly reduce the risk that you outgrow your instance and can significantly reduce your cloud costs.  Elastic Load Balancing distributes the traffic for your application across a group of instances.  If your traffic grows, you can add more instances.  If your traffic drops, you can stop the unneeded instances.

Auto Scaling automates the process of adding and removing instances in response to traffic changes.  Combining auto scaling and load balancing can optimize your cloud costs by running extra instances only when your traffic demands them.  It will automatically terminate unneeded instances when the traffic drops.

If you go the route of load balancing and auto scaling, you should reserve only the minimum number of instances you plan to run all the time.  Let the other instances be on demand.
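As a rough illustration of that advice, the sketch below takes a series of hourly instance counts, reserves the minimum, and prices everything above it at the on-demand rate.  The rate, discount, and counts are made-up numbers, and `plan_fleet` is a hypothetical name.

```python
def plan_fleet(hourly_instance_counts, on_demand_rate, reserved_discount):
    """Return (instances to reserve, total cost) for a mixed fleet.

    The reserved baseline is the smallest count seen, since those
    instances run all the time. Hours above the baseline are billed
    at the on-demand rate.
    """
    if not hourly_instance_counts:
        raise ValueError("need at least one hourly count")
    reserved = min(hourly_instance_counts)
    hours = len(hourly_instance_counts)
    reserved_cost = reserved * hours * on_demand_rate * (1 - reserved_discount)
    on_demand_hours = sum(count - reserved for count in hourly_instance_counts)
    return reserved, reserved_cost + on_demand_hours * on_demand_rate


# Five hours of traffic with a mid-day peak (illustrative numbers).
counts = [2, 2, 4, 6, 2]
reserved, total = plan_fleet(counts, on_demand_rate=0.10, reserved_discount=0.31)
print(reserved, round(total, 2))
```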

Do I have time to perform this analysis?

An important question to ask yourself is whether you have the time and interest to optimize your costs.  If you are already overtasked and think you may never get around to evaluating the right size for your instance, perhaps you want to just pick a safe, oversized instance and move on.  Perhaps the cost savings offered by selecting a smaller instance are not worth the time you would invest.  If this is the case, you may be best off reserving a one year term rather than allowing the project to linger on indefinitely, all the while paying on-demand prices.

Conclusion

The correct use of reserved EC2 instances can save you significantly on your cloud computing costs.  But reserving an instance before determining the size you really need can lock you in to an oversized instance and end up costing you more in the long run.  You should take the time to test your application on a variety of instance sizes to ensure you select the best size for your needs.  You should also try to use Elastic Load Balancing and Auto Scaling to optimize your cloud computing costs by responding dynamically to traffic load.