An Azure subscription is designed to host a large number of workloads. In enterprises, these workloads (for example, application workloads) often belong to separate teams, each with its own security priorities. Isolating them thus becomes an important priority: a VM belonging to workload-1 should not talk to a VM belonging to workload-2, even if both VMs are in the same subnet. For the purposes of this article, we will call this separation East-West isolation.
The rest of the article covers one methodology to achieve such East-West isolation using a combination of Application Security Groups (ASG) and Network Security Groups (NSG). We will also hint at some capacity planning considerations.
We will make the following assumptions:
- A single workload can contain more than one VM, but we won't distinguish between, or design for, the separate roles those VMs might play. Essentially, we will design so that an entire workload can sit in a single subnet.
- If two separate workloads absolutely need to talk to each other, they must reside in the same subnet.
An Application Security Group is a label that can be attached to the NIC of an Azure VM. Typically, servers that serve the same purpose are given the same label. We can then use this “ASG label” to write an NSG rule that targets all of those servers together.
A Network Security Group is a collection of stateful firewall rules, each matching on a 5-tuple (source address, source port, destination address, destination port, protocol), that control inbound or outbound traffic either for an entire subnet or for an individual NIC.
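To make the two definitions concrete, here is a minimal Python sketch of how ASG labels and priority-ordered NSG rules interact. It is a simplified model, not the Azure API: the `Rule` class and `evaluate` function are hypothetical names, each NIC is assumed to carry a single label, and the rules match only on ASG labels (ports, protocols, and address prefixes are ignored). It relies on the fact that Azure processes NSG rules in priority order, lowest number first, and stops at the first match.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Simplified NSG rule that matches only on ASG labels (None means 'any')."""
    priority: int                # lower number = evaluated first
    source_asg: Optional[str]
    dest_asg: Optional[str]
    access: str                  # "Allow" or "Deny"


def evaluate(rules, source_asg, dest_asg, default="Deny"):
    """Return the access of the first matching rule, in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        src_ok = rule.source_asg is None or rule.source_asg == source_asg
        dst_ok = rule.dest_asg is None or rule.dest_asg == dest_asg
        if src_ok and dst_ok:
            return rule.access
    return default


# NICs labeled "web" may reach each other; any other source is denied.
rules = [
    Rule(priority=100, source_asg="web", dest_asg="web", access="Allow"),
    Rule(priority=200, source_asg=None, dest_asg="web", access="Deny"),
]
same_label = evaluate(rules, "web", "web")
other_label = evaluate(rules, "db", "web")
```

In this model `same_label` resolves through the priority-100 rule and `other_label` falls through to the priority-200 deny, which is the same "specific allow above general deny" pattern the design in this article depends on.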
We will work in the context of the following network design. An Azure subscription hosts a Virtual Network, which is carved into multiple subnets. The Virtual Network is connected to on-premises via Azure ExpressRoute.
As a natural extension of our assumptions, we will assign each workload a separate ASG. This means that all VMs within that workload will be labeled with the same ASG.
We now create an NSG. For every workload we onboard into a subnet, we add four rules to the NSG: one inbound and one outbound rule of each of the following kinds.
- A rule that prevents the workload from talking to the rest of the subnet.
- A higher-priority rule that overrides the above and allows VMs within that workload to talk to one another.
This NSG is associated with the NIC of every Virtual Machine created in that subnet.
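The per-workload rules can be sketched as plain data. The Python snippet below is illustrative only and does not call the Azure SDK: the `workload_rules` function, the rule names, the example subnet prefix, and the priority layout are all assumptions. It assumes that a lower priority number wins, as in Azure NSGs, and it uses the subnet's address prefix to stand for "the rest of the subnet".

```python
def workload_rules(asg, subnet_prefix, base_priority):
    """Build the four rules for one workload (hypothetical layout).

    Allow rules get lower priority numbers than deny rules, so they win.
    """
    rules = []
    for offset, direction in enumerate(("Inbound", "Outbound")):
        # Intra-workload traffic: ASG to the same ASG is allowed.
        rules.append({
            "name": f"allow-{asg}-internal-{direction.lower()}",
            "priority": base_priority + offset,
            "direction": direction,
            "access": "Allow",
            "source": asg,
            "destination": asg,
        })
        # Any other traffic between the workload and the subnet is denied.
        inbound = direction == "Inbound"
        rules.append({
            "name": f"deny-{asg}-subnet-{direction.lower()}",
            "priority": base_priority + 2 + offset,
            "direction": direction,
            "access": "Deny",
            "source": subnet_prefix if inbound else asg,
            "destination": asg if inbound else subnet_prefix,
        })
    return rules


# Example: a workload labeled "asg-workload-1" in an assumed 10.0.1.0/24 subnet.
rules = workload_rules("asg-workload-1", "10.0.1.0/24", base_priority=1000)
```

Giving each workload its own block of priorities (here starting at 1000) leaves room to add further workload-specific rules later without renumbering the others.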
For every subnet we create in the Virtual Network, we associate an NSG that prevents that subnet from talking to its peers within the same Virtual Network.
This has the following effects.
- A workload cannot talk to other workloads in general. If it must, it has to sit in the same subnet as the workload it talks to. We can then add higher-priority rules to the NSG to allow that conversation.
- A workload can talk to ExpressRoute and other connected parts of the organization, but cannot talk to workloads in a different subnet within the Virtual Network.
This scheme effectively helps us achieve our goal of East-West isolation.
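As a rough illustration of the subnet-level NSGs, the Python sketch below lists, for a given subnet, the peer prefixes its NSG would deny, and then checks a destination address against them. The subnet names, address prefixes, and helper functions are hypothetical, and the model is deliberately simplified: anything not explicitly denied (for example, an on-premises address reached over ExpressRoute) is treated as allowed.

```python
import ipaddress

# Hypothetical subnets carved out of one Virtual Network.
SUBNETS = {
    "subnet-a": "10.0.1.0/24",
    "subnet-b": "10.0.2.0/24",
    "subnet-c": "10.0.3.0/24",
}


def peer_deny_prefixes(subnet_name, subnets=SUBNETS):
    """Prefixes of all peer subnets that this subnet's NSG should deny."""
    return [
        ipaddress.ip_network(prefix)
        for name, prefix in subnets.items()
        if name != subnet_name
    ]


def outbound_allowed(subnet_name, destination_ip, subnets=SUBNETS):
    """True unless the destination falls inside a peer subnet."""
    address = ipaddress.ip_address(destination_ip)
    denied = peer_deny_prefixes(subnet_name, subnets)
    return not any(address in network for network in denied)


same_subnet = outbound_allowed("subnet-a", "10.0.1.20")    # stays inside subnet-a
peer_subnet = outbound_allowed("subnet-a", "10.0.2.20")    # lands in subnet-b
on_premises = outbound_allowed("subnet-a", "192.168.5.9")  # assumed on-premises range
```

Denying peer prefixes explicitly, rather than the whole Virtual Network, is one way to keep the on-premises path open while still blocking subnet-to-subnet traffic.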
Now to the question of capacity management. There is a limit of 500 ASGs per subscription. Since each workload uses one ASG, you can onboard at most 500 workloads into a subscription.
Each NSG holds at most 500 rules. We will use 400 of these and reserve the other 100 for management. Let us allocate about 10 rules per workload for current and future requirements. This gives us the ability to group about 40 workloads (400 / 10) into an NSG.
We have 200 NSGs in all that we can use. So, on NSG capacity alone, we could onboard about 8,000 such workloads (200 NSGs with 40 workloads each). However, the ASG limit prevents us from realizing this capacity.
So ASGs are the limiting factor here. We can onboard at most 500 workloads into a subscription while achieving effective East-West isolation.
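The capacity arithmetic above can be reproduced in a few lines of Python. The limits are the ones quoted in this article and the variable names are illustrative; substitute the limits that apply to your own subscription.

```python
# Limits as quoted in this article; verify them against current Azure limits.
ASGS_PER_SUBSCRIPTION = 500
RULES_PER_NSG = 500
NSGS_PER_SUBSCRIPTION = 200

RESERVED_RULES = 100      # held back for management
RULES_PER_WORKLOAD = 10   # budget per workload, current and future

usable_rules = RULES_PER_NSG - RESERVED_RULES            # 400
workloads_per_nsg = usable_rules // RULES_PER_WORKLOAD   # 40
nsg_bound = workloads_per_nsg * NSGS_PER_SUBSCRIPTION    # 8000
asg_bound = ASGS_PER_SUBSCRIPTION                        # one ASG per workload

# The tighter of the two bounds decides how many workloads fit.
max_workloads = min(nsg_bound, asg_bound)
limiting_factor = "ASG" if asg_bound < nsg_bound else "NSG"
print(max_workloads, limiting_factor)  # prints: 500 ASG
```

Changing `RULES_PER_WORKLOAD` or the reserved headroom shows how far the rule budget can shift before NSGs, rather than ASGs, become the constraint.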
In the real world, additional complications usually come into the picture.
- North-South isolation, which means traffic between VMs within a workload has to be further restricted to allow only certain types of inbound or outbound packets.
- VMs from different workloads that play a similar role (for example, all web servers) might have to be grouped into a different subnet from VMs that serve as databases.
Such constraints will require us to revisit our earlier design. We will consider these in more detail in a future blog post.