Traffic Dialing When Migrating Workload to Cloud

Migrating workloads from On-Premise to Cloud is common nowadays, and teams do it for many reasons. They prepare a huge migration checklist, but one area where I often see teams asking questions is: how do we dial the traffic from On-Premise to Cloud, and what options do we have?

To be honest, every Cloud migration is unique in one way or another and comes with its own set of challenges, so there is no silver bullet for this problem.

In this blog, I have tried to summarize some of the common patterns used for Traffic dialing.

  1. All at Once
  2. Tip-Toe
  3. Time Based (either of the above two approaches, applied for a limited time window)

Approach 1: All at Once. This is the easiest way of moving traffic from On-Premise to Cloud. It may or may not involve application downtime, depending on external factors such as how the data is moved. For example, if an external tool keeps the workload data in real-time sync between On-Premise and Cloud, there is no downtime. But if the data has to be copied before the migration to ensure data integrity, there will be some downtime, and how long the workload is unavailable depends on the volume of data to be pushed to Cloud on the day of migration.

Once the Cloud workload is up and running and all the data migration is done, the DNS record is updated to point to the Cloud workload, and that’s it. You are done 👏. Traffic shifts to the Cloud workload as cached DNS entries expire.

There is no impact on external clients, as they continue to use the same DNS name to reach the application.
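
The DNS update itself can be scripted. Below is a minimal sketch, assuming the zone happens to be hosted in Amazon Route 53 and the application is reached through a CNAME record. The record name, target host name, and hosted zone ID are hypothetical placeholders, and `boto3` is only needed if you actually apply the change.

```python
def build_cutover_batch(record_name: str, cloud_target: str, ttl: int = 60) -> dict:
    """Build a Route 53 ChangeBatch that repoints record_name at cloud_target.

    UPSERT creates the record if it is missing and overwrites it otherwise,
    so the same call works for the cutover and for a rollback.
    """
    return {
        "Comment": "All-at-once cutover from On-Premise to Cloud",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": cloud_target}],
                },
            }
        ],
    }


def apply_cutover(hosted_zone_id: str, batch: dict) -> None:
    """Send the batch to Route 53 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without the AWS SDK

    boto3.client("route53").change_resource_record_sets(
        HostedZoneId=hosted_zone_id, ChangeBatch=batch
    )
```

For example, `apply_cutover("ZEXAMPLE123", build_cutover_batch("app.example.com.", "cloud-lb.example.net"))` would repoint the record. Lowering the TTL on the existing record ahead of migration day helps clients pick up the change quickly, and calling the same function with the On-Premise target gives you a rollback path.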

Approach 2: Tip-Toe Dialing. This approach involves dialing traffic in a staggered way. There are multiple strategies for tip-toe dialing:

  1. Percentage based
  2. Client/Tenant based
  3. Geography based
  4. Could be more…

In this approach, you pick one attribute on which traffic will be dialed. The challenge in implementing this kind of approach is the availability of a Smart Proxy Layer that can decide where to send the traffic based on the selected attribute.

Here, a proxy layer is added before the migration and is responsible for making the routing decision. The team keeps an eye on the performance of the system and, once satisfied, increases the traffic sent to Cloud. This exercise is repeated until the entire traffic is dialed to the Cloud workload.
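
To make the routing decision concrete, here is a minimal sketch of percentage-based dialing. It assumes each request carries some stable client identifier (the `client_id` argument is hypothetical) and hashes it into one of 100 buckets, so the same client always lands on the same side for a given percentage.

```python
import hashlib


def bucket_for(client_id: str, buckets: int = 100) -> int:
    """Map a client identifier to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def choose_target(client_id: str, cloud_percent: int) -> str:
    """Return "cloud" for roughly cloud_percent% of clients, else "on-premise"."""
    if not 0 <= cloud_percent <= 100:
        raise ValueError("cloud_percent must be between 0 and 100")
    # Buckets below the dial value go to Cloud. Raising the dial only adds
    # clients to Cloud; nobody already on Cloud is moved back.
    return "cloud" if bucket_for(client_id) < cloud_percent else "on-premise"
```

Hashing instead of picking a random number per request keeps a client on one side between requests, which matters if sessions or caches are not shared between On-Premise and Cloud. A client or tenant based strategy can use the same shape, with the hash replaced by a lookup in a list of migrated tenants.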

Approach 3: Time Based. One more variation of traffic dialing, which can be combined with either of the aforementioned strategies, is time based.

In this approach, the traffic is dialed to the Cloud workload for some time, let’s say a couple of hours during off-peak hours, to observe the system, and then it is dialed back to On-Premise. Based on the performance and metrics collected, the duration of traffic dialing is increased, and over a period of time the traffic completely switches to the Cloud workload. This is probably the safest way of moving the traffic, but it can be done only when you have real-time data sync happening between On-Premise and Cloud, since writes made on one side must be visible on the other after each switch.
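
The time check is simple to sketch. The example below assumes the dialing window is expressed as a start and end time of day in the proxy's local clock (the function names and window values are hypothetical), and it handles windows that wrap past midnight.

```python
from datetime import datetime, time


def in_window(now: time, start: time, end: time) -> bool:
    """True if now falls in [start, end), including windows that cross midnight."""
    if start <= end:
        return start <= now < end
    # Window wraps past midnight, e.g. 23:00 to 02:00.
    return now >= start or now < end


def choose_target_by_time(now: datetime, start: time, end: time) -> str:
    """Send traffic to Cloud inside the window and to On-Premise outside it."""
    return "cloud" if in_window(now.time(), start, end) else "on-premise"
```

Widening the window over successive days, say from `time(1, 0)` to `time(3, 0)` at first and then longer, gives the gradual increase described above. The same check can sit in front of either the all-at-once switch or a tip-toe rule.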

For those who do not have a Smart Proxy in place in their On-Premise deployment and are looking for a simple way to implement one, Amazon Web Services (AWS) CloudFront and Lambda@Edge can be leveraged.

Reference Solution: Route 53 DNS resolves the traffic to a CloudFront CDN endpoint, which then uses a Lambda@Edge Viewer Request event interceptor to check the attribute and decide whether the traffic is sent to the On-Premise workload or the Cloud workload.
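
A minimal sketch of such a Viewer Request function in Python is shown below, using tenant-based dialing as the attribute. The header names `x-tenant-id` and `x-dial-target` and the tenant list are hypothetical. One assumption to call out loudly: as far as I know, a Lambda@Edge viewer-request function can modify the request but the origin itself is overridden in an origin-request trigger, so this sketch only tags the request with the chosen target and leaves the actual origin switch to a second function (not shown) or to whatever mechanism your distribution uses.

```python
# Hypothetical configuration. Lambda@Edge does not support environment
# variables, so the list is kept in code for this sketch.
CLOUD_TENANTS = {"tenant-a", "tenant-b"}
TENANT_HEADER = "x-tenant-id"    # assumed header carrying the tenant
TARGET_HEADER = "x-dial-target"  # assumed header read further down the chain


def lambda_handler(event, context):
    """Viewer-request handler: tag each request with its dialing target."""
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # CloudFront exposes headers as {lowercase-name: [{"key": ..., "value": ...}]}.
    tenant_values = headers.get(TENANT_HEADER, [])
    tenant = tenant_values[0]["value"] if tenant_values else ""

    target = "cloud" if tenant in CLOUD_TENANTS else "on-premise"
    headers[TARGET_HEADER] = [{"key": "X-Dial-Target", "value": target}]

    # Returning the request lets CloudFront continue processing it.
    return request
```

Requests without a tenant header fall back to On-Premise, which keeps the default behavior unchanged until a tenant is explicitly added to the list. For the tag to be useful downstream, the distribution would also need to forward that header to the next stage.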

Hope this blog has given you some context about traffic dialing patterns when migrating from On-Premise to the Cloud.

Do clap and share this blog with your friends/groups if you liked it.

Happy Blogging. Cheers!!!

#AWS #CloudArchitect #CloudMigration #Microservices #Mobility #IoT