Cloud Migration is becoming ubiquitous nowadays. Every Enterprise is attempting to move its workloads to Cloud for reasons such as Fast Innovation, Digital Transformation (yeah, Covid-19 is highly responsible for this now), Scalability, On-Premise Data Centers getting closed (yeah, this one is also common nowadays 😅), Cost, and many more. Many of them have mastered the Art of Migration, but Enterprises doing it for the first time need to think through quite a few things.
There are multiple articles available that highlight the Migration Strategies. One of them, published by AWS, is https://aws.amazon.com/blogs/enterprise-strategy/6-strategies-for-migrating-applications-to-the-cloud/. Do take a look at this and other similar articles.
Going a bit off track: I generally ask one question to the Application Team that is planning to migrate a workload to Cloud. Is the Workload to be migrated Cloud Ready?
The answer to this question helps the team freeze the Migration Strategy to execute (remember the 6 R's, sorry, it's 7 R's now !!!), so I thought I would put it here to set some ground. The intent of the question is: would your current Application Architecture help you reap the benefits offered by the Cloud? Does it meet all the standards required in terms of Security, Compliance, etc. for your application to be hosted in Public Cloud? Are the current Recovery Time Objective (RTO) and Recovery Point Objective (RPO) acceptable? If yes, lucky you !!! But if not, do you have enough time to Refactor or Re-Architect the application and then migrate? If not, you may want to execute a pure Lift and Shift, or maybe a Lift and Shift with Tinker Migration Strategy, and then, once the workload is running stably in the Cloud, start making changes to leverage Cloud Native Services and Managed Services as applicable to reap the full benefits of Public Cloud.
OK, now coming back to this blog: it covers some of the key checklist items around the actual Migration (Implementation) phase. Let's take a look.
1) Designing Cloud Infrastructure
Designing the Cloud infrastructure for hosting the Application, and ensuring it follows all the security and compliance requirements of the Application, is a key ask. Additionally, every enterprise has its own set of standards and guidelines to be followed. Taking AWS as a reference, this point would stress: finalizing the Region for hosting the workload, designing the VPC across multiple Availability Zones, connecting to S3 and DynamoDB via VPC Gateway Endpoints, connecting to other AWS Services via VPC Interface Endpoints, blocking known malicious IPs via Firewall, WAF or NACL to handle threats, etc.
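To make the "VPC across multiple Availability Zones" point concrete, here is a minimal sketch of carving one VPC address range into equal-sized subnets, one per Availability Zone. The CIDR block, the AZ names and the `plan_subnets` helper are all assumptions made up for illustration; it only uses Python's standard `ipaddress` module and does not call any AWS API.

```python
import ipaddress


def plan_subnets(vpc_cidr, availability_zones, new_prefix):
    """Assign one subnet of size /new_prefix to each Availability Zone."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # subnets() yields equal-sized, non-overlapping blocks in address order.
    subnets = vpc.subnets(new_prefix=new_prefix)
    return {az: str(next(subnets)) for az in availability_zones}


# Hypothetical VPC range and AZ names, for illustration only.
plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"], 20)
for az, cidr in plan.items():
    print(az, cidr)
```

In a real design you would repeat this per tier (public, private, data) and leave spare ranges for growth, but the idea stays the same: decide the address plan before anything is provisioned.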
If there is a plan to use Golden Images (AWS AMIs, for example) baked by the Enterprise security team with all known security vulnerabilities patched, ensure that Applications are deployed on those AMIs only. Put enough checks in place in the Cloud so that any usage of a non-recommended image is flagged as Non-Compliant and communicated to the Application team. This can be done with services like AWS Config Rules if you are deploying in AWS public cloud.
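The Golden Image check boils down to a simple rule: every running instance must use an image from the approved list. The sketch below shows only that logic in plain Python. It is not AWS Config code; the AMI IDs, instance records and the `find_non_compliant` helper are hypothetical.

```python
# Hypothetical IDs of the Golden Images published by the security team.
APPROVED_AMIS = {"ami-golden-001", "ami-golden-002"}


def find_non_compliant(instances, approved_amis):
    """Return the instances whose image is not in the approved list."""
    return [inst for inst in instances if inst["image_id"] not in approved_amis]


# Hypothetical inventory, e.g. exported from your cloud account.
inventory = [
    {"instance_id": "i-app-01", "image_id": "ami-golden-001"},
    {"instance_id": "i-app-02", "image_id": "ami-community-123"},
]

for inst in find_non_compliant(inventory, APPROVED_AMIS):
    print("Non-Compliant:", inst["instance_id"], "uses", inst["image_id"])
```

A managed compliance service would run the same comparison continuously and notify the owning team, which is what you want in practice.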
2) Cloud Network Security — Deciding on the Ingress and Egress Traffic
If there is a need (and I am sure there will be) for Firewalls to filter network traffic and protect against external threats, where they will be hosted is an important question to ask.
Options: set up a new Virtual Firewall in the Cloud, OR route all Internet Egress and Ingress traffic for the Cloud through the On-Premise Firewalls (assuming a trust relationship between the On-Premise and Cloud Zones). It all depends on the latency and throughput requirements of the application. And yes, Dollar numbers also play a crucial role in the decision 😏. So make a choice accordingly.
There are many other items which come under the Security bucket; I will refrain from clubbing those into this blog. That can be a big topic of its own.
3) Identify Workload Dependencies
Does your workload have any dependency on applications which are still running On-Premise, or vice versa? If no, all OK. But if yes, how would you establish the connectivity between the two? Yes, I am referring to a Hybrid Cloud environment. Do you need dedicated network connectivity (AWS Direct Connect, for example), or would a VPN suffice? How would DNS lookup happen for the endpoints running On-Premise and in the Cloud? How do you set up DNS Forwarders in the Cloud and On-Premise? These are some of the questions that you need to provide a solution for.
I know what you are thinking: what if the dependent applications are available on Public endpoints? Hmmmm… If this is the case, and your security or compliance team is OK with it, then enjoy the day and have a 🍺.
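Going back to the DNS Forwarder question above: in a Hybrid setup, each side forwards queries for the other side's private domains to the resolver that owns them. Here is a minimal sketch of that decision, assuming made-up domain names and resolver labels; `pick_resolver` is a hypothetical helper, not part of any DNS product.

```python
# Hypothetical private zones and the resolver responsible for each.
FORWARDING_RULES = {
    "corp.example.internal": "on-premise-resolver",
    "cloud.example.internal": "cloud-resolver",
}


def pick_resolver(hostname, rules, default="public-resolver"):
    """Return the resolver for the longest matching domain suffix."""
    name = hostname.rstrip(".").lower()
    best = None
    for domain in rules:
        if name == domain or name.endswith("." + domain):
            if best is None or len(domain) > len(best):
                best = domain
    return rules[best] if best else default


print(pick_resolver("db01.corp.example.internal", FORWARDING_RULES))
print(pick_resolver("api.cloud.example.internal", FORWARDING_RULES))
print(pick_resolver("aws.amazon.com", FORWARDING_RULES))
```

Writing the rules down like this early helps you spot domains that nobody has claimed before the migration day.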
4) Data Migration Strategy
This is an absolutely important item. How would you migrate your data from On-Premise to Cloud? Data can be anywhere: Network File System, Local Drive, Database, etc.
It all boils down to what kind of data is to be migrated (Hot data, Warm data or Cold data) and how the application is using it. Is the data maintained in partitions such as Year/Month/Day/Hour? Is the data non-editable once saved? Is it maintained separately for each Tenant (assuming a Multi-Tenant Application), so that Tenants can be moved to Cloud one at a time? There will be many other factors apart from the ones mentioned above which help you decide the actual Data Migration strategy.
Some of the standard patterns for migration are:
- Migrate the historical data to Cloud ahead of time. Then, on the final migration day, take a downtime on the application for some time (minutes to hours), copy the delta to Cloud, and you are done. Safest approach, but it involves downtime on the application.
- Use Sync services to have near real time sync from On-Premise to Cloud. For instance, Oracle GoldenGate for Database sync, or AWS File Gateway to sync data between NFS and S3. This gives almost zero downtime for the On-Premise application, but adds more cost and some overhead.
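For the first pattern, the "delta" is whatever was added or changed On-Premise after the bulk copy. A minimal sketch of finding it by comparing two manifests (path mapped to checksum) is below. The manifests and the `compute_delta` helper are assumptions for illustration; a real migration tool would build and compare these for you.

```python
def compute_delta(source_manifest, target_manifest):
    """Return paths that are new or changed on the source since the bulk copy."""
    return sorted(
        path
        for path, checksum in source_manifest.items()
        if target_manifest.get(path) != checksum
    )


# Hypothetical manifests: path -> checksum.
on_premise = {"2021/01/a.csv": "c1", "2021/02/b.csv": "c2-edited", "2021/03/c.csv": "c3"}
cloud = {"2021/01/a.csv": "c1", "2021/02/b.csv": "c2"}

print(compute_delta(on_premise, cloud))
```

Note that this sketch only covers additions and edits. If data can be deleted On-Premise after the bulk copy, you would also need to compare in the other direction.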
One also needs to consider the logistics: a) How much data is to be migrated? b) How much time do you have for migrating it? c) What channel is available for migrating the data, and does it provide enough bandwidth to push the data from On-Premise to Cloud in a timely manner? d) Would external storage devices be needed for shipping the data to Cloud? Think of Snowball kind of devices.
And the last one: are you thinking of a Fallback to On-Premise should anything go wrong when you move to Cloud? If yes, then you also need to think through how you would sync the data back from Cloud to On-Premise. I know… not easy, but if your application cannot take a downtime of hours, then this is the option.
OK, one more last one: ensure that encryption at rest is enforced for data that requires it as per compliance. For example, use the KMS service for encrypting data at rest in S3, RDS instances and EBS volumes. (I am a die-hard fan of AWS, hence all my examples are from there 😏)
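The bandwidth question is easy to sanity check with some back-of-the-envelope arithmetic. The sketch below is a rough estimate only: the data size, link speed and the 70% usable-bandwidth figure are assumptions, and `transfer_days` is a hypothetical helper.

```python
def transfer_days(data_terabytes, link_mbps, utilization=0.7):
    """Estimate days needed to push data over a network link.

    utilization is the assumed fraction of the link you can actually use.
    """
    bits = data_terabytes * 1e12 * 8               # decimal terabytes to bits
    usable_bits_per_second = link_mbps * 1e6 * utilization
    return bits / usable_bits_per_second / 86400   # seconds to days


# Hypothetical case: 100 TB over a 1 Gbps link at 70% utilization.
print(round(transfer_days(100, 1000), 1))  # 13.2
```

If the number of days does not fit your migration window, that is usually the signal to look at a bigger pipe or at shipping the data on a physical device.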
5) Traffic Dialing Strategy
Once you have the application deployed in Cloud, how would you dial the traffic from On-Premise to Cloud?
A couple of patterns:
- 100% Dialing in one shot — Change the DNS entry and migrate all traffic to Cloud at once.
- Staggered Dialing: in this strategy, the Application is live in both the On-Premise and Cloud environments and traffic is dialed between the two in a staggered way. Maybe traffic of just one Tenant goes to Cloud and the rest is served by On-Premise. Or you may dial 5% of traffic to Cloud and 95% to On-Premise. Or there could be other conditions based on which you may want to split the traffic.
Why would you go with the second pattern? It all depends on how critical the application is and whether you can afford to take a downtime if something goes wrong. It's always better to take a hit for a few users and take corrective action rather than failing for all. This approach helps you observe the Cloud system, see how it is behaving and understand its performance. Once satisfied, increase the traffic percentage. Warning !!! It also brings the extra headache of ensuring the data is available in real time in both Cloud and On-Premise, as you may want to fall back to On-Premise if the Cloud system is not performing as expected and issues are being observed.
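Here is a minimal sketch of how a staggered split could be decided per Tenant, combining the two ideas above: some Tenants pinned to Cloud, and a percentage of the rest. It assumes you have a routing layer where such a function can run; `route_request`, the tenant IDs and the hashing choice are illustrative assumptions, not a specific product feature.

```python
import hashlib


def route_request(tenant_id, cloud_percent, pinned_to_cloud=frozenset()):
    """Return 'cloud' or 'on-premise' for a tenant, stable across calls."""
    if tenant_id in pinned_to_cloud:
        return "cloud"
    # Hash the tenant ID so the same tenant always lands in the same bucket.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # deterministic bucket in 0..99
    return "cloud" if bucket < cloud_percent else "on-premise"


# Hypothetical usage: one pinned tenant, 5% of everyone else.
for tenant in ["tenant-a", "tenant-b", "tenant-c"]:
    print(tenant, route_request(tenant, 5, pinned_to_cloud={"tenant-a"}))
```

Because the bucket is derived from the Tenant ID, raising the percentage from 5 to 50 only moves additional Tenants to Cloud. Nobody who was already on Cloud bounces back, which keeps the data sync problem a little more manageable.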
6) DNS Cutoff
DNS in an Enterprise is not that easy. Take a look at the DNS record TTL and lower it to a few minutes a couple of days before the actual day of migration, so that cached records expire quickly at cutover. Ensure that CNAMEs for the new Endpoints created in the Cloud environment are properly added in the DNS system.
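Why lower the TTL well in advance? Resolvers that cached the record under the old TTL keep serving it until that TTL runs out, so the lowered TTL has to be published at least one old-TTL period before the cutover. A minimal sketch of that calculation follows; the cutover date, the one-hour safety margin and the `latest_ttl_change` helper are assumptions for illustration.

```python
from datetime import datetime, timedelta


def latest_ttl_change(cutover, current_ttl_seconds, safety_margin=timedelta(hours=1)):
    """Latest time to publish the lowered TTL so old cached records expire before cutover."""
    return cutover - timedelta(seconds=current_ttl_seconds) - safety_margin


# Hypothetical cutover time with a record currently cached for 24 hours.
cutover = datetime(2024, 6, 15, 2, 0)
print(latest_ttl_change(cutover, 86400))  # 2024-06-14 01:00:00
```

Doing it a couple of days earlier, as suggested above, simply adds more buffer on top of this minimum.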
7) Monitoring and Alerting
This item is as important as the actual migration. Ensure that you have all the tools in place for capturing Infrastructure level metrics and Application level metrics. Create proper Summary Dashboards in the tools, highlight data with colors like Green, Yellow and Red to give a quick understanding of which area to focus on, and make sure that you have an Alerting mechanism in place, like PagerDuty, AWS SNS, etc.
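The Green, Yellow, Red idea is just a pair of thresholds per metric. Below is a minimal sketch of that mapping; the metric names, threshold values and helper functions are made up for illustration and are not tied to any particular monitoring tool.

```python
def status_color(value, warn_threshold, critical_threshold):
    """Map a metric value to a dashboard color (higher value means worse)."""
    if value >= critical_threshold:
        return "Red"
    if value >= warn_threshold:
        return "Yellow"
    return "Green"


# Hypothetical thresholds: metric -> (warn, critical).
THRESHOLDS = {
    "cpu_percent": (70, 90),
    "error_rate_percent": (1, 5),
    "p95_latency_ms": (300, 800),
}


def summarize(metrics, thresholds):
    """Return a color per metric for a summary dashboard."""
    return {name: status_color(value, *thresholds[name]) for name, value in metrics.items()}


summary = summarize(
    {"cpu_percent": 65, "error_rate_percent": 2.5, "p95_latency_ms": 950}, THRESHOLDS
)
needs_alert = [name for name, color in summary.items() if color == "Red"]
print(summary)
print("Alert on:", needs_alert)
```

Anything that turns Red is what you would route to your Alerting mechanism. Yellow is for the dashboard and the next stand-up.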
8) Automation is the Key
Automation in Cloud is an absolute must. You must ensure that you have repeatable processes for Application CI/CD, for Infrastructure provisioning following the Infrastructure as Code (IaC) approach, and for flagging non-compliance of the resources running in Cloud.
9) Cost Optimization
My favorite checklist item. This will be an ongoing activity, and it would be tough to ever make the statement "Yes, I have completed the Cost Optimization phase". I recently wrote a blog on this one. Do take a look at it for more details:
AWS Cost Optimization — Based on Experience
10) Application specific settings
If your Public IP changes after hosting in the Cloud, and it is supposed to be whitelisted in the 3rd party systems that you integrate with, you may want to trigger a process to have the new IP added. There could be other such checks required at the application level. Do take care of those.
Cloud Migration is not a cakewalk. Every Migration is unique and comes with its own set of challenges. And once the migration is successful, it teaches you one more way of how not to fail when doing a Migration 😃
You may also be thinking: how about Cloud Governance, and Account and Resource Tagging Strategy? Doesn't that require planning too? Well, I deliberately skipped them as I plan to have a different blog covering only them.
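A quick way to catch the Public IP problem before go-live is to compare your new Cloud egress IPs against the ranges each 3rd party has on file. Here is a minimal sketch using Python's standard `ipaddress` module. The IPs, the partner's list and the `missing_from_allowlist` helper are hypothetical (the addresses come from ranges reserved for documentation).

```python
import ipaddress


def missing_from_allowlist(new_public_ips, partner_allowlist):
    """Return the new public IPs not covered by any CIDR range the partner has on file."""
    networks = [ipaddress.ip_network(cidr) for cidr in partner_allowlist]
    return [
        ip
        for ip in new_public_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Hypothetical new Cloud egress IPs and the partner's current list.
new_ips = ["203.0.113.10", "198.51.100.7"]
partner_list = ["203.0.113.0/24"]

print(missing_from_allowlist(new_ips, partner_list))  # ['198.51.100.7']
```

Anything this returns is an IP for which you need to raise a request with the 3rd party, and those requests often take days, so run the check early.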
So stay tuned !!!