AWS is the world’s most popular public cloud computing platform. Customers can build and scale projects of all shapes and sizes on AWS without on-prem infrastructure. While AWS enables a range of business benefits, it is still challenging for companies to run efficient, reliable, secure, and compliant workloads. For example, improperly sizing instances can be expensive and inefficient.
The AWS Well-Architected Framework helps organizations navigate the complexities of AWS management. It educates customers on the tradeoffs related to design decisions and shares the best practices for architecting reliable, cost-effective, efficient, and secure workloads in the cloud. As a result, organizations that adopt the framework can improve key business outcomes.
The AWS 6 pillars are the fundamental components of the framework. Each pillar details a framework category. The original AWS 5 pillars were cost optimization, security, reliability, operational excellence, and performance efficiency. In late 2021, AWS added sustainability as a sixth pillar.
This article will discuss the AWS 6 pillars and how they help organizations effectively design and manage their cloud infrastructure.
Summary of key AWS 6 pillars concepts
The table below summarizes the AWS 6 pillars we will explore further in this article.
AWS pillar | Description |
---|---|
Operational excellence | Focuses on the ability to run and manage systems to deliver business value and continuous improvement of supporting processes and procedures. |
Security | Focuses on securing information, systems, and assets while delivering business value through risk assessments and mitigation strategies. |
Reliability | Focuses on preventing and recovering from failures based on business and customer needs. |
Performance efficiency | Focuses on using computing resources efficiently to meet system requirements and maintain that efficiency as demand changes and technology evolves. |
Cost optimization | Focuses on running systems efficiently and effectively while maintaining cost efficiency to maximize return on investment. |
Sustainability | Focuses on leveraging cloud services in an environmentally-conscious way. |
An overview of the AWS Well-Architected Framework. (Source)
AWS pillar #1: Operational excellence
Operational inefficiency stems from a lack of processes and procedures, which hinders the efficient running of systems. Symptoms include manual, time-consuming processes where automation would serve better, inadequate monitoring that makes operational issues hard to diagnose and leads to poor incident management, and a reactive rather than proactive approach to operational issues. The first of the AWS 6 pillars emphasizes avoiding these issues through operational excellence.
Operational excellence best practices
AWS defines operational excellence as the ability to deliver new features and bug fixes correctly and consistently. Organizations that invest in it can keep shipping changes while handling operational failures, and repeating these processes consistently moves them toward CI/CD (continuous integration and continuous delivery). The design principles and best practices for operational excellence are:
- Perform engineering operations as code. For example, use CloudFormation templates or Terraform scripts. This principle aims to reduce human error in manually handling engineering infrastructure and to create consistent results.
- Make changes in small batches. Small, easily reversible changes reduce risk and help increase development throughput.
- Pursue continuous improvement. Redefine or refine processes to ensure improvement in operational procedures along with evolving workloads.
- Anticipate failures. Test and validate various failure scenarios in your workloads.
- Conduct retrospectives. Let the operational failures end with a lesson to be incorporated into your daily operations. Ensure a similar failure does not happen the next time or is handled more effectively.
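The "operations as code" principle above can be illustrated with a minimal sketch that generates a CloudFormation template from Python instead of editing it by hand. The function name and the logical ID `ExampleLogsBucket` are hypothetical and chosen only for illustration. The sketch builds and prints the template and does not deploy anything.

```python
import json


def build_bucket_template(logical_id: str, versioned: bool = True) -> dict:
    """Return a minimal CloudFormation template describing one S3 bucket.

    logical_id is a hypothetical name picked by the caller.
    """
    properties = {}
    if versioned:
        # Versioning makes accidental overwrites and deletes reversible.
        properties["VersioningConfiguration"] = {"Status": "Enabled"}
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative template generated from code",
        "Resources": {
            logical_id: {"Type": "AWS::S3::Bucket", "Properties": properties},
        },
    }


if __name__ == "__main__":
    template = build_bucket_template("ExampleLogsBucket")
    # The JSON output can be reviewed and version-controlled like any other code.
    print(json.dumps(template, indent=2, sort_keys=True))
```

Because the template is produced by a function, the same definition can be reused across environments and changed in small, reviewable commits, which also supports the "small batches" principle.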
AWS pillar #2: Security
Security is a must for businesses running workloads in a public cloud. In AWS, security is a shared responsibility between customers and Amazon. Depending on the AWS service, the responsibilities of the customers and Amazon vary. You can read more about the AWS shared responsibility model in the official documentation.
Security best practices
By following the guidelines in the second of the AWS 6 pillars, organizations can reduce their risk of data leakage or a breach. Best practices related to AWS security can help teams harden their infrastructure, automatically enforce security policies, and reduce the risk of human error leading to an incident. Here are five fundamental design principles and best practices related to the security pillar:
- Follow the principle of least privilege. Build a strong identity foundation by implementing the principle of least privilege, enforcing separation of duties, and centralizing access management across accounts. This can be achieved by utilizing AWS Organizations to manage the IAM entities centrally, and using service control policies (SCPs) and customer-managed policies to ensure the principle of least privilege. Below are examples of a well-designed policy that follows best practices vs. a poorly-designed policy.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListAllMyBuckets",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::DOC-EXAMPLE-BUCKET1"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl",
        "s3:GetObject",
        "s3:GetObjectAcl",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::DOC-EXAMPLE-BUCKET1/*"
    }
  ]
}
The above policy follows the principle of least privilege: apart from listing bucket names, it grants object management permissions only within the single specified bucket.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListAllMyBuckets",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl",
        "s3:GetObject",
        "s3:GetObjectAcl",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::*"
    }
  ]
}
This policy allows object management in all the S3 buckets. These loose permissions create a security risk by granting access to all the buckets in an account.
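The difference between the two policies can also be checked mechanically. The following is a minimal sketch, not an official AWS tool: the helper names and the set of "broad" resource patterns are assumptions made for illustration, and the check only looks for S3 wildcards rather than performing full policy analysis.

```python
def _as_list(value):
    """IAM allows a single string/object where a list is expected; normalize it."""
    return [value] if isinstance(value, (str, dict)) else list(value)


# Hypothetical set of patterns treated as "every bucket" in this sketch.
BROAD_S3_RESOURCES = {"*", "arn:aws:s3:::*", "arn:aws:s3:::*/*"}


def overly_broad_statements(policy: dict) -> list:
    """Return Allow statements whose Resource covers all S3 buckets."""
    flagged = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        # s3:ListAllMyBuckets requires "*" as its resource, so it is not flagged.
        if _as_list(stmt.get("Action", [])) == ["s3:ListAllMyBuckets"]:
            continue
        if any(r in BROAD_S3_RESOURCES for r in _as_list(stmt.get("Resource", []))):
            flagged.append(stmt)
    return flagged


object_actions = ["s3:PutObject", "s3:PutObjectAcl", "s3:GetObject",
                  "s3:GetObjectAcl", "s3:DeleteObject"]
list_all = {"Effect": "Allow", "Action": "s3:ListAllMyBuckets", "Resource": "*"}

scoped_policy = {"Version": "2012-10-17", "Statement": [
    list_all,
    {"Effect": "Allow", "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
     "Resource": "arn:aws:s3:::DOC-EXAMPLE-BUCKET1"},
    {"Effect": "Allow", "Action": object_actions,
     "Resource": "arn:aws:s3:::DOC-EXAMPLE-BUCKET1/*"},
]}

broad_policy = {"Version": "2012-10-17", "Statement": [
    list_all,
    {"Effect": "Allow", "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
     "Resource": "arn:aws:s3:::*"},
    {"Effect": "Allow", "Action": object_actions, "Resource": "arn:aws:s3:::*"},
]}

print(len(overly_broad_statements(scoped_policy)))  # 0 statements flagged
print(len(overly_broad_statements(broad_policy)))   # 2 statements flagged
```

A check like this could run in a review pipeline so that wildcard grants are caught before a policy is attached, though a real review would also need to consider conditions, `NotResource`, and other services.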
- Implement security monitoring and alerts. Monitor actions, trigger automatic alerts for anomalies and malicious activities, and take appropriate actions. Consider using a tool like GuardDuty to detect malicious activities in the account.
- Take a holistic approach to security. Address every aspect of security by implementing comprehensive controls in areas such as infrastructure, application, and data security, following industry best practices.
- Secure data in transit and at rest. Protect data both in transit and at rest, for example with encryption and access controls, to maintain continuous protection against unauthorized access.
- Define incident management policies. Maintain a state of incident readiness by putting into practice specific incident management policies and procedures.
AWS pillar #3: Reliability
Like security, reliability follows a shared responsibility model between customers and AWS. Customers are responsible for ensuring reliability in the cloud, while Amazon is responsible for the reliability of the cloud. You can read more about the shared responsibility model for reliability in the official AWS docs.
Reliability best practices
The reliability pillar focuses on ensuring that a workload performs its intended function correctly and consistently throughout its lifecycle. The following are the design principles of the reliability pillar:
- Automatically alert when reliability metrics breach their thresholds. Build automation around key performance indicators (KPIs) to alert on a reliability breach and automatically run recovery processes to repair failures.
- Perform thorough testing. Test recovery workloads by doing test runs in the cloud. You can also use automation to test different failure scenarios and to identify and fix them before any reliability issues arise.
- Scale horizontally. Scale horizontally to increase availability and avoid a single point of failure. For example, spread your workload across several smaller machines instead of running it on a single large one. If one machine goes down, the rest keep working and help you meet availability requirements.
- Monitor workload demand and utilization. Automate the addition and removal of resources based on requirements and avoid under-provisioning and over-provisioning of resources. The use of load balancers with autoscaling groups is a good example.
- Adopt infrastructure as code (IaC). Use automation to make changes in your infrastructure, a process that should be monitored and tracked.
- Simulate failure to encourage improvement. Encourage development teams to schedule game days to simulate failure events and improve reliability.
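The benefit of horizontal scaling described above can be made concrete with a short calculation. This is a rough sketch under two strong assumptions that are not from the framework itself: instances fail independently of each other, and any single surviving instance can carry the full load. Real failures are often correlated (for example, within one Availability Zone), so treat the result as an upper bound.

```python
def combined_availability(instance_availability: float, instances: int) -> float:
    """Availability of a group that is up as long as at least one instance is up.

    Assumes independent failures: P(all down) = (1 - a) ** n.
    """
    if not 0.0 <= instance_availability <= 1.0:
        raise ValueError("instance_availability must be between 0 and 1")
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return 1.0 - (1.0 - instance_availability) ** instances


if __name__ == "__main__":
    for n in (1, 2, 3):
        # Each instance is assumed to be available 99% of the time.
        print(n, f"{combined_availability(0.99, n):.6%}")
```

With a per-instance availability of 99%, one instance gives 99%, two give roughly 99.99%, and three give roughly 99.9999% under these assumptions.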
AWS pillar #4: Performance efficiency
Using inefficient and low-performance compute instances can impact the efficiency of workloads. Managing hardware and software updates and upgrades may also affect the performance of workloads.
Performance efficiency best practices
Performance efficiency focuses on using computing resources efficiently to meet requirements, as well as on the ability to maintain efficiency in the face of evolving demands and changes. The following are the design principles of the performance efficiency pillar:
- Enable teams to focus on development, not infrastructure. Provide product teams with services and technologies that enable them to maximize the time they spend focused on product, not infrastructure.
- Consider going serverless. In serverless architectures, you only manage product architecture and development. The cloud provider manages the hardware and software dependencies. This can increase developer focus and efficiency.
- Ensure low latency. For example, deploy workloads across regions to avoid latency issues and use CloudFront where applicable to reduce latency.
- Leverage AWS Compute Optimizer. Teams should experiment, analyze, compare, and review the resources they use to understand compute utilization and make any changes required. For example, Compute Optimizer helps you identify under-provisioned and over-provisioned resources.
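To illustrate what "under-provisioned" and "over-provisioned" mean in practice, here is a deliberately simple sketch that labels an instance from CPU utilization samples. It is not how AWS Compute Optimizer works internally. The 20% and 80% thresholds and the function name are assumptions chosen for illustration only.

```python
from statistics import mean


def classify_utilization(cpu_percent_samples, low=20.0, high=80.0):
    """Label an instance from CPU utilization samples (values from 0 to 100).

    low and high are hypothetical thresholds, not AWS-defined values.
    """
    if not cpu_percent_samples:
        raise ValueError("at least one sample is required")
    if max(cpu_percent_samples) < low:
        return "over-provisioned"   # never busy, even at peak
    if mean(cpu_percent_samples) > high:
        return "under-provisioned"  # busy most of the time
    return "right-sized"


if __name__ == "__main__":
    print(classify_utilization([5, 8, 12]))    # over-provisioned
    print(classify_utilization([85, 92, 97]))  # under-provisioned
    print(classify_utilization([30, 55, 70]))  # right-sized
```

A production assessment would also look at memory, network, and disk metrics over a longer window before changing an instance type.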
AWS pillar #5: Cost optimization
The cost optimization pillar helps organizations address the complexities of AWS costs, such as avoiding over-provisioned compute resources and choosing the right storage tiers based on business requirements.
Cost optimization best practices
The cost optimization pillar focuses on building cost-aware workloads that fulfill business requirements while maintaining the minimum possible costs and having a maximum return on investment. The following are the design principles of the cost optimization pillar:
- Establish essential cloud finance management practices. Ensure cloud finance management by establishing cost budgets and forecasts, implementing cost awareness in your processes, generating reports and monitoring them for cost optimization, and quantifying business value from cost optimization.
- Implement cost allocation tags. You can use cost allocation tags for tracking costs and gaining insights into costs per team or application, etc. It is easy to overlook, but a well-thought-out tagging strategy is essential for cloud resource and cost tracking in larger organizations.
- Enforce cost controls and decommission unused infrastructure. Be aware of expenditures and usage by implementing cost controls, monitoring costs and usage, and decommissioning resources when they are no longer required.
- Compare costs rigorously. Evaluate the cost of resources when selecting services, and compare multiple services that offer similar capabilities, e.g., evaluating the use of Lambda, EC2, or Lightsail for workloads. Also, focus on choosing the correct resource type, size, and quantity to meet business needs without overspending.
- Manage supply and demand of resources. For example, customers often use Auto Scaling groups with their workloads to scale in and out with the load while keeping costs optimal.
- Review costs regularly. Optimize costs over time by having frequent reviews to analyze consumption.
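The value of cost allocation tags mentioned above becomes clear once costs are grouped by tag. The sketch below aggregates cost per team from a list of line items. The record layout (`cost_usd`, `tags`) and the `team` tag key are hypothetical and do not mirror the schema of any AWS billing export.

```python
from collections import defaultdict


def cost_by_tag(line_items, tag_key):
    """Sum cost per value of tag_key; resources without the tag go to 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += item["cost_usd"]
    return dict(totals)


# Hypothetical line items for illustration only.
line_items = [
    {"resource": "ec2-a", "cost_usd": 120.0, "tags": {"team": "payments"}},
    {"resource": "s3-b", "cost_usd": 30.5, "tags": {"team": "payments"}},
    {"resource": "rds-c", "cost_usd": 45.25, "tags": {"team": "search"}},
    {"resource": "ec2-d", "cost_usd": 10.0, "tags": {}},
]

if __name__ == "__main__":
    for team, total in sorted(cost_by_tag(line_items, "team").items()):
        print(f"{team}: ${total:.2f}")
```

The `untagged` bucket is a useful signal in its own right: a growing untagged total suggests the tagging strategy is not being enforced.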
AWS pillar #6: Sustainability
With the addition of sustainability, the AWS 5 pillars became the AWS 6 pillars. As defined by the United Nations World Commission on Environment and Development, sustainable development is “development that meets the needs of the present without compromising the ability of future generations to meet their own needs.” By adding this pillar, AWS emphasizes the importance of environmental awareness and energy footprint, even in the cloud.
Sustainability best practices
The focus of this pillar is to reduce the direct or indirect impact of businesses on the environment. The following are the design principles and best practices for the sustainability pillar:
- Consider your workload’s environmental impact. Understand the impact of your workloads and evaluate the future impact that they may have on the business and the environment. Formulate key performance indicators and evaluate alternatives to improve efficiency and reduce impact in the future.
- Right-size your instances. Maximize utilization of your workloads by choosing right-sized instances. This will increase resource utilization and reduce the energy consumption of underlying hardware at scale.
- Leverage efficient hardware and software. Research and adopt new improvements that reduce the impact of workloads. Periodically monitor and adopt new hardware and software that are more efficient and reduce the impact on the environment.
- Increase your use of managed services for economies of scale. Using managed services and shared cloud platforms can increase resource utilization and reduce environmental impact.
- Cut down on energy consumption. Reduce the downstream impact of your workloads by reducing the amount of energy or resources required to use your services. This may involve optimizing code, server consolidation, and implementing power management policies. For example, you can optimize code by reducing unnecessary computations, implement server consolidation to maximize resource utilization, and set up policies to turn off servers during low-usage periods or utilize energy-efficient hardware components.
- Take a strategic approach. Develop long-term sustainability goals to reduce your environmental impact. For example, set targets for compute and storage resources required per transaction.
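A per-transaction target like the one suggested above can be tracked with a small calculation. This is a minimal sketch: the metric names, the sample figures, and the target values are all hypothetical, and a real KPI would be derived from your own usage data.

```python
def resources_per_1k_transactions(vcpu_hours, storage_gb, transactions):
    """Normalize resource usage to a per-1,000-transaction figure."""
    if transactions <= 0:
        raise ValueError("transactions must be positive")
    return {
        "vcpu_hours_per_1k_tx": 1000 * vcpu_hours / transactions,
        "storage_gb_per_1k_tx": 1000 * storage_gb / transactions,
    }


def meets_target(measured, target):
    """True when every targeted metric is at or below its target value."""
    return all(measured[name] <= limit for name, limit in target.items())


if __name__ == "__main__":
    # Hypothetical monthly figures and targets, for illustration only.
    measured = resources_per_1k_transactions(
        vcpu_hours=720, storage_gb=500, transactions=2_000_000
    )
    target = {"vcpu_hours_per_1k_tx": 0.40, "storage_gb_per_1k_tx": 0.20}
    print(measured)
    print(meets_target(measured, target))
```

Normalizing by transactions keeps the KPI meaningful as traffic grows: total resource use may rise while the per-transaction figure, which reflects efficiency, falls.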
Best practices for following the AWS 6 pillars
The recommendations below can help organizations get the most out of the AWS 6 pillars.
Use the AWS Well-Architected Tool
AWS provides native tools to assist organizations with implementing best practices across all six pillars. It also provides the AWS Well-Architected Tool in the console to review your applications and workloads against the design principles and associated best practices and to identify improvement areas.
The AWS Well-Architected Tool console. (Source)
Use third-party governance tools
In addition to native tools, third-party tools can be used to assess workloads against these pillars and identify opportunities for improvement. For example, CoreStack provides FinOps, SecOps, and CloudOps capabilities that can help teams govern cloud operations according to AWS best practices, as well as assessments against all 6 pillars of the Well-Architected Framework.
Platform | Provisioning Automation | Security Management | Cost Management | Regulatory Compliance | Powered by Artificial Intelligence | Native Hybrid Cloud Support |
---|---|---|---|---|---|---|
Cloud Native Tools | ✔ | ✔ | ✔ | | | |
CoreStack | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Conclusion
The AWS Well-Architected Framework is designed to help customers build workloads that follow best practices to ensure business success. The AWS 6 pillars contain best practices related to multiple design principles. Workloads created following the AWS 6 pillars are more efficient, cost-effective, reliable, sustainable, and secure. Customers can build their organization's processes and procedures using AWS native tools and third-party tools to build controls specific to their needs. To get the most out of your AWS infrastructure, following the guidance in the AWS Well-Architected Framework is essential at all stages of your cloud development.