AZ-305 Azure Solutions Architect Expert learning pathways (part 2)
Design infrastructure solutions is the fourth learning path (the first three are covered in part 1 of this article). It is the bulkiest of the paths and contains four modules: design an Azure compute solution, design an application architecture, design network solutions, and design migrations.
It starts by presenting a decision flowchart to select a candidate compute service. The term compute refers to the hosting model for the resources that your application runs on. The following table contains references to all the compute services covered in the learning path:
Service | Hosting model | When to use | Things to consider |
---|---|---|---|
Azure Virtual Machines | IaaS | When you need full control over the OS, runtime, and software stack | Requires management of OS, updates, and scaling; more operational overhead |
Azure Batch | IaaS | For large-scale parallel and high-performance computing (HPC) workloads | Needs job and pool configuration; pricing based on VM usage; fewer jobs, more tasks |
Azure App Service | PaaS | For hosting web apps, REST APIs, or mobile backends with minimal setup | Limited control over underlying infrastructure; supports auto-scaling, CI/CD, EasyAuth |
Azure Container Instances | PaaS | When you need to quickly run containers without managing servers | Good for short-lived, isolated container workloads; not suited for complex orchestration |
Azure Kubernetes Service | PaaS | For orchestrating containers at scale with Kubernetes | Requires some Kubernetes knowledge; powerful but more complex to manage |
Azure Functions | PaaS | For event-driven, serverless applications or background jobs | Best for small, stateless tasks; cold start latency can be a factor; code-first, compute-on-demand; Durable Functions |
Azure Logic Apps | SaaS | For automating workflows and integrating services with minimal code | Limited customization; ideal for business process automation and integration; part of Azure Integration Services |
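
The table above can be read as a rough decision procedure. The following is a minimal sketch of that reading, not the flowchart from the learning path: the `Workload` fields and the order of the checks are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload; field names are illustrative."""
    needs_full_os_control: bool = False
    is_hpc_batch: bool = False
    is_containerized: bool = False
    needs_orchestration: bool = False
    is_workflow_integration: bool = False
    is_event_driven: bool = False


def suggest_compute_service(w: Workload) -> str:
    """Return a candidate service, following the 'When to use' column."""
    if w.needs_full_os_control:
        return "Azure Virtual Machines"
    if w.is_hpc_batch:
        return "Azure Batch"
    if w.is_containerized:
        # Orchestration at scale points to AKS; otherwise run plain containers.
        if w.needs_orchestration:
            return "Azure Kubernetes Service"
        return "Azure Container Instances"
    if w.is_workflow_integration:
        return "Azure Logic Apps"
    if w.is_event_driven:
        return "Azure Functions"
    # Default for web apps, REST APIs, and mobile backends.
    return "Azure App Service"
```

For example, `suggest_compute_service(Workload(is_containerized=True))` returns Azure Container Instances, while a workload with no special requirements falls through to Azure App Service. A real selection would also weigh cost, team skills, and scaling needs.
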
Next, it touches on application architecture, namely: planning how the app components will communicate through messages and events, improving performance through caching, creating a front door for API management and security, automating app deployments through IaC, and decoupling configuration from code through configuration management. The following table contains references to all the application architecture services covered in the learning path:
Service | When to use | Things to consider | Example scenarios |
---|---|---|---|
Azure Queue Storage | For simple, persistent message queuing between application components (strict FIFO ordering is not guaranteed) | Best for basic messaging needs; limited features compared to Service Bus | Decoupling backend services; buffering workloads like image processing tasks |
Azure Service Bus | For enterprise-grade messaging with advanced features like dead-lettering | More complex; supports queues and topics; higher cost than Queue Storage | Order processing systems; integration with legacy enterprise apps |
Azure Event Hubs | For ingesting large streams of telemetry or event data at high throughput | Designed for big data scenarios; integrates well with real-time analytics tools | IoT telemetry ingestion; application performance monitoring |
Azure Event Grid | For reactive, event-driven architectures using pub/sub model | Serverless; low latency; supports Azure and custom event sources; runs on Azure Service Fabric | Triggering automation on blob upload; CI/CD integration; serverless workflows |
Azure Cache for Redis | For caching frequently accessed data to improve performance and reduce load | In-memory store; requires capacity planning; cost varies by size and features | Caching product catalog data; session state storage in web applications |
Azure API Management | For publishing, securing, and monitoring APIs for internal or external use | Includes rate limiting, authentication, and analytics; can be complex to set up; useful when the number of APIs you manage increases | Creating developer portals; API gateways for microservices |
Azure Resource Manager templates (with Bicep) | For deploying and managing Azure resources as code | Bicep is a more readable DSL than raw ARM JSON; requires understanding of Azure resource groups | Infrastructure as Code (IaC) for consistent, repeatable environment deployments |
Azure Automation | For automating manual, long-running, or scheduled tasks in Azure | Supports PowerShell and Python; can manage both Azure and on-prem systems | Automating VM maintenance; patch management; backup scheduling |
Azure App Configuration | For managing application settings and feature flags centrally | Integrates with Azure services; secure key-value storage | Centralized config for distributed apps; toggling features via feature flags |
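
To illustrate the caching row, here is a minimal sketch of the cache-aside pattern. It uses an in-process dictionary as a stand-in for a cache such as Azure Cache for Redis; the `CacheAside` class, the `load_product` loader, and the TTL value are all hypothetical and only show the read path.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """In-process stand-in for an external cache (assumption for this sketch)."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 60.0):
        self._loader = loader  # slow source of truth, e.g. a database query
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}
        self.misses = 0

    def get(self, key: str) -> str:
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the backing store
        self.misses += 1
        value = self._loader(key)  # cache miss: load, then populate the cache
        self._store[key] = (now + self._ttl, value)
        return value


def load_product(product_id: str) -> str:
    # Hypothetical loader standing in for a product catalog lookup.
    return f"product-{product_id}"


cache = CacheAside(load_product, ttl_seconds=30)
first = cache.get("42")   # miss: calls the loader
second = cache.get("42")  # hit: served from the cache
```

The expiry time is what makes capacity planning and staleness a trade-off: a longer TTL reduces load on the backing store but serves older data.
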
The next topic covered in the learning path is designing network solutions based on workload requirements. Several requirements need to be considered, starting with naming conventions, regions and subscriptions, IP address ranges, and network segmentation. The best practices it references are: keep IP address ranges from overlapping, create segmentation with Azure Firewall, segment subnets by application layer, filter network traffic with NSGs, and implement hub-spoke topologies.
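
The non-overlap rule for IP address ranges is easy to check before any network is created. The sketch below uses Python's standard `ipaddress` module; the network names and CIDR ranges in `plan` are made up for illustration.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def find_overlaps(ranges: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return pairs of network names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [
        (a, b)
        for a, b in combinations(networks, 2)
        if networks[a].overlaps(networks[b])
    ]


# Hypothetical address plan; names and ranges are illustrative only.
plan = {
    "hub-vnet": "10.0.0.0/16",
    "spoke-app": "10.1.0.0/16",
    "spoke-data": "10.1.128.0/17",  # falls inside spoke-app
    "on-premises": "192.168.0.0/24",
}
overlaps = find_overlaps(plan)
```

Here `overlaps` contains the single pair `("spoke-app", "spoke-data")`, which would have to be fixed before the two networks could be peered or connected to each other.
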
It mentions three common networking patterns for organizing workloads in Azure, summarized in a comparison table:
Compare | Single virtual network | Multiple networks with peering | Multiple networks in hub-spoke topology |
---|---|---|---|
Connectivity/Routing | System routing provides default connectivity to any workload in any subnet. | System routing provides default connectivity to any workload in any subnet. | No default connectivity between spoke virtual networks. A layer 3 router (such as Azure Firewall) in the hub virtual network is required to enable connectivity. |
Network-level traffic filtering | Traffic is allowed by default. NSG can be used for filtering. | Traffic is allowed by default. NSG can be used for filtering. | Traffic between spoke virtual networks is denied by default. Azure Firewall configuration can enable selected traffic, such as windowsupdate.com. |
Centralized logging | NSG logs for the virtual network. | Aggregate NSG logs across all virtual networks. | Azure Firewall logs to Azure Monitor all accepted/denied traffic sent via a hub. |
Unintended open public endpoints | DevOps can accidentally open a public endpoint via incorrect NSG rules. | DevOps can accidentally open a public endpoint via incorrect NSG rules. | A spoke virtual network open port doesn't allow access. The return packet is dropped via stateful firewall (asymmetric routing). |
Application level protection | NSG provides network layer support only. | NSG provides network layer support only. | Azure Firewall supports FQDN filtering for HTTP/S and MSSQL for outbound traffic and across virtual networks. |
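Azure uses route tables to route traffic between virtual network subnets, your on-premises internal resources, and external internet resources. A route table contains several types of routes, including system routes, service endpoint routes, and subnet defaults. When more than one route matches a destination, Azure picks the route with the longest matching prefix; if the prefixes are the same, user-defined routes take precedence over BGP routes, which take precedence over system routes. The table also has route entries for the Border Gateway Protocol (BGP), user-defined routes (UDRs), and routes from other virtual networks: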
Route Type | Description | When to Use | Routing Priority Order (Lowest to Highest) |
---|---|---|---|
System Routes | Default routes automatically created by Azure for basic connectivity | Always present; used for communication within a VNet, to on-premises, or internet | 1 – Used unless overridden by more specific routes |
Service Endpoints | Routes added when service endpoints are enabled on a subnet | To optimize traffic to Azure services (e.g., Storage, SQL) by keeping it within the Azure backbone | 2 – Preferred over system routes for Azure services |
Subnet Default Routes | Automatically created for each subnet (e.g., local subnet prefix) | For internal subnet-to-subnet communication within a VNet | 3 – Acts as baseline connectivity |
BGP Route Entries | Propagated by Azure VPN Gateway or ExpressRoute connections | When using VPN/ExpressRoute to connect Azure to on-prem networks | 4 – Override system routes if more specific |
User-Defined Routes (UDRs) | Custom routes you define for specific traffic control | To force traffic through NVA, Azure Firewall, or control custom routing scenarios | 5 – Override system and BGP routes |
Routes from Other VNets (Peering) | Routes learned from peered virtual networks | To enable cross-VNet communication via VNet peering | Added as system routes for the peered address space; a UDR with a matching prefix takes precedence |
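
Route selection can be sketched as follows. This is a simplified model, assuming the general rule that the longest prefix match wins and that ties between identical prefixes are broken in the order user-defined, BGP, system. It ignores exceptions such as system routes for virtual network, peering, and service endpoint traffic being preferred over more specific BGP routes. The `Route` type and the sample routes are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional

# Tie-break order when prefixes are equally specific: lower value wins.
SOURCE_PRIORITY = {"user": 0, "bgp": 1, "system": 2}


@dataclass
class Route:
    prefix: str
    next_hop: str
    source: str  # "user", "bgp", or "system"


def select_route(routes: List[Route], destination: str) -> Optional[Route]:
    """Pick a route: longest matching prefix first, then route source."""
    dest = ipaddress.ip_address(destination)
    matching = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not matching:
        return None
    return min(
        matching,
        key=lambda r: (
            -ipaddress.ip_network(r.prefix).prefixlen,
            SOURCE_PRIORITY[r.source],
        ),
    )


# Hypothetical route table; prefixes and next hops are illustrative.
routes = [
    Route("0.0.0.0/0", "Internet", "system"),
    Route("0.0.0.0/0", "VirtualAppliance", "user"),
    Route("10.0.0.0/16", "VnetLocal", "system"),
    Route("172.16.0.0/12", "VirtualNetworkGateway", "bgp"),
]
```

With these routes, traffic to `8.8.8.8` is sent to the virtual appliance because the UDR overrides the system default route with the same prefix, while traffic to `10.0.1.4` stays inside the virtual network because the `/16` prefix is more specific than either default route.
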
The following services connect an on-premises network to Azure resources over an always-on connection, an arrangement also known as a hybrid cloud network:
Solution | Benefits | Challenges | Common Scenarios |
---|---|---|---|
Azure VPN Gateway | Quick to set up; cost-effective; suitable for site-to-site, point-to-site, and VNet-to-VNet VPNs | Bandwidth limited (~1.25 Gbps); subject to internet reliability and latency | Small to medium businesses; temporary or dev/test connections |
Azure ExpressRoute | Private connection (not over internet); high bandwidth (up to 100 Gbps); SLA-backed | More expensive; requires provider coordination; longer setup time | Enterprise-grade workloads; data-sensitive or latency-sensitive apps |
ExpressRoute + VPN Failover | High availability with redundancy; VPN can route traffic during ExpressRoute outages | Adds complexity; requires careful route configuration and monitoring | Mission-critical systems requiring high uptime; disaster recovery scenarios |
Azure Virtual WAN + Hub-Spoke | Centralized connectivity and security; simplifies global routing; integrated with Microsoft backbone | More complex design; requires Azure Firewall/Network Virtual Appliances for filtering | Large enterprises with multiple branches or regions; unified security and routing |
Azure offers load-balancing services for distributing workloads across multiple computing resources. These services can be categorized across two dimensions:
- Global or Regional
- HTTP(S) or non-HTTP(S)
The following table contains a summary of these load-balancing options:
Service | Purpose | Layer | Key Features | When to Use | Works With |
---|---|---|---|---|---|
Azure Front Door | Global HTTP(S) load balancing and acceleration | 7 | SSL offloading; URL-based routing; WAF; caching; geo-routing; instant failover | To improve performance and reliability of global web apps; edge-based routing and WAF | Application Gateway (for regional routing); Traffic Manager (legacy or fallback) |
Traffic Manager | DNS-based global traffic distribution | 7 (DNS-level) | Priority, weighted, or geographic routing; no SSL termination; no path-based routing | When you need simple, DNS-based failover or multi-region routing without inspecting HTTP traffic | Load Balancer or Application Gateway (for per-region load balancing) |
Azure Load Balancer | High-performance, low-latency load balancing | 4 | TCP/UDP support; zone-redundant; inbound/outbound NAT; high throughput | For internal or external L4 traffic; VMs; AKS nodes; NVA scenarios | Application Gateway (L7 on top); Traffic Manager (for cross-region failover) |
Application Gateway | Load balancing with application intelligence | 7 | URL/path-based routing; SSL offloading; WAF; cookie affinity; session persistence | For web apps needing advanced HTTP/S routing or security controls | Front Door (for global entry); Load Balancer (for backend VMs); Traffic Manager |
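
The two dimensions map onto the four services in a straightforward way. The lookup below is a simplified sketch of that mapping as read from the table, not an official decision tree; the function and constant names are hypothetical.

```python
from typing import Dict, Tuple

# (is_global, is_http) -> candidate service, following the summary table.
LOAD_BALANCING_OPTIONS: Dict[Tuple[bool, bool], str] = {
    (True, True): "Azure Front Door",
    (True, False): "Traffic Manager",
    (False, True): "Application Gateway",
    (False, False): "Azure Load Balancer",
}


def suggest_load_balancer(is_global: bool, is_http: bool) -> str:
    """Return a first candidate; real designs often combine several services."""
    return LOAD_BALANCING_OPTIONS[(is_global, is_http)]
```

For example, `suggest_load_balancer(is_global=False, is_http=True)` returns Application Gateway. As the last column of the table shows, the services are frequently layered, such as Front Door in front of regional Application Gateways.
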
To protect your network resources, you can use one or a combination of the following services:
Service | Purpose | Key Features | When to Use | Works with |
---|---|---|---|---|
Azure DDoS Protection | Mitigates large-scale Distributed Denial of Service attacks | Always-on traffic monitoring; auto mitigation; telemetry; cost protection | For public-facing endpoints (e.g., web apps, APIs) requiring resilience against volumetric attacks | Azure Firewall; App Gateway (WAF); Load Balancer |
Azure Firewall | Centralized, fully managed L3-L7 traffic inspection | Application and network rules; FQDN filtering; threat intelligence; logging | For controlling and inspecting outbound/inbound traffic between subnets, VNets, or to the internet | NSGs (for subnet-level control); Virtual WAN; Private Link; WAF |
Private Link | Securely connects to Azure services over private IP | Bypasses public internet; supports PaaS services and private endpoints | When secure access to Azure PaaS (e.g., Storage, SQL) over internal VNet is required | NSGs; Azure Firewall; App Gateway |
Web Application Firewall (WAF) | Protects web applications from OWASP threats | Pre-configured rule sets; custom rules; integrated with App Gateway and Front Door | For HTTP/S security at application layer; public-facing apps vulnerable to web attacks | Application Gateway; Azure Front Door; DDoS Protection |
Network Security Groups (NSGs) | Control traffic at subnet and NIC level | Allow/deny rules based on IP, port and protocol; stateful | For enforcing basic access controls within a VNet at the subnet or VM level | Azure Firewall (for central policy); Service Endpoints; Private Link |
Service Endpoints | Extend VNet access to Azure services over the Azure backbone | Optimized routing; no public IP needed; improved security | When VMs in a VNet need fast and secure access to PaaS services (e.g., Storage, SQL) | NSGs (to control endpoint access); Azure Firewall; Private Link (for more isolation) |
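
As an illustration of the allow/deny rules in the NSG row, the sketch below evaluates rules in priority order, where a lower number is processed first and the first match decides. It is a simplified model: the `Rule` type and the sample rules are hypothetical, and treating unmatched traffic as denied is a simplification of the default rules that a real NSG carries.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass
class Rule:
    priority: int     # lower number = evaluated first
    source_cidr: str
    port: int
    protocol: str     # "Tcp" or "Udp"
    allow: bool


def is_allowed(rules: List[Rule], source_ip: str, port: int, protocol: str) -> bool:
    """Return the decision of the first matching rule, in priority order."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if (
            ip in ipaddress.ip_network(rule.source_cidr)
            and rule.port == port
            and rule.protocol == protocol
        ):
            return rule.allow
    return False  # simplification: treat unmatched traffic as denied


# Hypothetical inbound rules for a web subnet.
rules = [
    Rule(200, "0.0.0.0/0", 443, "Tcp", allow=True),
    Rule(100, "203.0.113.0/24", 443, "Tcp", allow=False),
]
```

With these rules, HTTPS traffic from `203.0.113.7` is denied because the priority 100 rule matches before the broader allow rule, while HTTPS traffic from `198.51.100.9` is allowed.
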
The last topic in this learning pathway is designing suitable migration strategies for on-premises workloads. It references the Cloud Adoption Framework migration model, which starts with the Prepare phase and then feeds into the Assess/Deploy/Release cycle. The advice is to:
- start small - find a mixture of simple and complex workloads that you want to migrate (no more than 10)
- choose between the broad strategy patterns for migrating workloads to the cloud: rehost (lift and shift), refactor (repackaging), rearchitect, rebuild
- migrate the workloads and decommission on-premises infrastructure
- optimize the migrated workloads - analyze migration costs and plan for reducing costs
- monitor the workloads
- repeat until no other workloads need migration
The following services can be used to migrate various workloads from on-premises to Azure; the ones that are linked are mentioned in the learning pathway:
Migration Type | Azure Services | Description | When to Use |
---|---|---|---|
Structured Data (Databases) | Azure Database Migration Service (DMS); Azure Data Factory; SQL Server Migration Assistant (SSMA) | DMS handles homogeneous/heterogeneous migrations; ADF supports ETL workflows | Migrating SQL Server, Oracle, MySQL, PostgreSQL to Azure SQL Database or Azure Database for PostgreSQL |
Unstructured Data | AzCopy; Azure Storage Migration tools; Azure Data Box | AzCopy for online transfers; Data Box for large datasets | Moving file shares, images, logs, backups to Azure Blob/File storage |
Offline Data | Azure Data Box; Azure Import/Export; Azure Stack Edge | For secure, large-volume offline data transfer via physical devices | When network transfer is impractical or too slow due to size (e.g., TBs or PBs of data) |
Servers (Physical/VMs) | Azure Migrate; Azure Site Recovery | Azure Migrate assesses and moves VMware/Hyper-V/physical servers; ASR supports real-time replication | For lift-and-shift of on-premises VMs/servers to Azure IaaS |
Web Applications | App Service Migration Assistant; Azure Migrate; GitHub Actions/DevOps pipelines | Analyze and move ASP.NET, PHP, or Node.js apps to Azure App Service or Containers | For modernizing or rehosting web apps to Azure PaaS |
Virtual Desktops | Azure Virtual Desktop (AVD) Migration Toolkit; FSLogix; Azure Migrate | Tools to assess and migrate existing VDI environments to AVD | For moving Citrix, RDS, or on-prem VDI workloads to Azure Virtual Desktop |
You can often combine these tools during large-scale migration projects using Azure Migrate as a central hub, especially when moving multiple asset types together (servers, data, and apps).
The last two learning paths are covered in part 3 and part 4 of this article.