The 5 Pillars of Good Solution Architecture: Designing for Efficiency

Sergio Barbosa (CIO, Global Kinetic)


The wave of cloud computing that hit the tech industry during the first decade of the century brought with it the promise of reduced infrastructure costs through on-demand infrastructure utilization.  In layman’s terms, you only paid for the infrastructure that you used, for the time that you used it.  No longer did you have to purchase a powerful, expensive server upfront that was able to handle your system’s peak workloads, and then have it sit idle most of the time until it was needed.  With cloud computing, the promise was that you could run your maximum workloads on powerful servers for the one or two hours that you needed them, and then scale down to a small server for the rest of the time, drastically reducing your infrastructure costs.
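To make the promise concrete, here is a minimal sketch of the arithmetic in Python.  The hourly rates, the 730-hour month and the function names are all assumptions made up for illustration; real cloud pricing varies by provider, region and commitment model.

```python
HOURS_PER_MONTH = 730  # approximate average number of hours in a month


def always_on_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of keeping one server size running for the whole period."""
    return hourly_rate * hours


def scaled_cost(large_rate: float, small_rate: float, peak_hours: float,
                hours: float = HOURS_PER_MONTH) -> float:
    """Cost of running the large server only during peak hours and the
    small server for the remainder of the period."""
    if not 0 <= peak_hours <= hours:
        raise ValueError("peak_hours must be between 0 and hours")
    return large_rate * peak_hours + small_rate * (hours - peak_hours)


if __name__ == "__main__":
    # Hypothetical prices: large server 2.00/hour, small server 0.10/hour,
    # with two peak hours per day over a 30-day month (60 peak hours).
    fixed = always_on_cost(2.00)
    elastic = scaled_cost(2.00, 0.10, peak_hours=60)
    print(f"always on: {fixed:.2f}, scaled: {elastic:.2f}")
```

With these assumed figures, the always-on bill comes to 1,460 per month against 187 for the scaled setup.  The gap narrows quickly as the number of peak hours grows, which is why knowing your actual peak hours matters so much.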

That was easier said than done.  We quickly discovered that to achieve this you need system diagnostics that tell you when you need the big server, when you need the small one, and for how long.  That means building monitoring into your system from the outset, so that the system can give you the diagnostics you need to make infrastructure decisions.  But not all systems are that predictable.  There are four basic models, and a single system can combine several of them if it is a more modern, modular or microservices-based system.

The microservices that power a company’s finance department, for example, might have very specific, predictable demand at month end, when payments are made and reconciliation processes are run.  The microservices that power the onboarding of new customers, on the other hand, may have unpredictable demand, as external forces could drive new customer sign-ups that weren’t previously anticipated.
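Predictable demand like the month-end example can be handled with a simple schedule rather than reactive scaling.  The sketch below shows the idea in Python; the three-day window, the instance counts and the function names are assumptions for illustration, not figures from any real system.

```python
import calendar
from datetime import date


def is_month_end_window(day: date, window_days: int = 3) -> bool:
    """True if `day` falls within the last `window_days` days of its month."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    return day.day > last_day - window_days


def finance_instance_count(day: date, baseline: int = 1, peak: int = 4,
                           window_days: int = 3) -> int:
    """Number of instances to run for a (hypothetical) finance service:
    the peak count during the month-end window, the baseline otherwise."""
    return peak if is_month_end_window(day, window_days) else baseline


if __name__ == "__main__":
    for d in (date(2024, 2, 15), date(2024, 2, 28)):
        print(d.isoformat(), finance_instance_count(d))
```

A schedule like this only suits the predictable parts of a system.  Services with unpredictable demand, such as customer onboarding, need to scale on observed metrics instead.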

Some systems may require an on-premises component for whatever reason, and hybrid infrastructure architectures are very common.  In hybrid scenarios, it is important to ensure that your on-premises infrastructure does not become a bottleneck for your elastic cloud infrastructure.

A good way to approach cost efficiencies for a system is to organize the infrastructure being utilized.  In most cloud environments you can make use of subscriptions, resource groups and tags to assign resources to different cost centres within a large enterprise.  Organizing system resources like this shows you where the money is going, which helps you optimize the spend.  Optimizations can be done at an IaaS (Infrastructure as a Service) level with compute and storage provisioning, or at a PaaS (Platform as a Service) level with database, blob storage and orchestration services like Kubernetes provided on demand by most cloud providers.
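As a rough illustration of why tagging helps, the Python sketch below rolls up monthly costs by a tag.  The `Resource` class, the `cost_centre` tag key and the sample figures are hypothetical; in practice this data would come from your cloud provider’s billing export.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A billable resource with free-form tags (hypothetical shape)."""
    name: str
    monthly_cost: float
    tags: dict = field(default_factory=dict)


def cost_by_tag(resources, tag_key: str, untagged: str = "untagged") -> dict:
    """Sum monthly cost per value of `tag_key`; resources that lack the
    tag are grouped under the `untagged` label so they remain visible."""
    totals = defaultdict(float)
    for res in resources:
        totals[res.tags.get(tag_key, untagged)] += res.monthly_cost
    return dict(totals)


if __name__ == "__main__":
    inventory = [
        Resource("vm-recon", 100.0, {"cost_centre": "finance"}),
        Resource("db-ledger", 50.0, {"cost_centre": "finance"}),
        Resource("vm-signup", 80.0, {"cost_centre": "onboarding"}),
        Resource("old-test-vm", 20.0),  # nobody tagged this one
    ]
    print(cost_by_tag(inventory, "cost_centre"))
```

Grouping untagged resources under their own label is a deliberate choice: spend that nobody owns is often the easiest place to find savings.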

As mentioned before, monitoring is the key to understanding where you can optimize a system and make it more efficient from a cost and/or utilization perspective (we all want to save the planet, right?).  The formula is simple: Monitoring + Analytics = Insights.  Core system monitoring involves four specific things:

  • Activity monitoring – the monitoring of changes to the infrastructure. The more automated an infrastructure environment is, the more important it is to monitor changes to that infrastructure.  Additionally, if you have a distributed system administration team for a large system, changes made in one area can affect another, and without adequate monitoring you won’t know why the system is degrading until it is too late.
  • Service health checks – having a simple health check on each service or executable part of the system, so that you can programmatically determine what is running, when and for how long.
  • Metrics and diagnostics – a simple set of metrics with diagnostics reported from each service or executable part of the system can help you determine where performance issues are building up, so that you can automatically scale out/in and/or up/down.
  • Recommendations – use the recommendation tools that come included in IaaS and PaaS solutions and tweak the system’s configuration to see how it responds.  Do this incrementally, with small tweaks, so that you can gradually improve the overall efficiency of your system.
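The health check and metrics items above can feed a scaling decision directly.  The following Python sketch shows one possible shape for that logic; the `ServiceReport` structure, the CPU thresholds and the action names are assumptions for illustration, and a real system would normally rely on the autoscaling features of its cloud platform.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ServiceReport:
    """What a service reports about itself (hypothetical shape)."""
    name: str
    healthy: bool
    cpu_samples: list = field(default_factory=list)  # utilization, 0.0 to 1.0


def scaling_decision(report: ServiceReport,
                     scale_out_above: float = 0.75,
                     scale_in_below: float = 0.25) -> str:
    """Turn a health check plus recent CPU metrics into a suggested action."""
    if not report.healthy:
        # Scaling an unhealthy service tends to hide the real problem.
        return "investigate"
    if not report.cpu_samples:
        return "hold"  # no data yet, so change nothing
    average = mean(report.cpu_samples)
    if average > scale_out_above:
        return "scale_out"
    if average < scale_in_below:
        return "scale_in"
    return "hold"


if __name__ == "__main__":
    busy = ServiceReport("onboarding", healthy=True, cpu_samples=[0.9, 0.8, 0.85])
    print(busy.name, scaling_decision(busy))
```

Averaging several samples rather than reacting to a single reading keeps the system from scaling back and forth on short spikes.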

Now that you have your monitoring in place, you can start working on automation.  Automation can add incredible efficiencies to operations.  There are three main areas of automation that you can focus your energies on:

  • Automating Development and Test Environments – by automating the provisioning of Development and Test environments, you can spin up new environments quickly and take them down when they are not needed.
  • Automating Operational Tasks – creating automated builds for a test-driven code base that also includes automation tests can dramatically drive down operational costs. Additionally, you can become a lot more predictable in delivering enhancements to your system and improve its overall quality.
  • Infrastructure as Code (IaC) – avoid creating what are known as “Snowflakes” (hand-configured, one-of-a-kind environments) and instead write code for setting up networks, VMs, Load Balancers and Connections.  That way you can version control your infrastructure and, as with the rest of your system, make incremental improvements over time in a managed way.
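The core idea behind Infrastructure as Code is that you describe the infrastructure you want as data, keep that description in version control, and let tooling work out what has to change.  The Python sketch below is a toy version of that comparison step; the resource names and settings are hypothetical, and real IaC tools handle far more (dependencies, ordering, provider APIs).

```python
def plan_changes(desired: dict, actual: dict) -> dict:
    """Compare the desired infrastructure (from version control) with what
    is actually deployed and list the resources to create, update or delete.

    Both arguments map a resource name to a dict of its settings.
    """
    create = sorted(desired.keys() - actual.keys())
    delete = sorted(actual.keys() - desired.keys())
    update = sorted(name for name in desired.keys() & actual.keys()
                    if desired[name] != actual[name])
    return {"create": create, "update": update, "delete": delete}


if __name__ == "__main__":
    # Hypothetical environment definitions.
    desired = {
        "vnet-main": {"cidr": "10.0.0.0/16"},
        "vm-web": {"size": "small", "count": 2},
        "lb-web": {"port": 443},
    }
    actual = {
        "vnet-main": {"cidr": "10.0.0.0/16"},
        "vm-web": {"size": "small", "count": 1},
        "vm-handmade": {"size": "large", "count": 1},  # a "Snowflake"
    }
    print(plan_changes(desired, actual))
```

Because anything deployed but not described shows up under "delete", hand-built resources are surfaced rather than silently left running.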

Designing for efficiency up front can add immense cost savings to your solution in the long run.  Retrofitting metrics, diagnostics, health checks, automated tests and IaC into an existing code base is a near-impossible task, and the costs will undoubtedly outweigh the benefits.  Build these in upfront and reap the rewards.  Continue monitoring your system over time as system usage evolves and changes.  This way you will always be able to improve the efficiency of operations in the systems that you build.
