Infrastructure as Code: How to Balance Automation and Safety

Photo by Ankit Dembla on Unsplash

Infrastructure as code (IaC) brings automation to the way we manage infrastructure and build applications, but many engineers think the trade-offs are too steep. Does automation limit a developer’s control too severely? What happens if automation goes haywire?

Rosemary Wang is working to help engineers take advantage of IaC and automation while balancing safety and security. Through her book, Infrastructure as Code, Patterns and Practices, and her All Things Open (ATO) talk, Wang offers practices and patterns to help organizations successfully implement and scale IaC and recover fast in the event of an automation failure. In this conversation recorded at ATO, Wang shares tips for building a strong IaC foundation, planning for break-glass scenarios, and avoiding common automation pitfalls.

Listen to the full conversation. This conversation has been edited and condensed for brevity and clarity. 

In Case of Emergency

Katherine Druckman: Would you give us an overview of your talk here at ATO?

Rosemary Wang: I come from a mixed background of network engineering, software engineering, and a little bit of everything. As a software engineer, I wanted to automate everything because it makes things easier. People had a fear of automation going wrong, and we never discussed what it meant to break glass, meaning that if automation did go wrong, in case of emergency, we had to get in and do something. We were so busy building that we never thought about how we’d give ourselves an escape hatch if automation was problematic or we had a failure that we couldn’t fix by automation.

Anybody who does automation quickly realizes that the patterns you use to automate something don’t apply in certain scenarios. For example, let’s say a database just goes horribly wrong. No amount of automation is going to fix your old database. Maybe you can create a new one and migrate all the data over, but even that isn’t fully automated. My talk is about the patterns and the frameworks we need to fix our systems, even when automation isn’t there for us; how to identify when automation has gone wrong; and where we can explore manual access to a system, safely and securely. 

Common Automation Pitfalls

Katherine Druckman: What do people get wrong when they’re automating various software engineering systems? 

Rosemary Wang: The one rule we forget when we automate is idempotence, the idea that you can reapply the automation and the same state will exist. For example, a network engineer could run a script that says, “Make this interface and restart the switch so that the new configuration applies.” However, running that script again will configure the interface again and restart the switch again, which disrupts everything. Why should I restart the switch if the configuration is already correct? Idempotence says that if the configuration of something (such as software, infrastructure, or platform resources) is the same, I don’t need to rerun that automation or change the state of the system. We’re so busy writing code that we forget there’s a preexisting state. It’s the same with create, read, update, delete (CRUD) operations. If you don’t read to determine what the state is before you update, you’re just going to reapply the same configuration and potentially disrupt the system. This applies to any kind of automation you do, and I call it the golden rule of automation.
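
The read-before-update idea in the switch example can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk or the book: FakeSwitch, apply_interface, and the interface settings are all hypothetical names, and the switch is an in-memory stand-in rather than a real device API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSwitch:
    """Hypothetical in-memory stand-in for a network switch."""
    interfaces: dict = field(default_factory=dict)
    restarts: int = 0

    def read_interface(self, name):
        return self.interfaces.get(name)

    def write_interface(self, name, config):
        self.interfaces[name] = dict(config)

    def restart(self):
        self.restarts += 1


def apply_interface(switch, name, desired):
    """Configure and restart only when the current state differs."""
    current = switch.read_interface(name)  # read before update
    if current == desired:
        return False  # already correct: no change, no restart
    switch.write_interface(name, desired)
    switch.restart()
    return True


switch = FakeSwitch()
desired = {"vlan": 10, "mtu": 1500}  # assumed example settings
first = apply_interface(switch, "eth0", desired)
second = apply_interface(switch, "eth0", desired)
print(first, second, switch.restarts)  # True False 1
```

Running the automation a second time leaves the switch untouched, which is the property the golden rule asks for.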

The second most common pitfall I see is a lack of testing. There’s only so much you can do to test your automation, but if you can break it down into the four CRUD operations and test across them, it tends to yield stable automation overall. You’re still going to get edge cases where your automation breaks because someone is doing an unexpected combination or configuration, but for the most part, you’ll build resilient and stable automation.
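
One way to picture testing across the four CRUD operations is one small test per operation. The sketch below assumes a hypothetical InMemoryInfra class standing in for whatever infrastructure API your automation calls; the resource name and settings are invented for illustration.

```python
class InMemoryInfra:
    """Hypothetical stand-in for an infrastructure API."""

    def __init__(self):
        self._resources = {}

    def create(self, name, config):
        if name in self._resources:
            raise ValueError(f"{name} already exists")
        self._resources[name] = dict(config)

    def read(self, name):
        return self._resources.get(name)

    def update(self, name, config):
        if name not in self._resources:
            raise KeyError(name)
        self._resources[name].update(config)

    def delete(self, name):
        self._resources.pop(name, None)  # deleting twice is a no-op


def test_create():
    infra = InMemoryInfra()
    infra.create("db", {"size": "small"})
    assert infra.read("db") == {"size": "small"}


def test_read_missing():
    assert InMemoryInfra().read("db") is None


def test_update():
    infra = InMemoryInfra()
    infra.create("db", {"size": "small"})
    infra.update("db", {"size": "large"})
    assert infra.read("db") == {"size": "large"}


def test_delete_twice():
    infra = InMemoryInfra()
    infra.create("db", {"size": "small"})
    infra.delete("db")
    infra.delete("db")  # second delete must not raise
    assert infra.read("db") is None


def run_all():
    tests = [test_create, test_read_missing, test_update, test_delete_twice]
    for test in tests:
        test()
    return len(tests)


passed = run_all()
print(f"{passed} CRUD tests passed")
```

In a real project the same four tests would run against a test account or a mocked API, and a test runner such as pytest would replace run_all.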

Katherine Druckman: Once you have your automation in place, what are some areas you need to frequently revisit to figure out what’s going wrong and what needs to be improved? 

Rosemary Wang: Usually it’s old automation. Automation itself is software, and software has entropy over time. You’re going to have different patterns, applications, and infrastructure APIs running in your system. Eventually you’ll hit an edge case if you don’t iteratively audit what your automation is doing and who has access to which parts of your system in your automation, because even automation should have least-privilege access. You don’t want your automation having admin access across the board; access should change no matter what kind of software you’re writing. Look carefully at entropy and access over time.
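
A periodic access audit can start as simply as comparing what the automation is allowed to do with what it actually did. The sketch below is an assumption-heavy illustration: the permission strings are made up, and in practice the two sets would come from your access policy and your audit logs.

```python
def unused_permissions(granted, used):
    """Return permissions the automation holds but never exercised."""
    return sorted(set(granted) - set(used))


# Hypothetical permission names, for illustration only.
granted = {"network:read", "network:write", "db:admin"}
used = {"network:read", "network:write"}

extra = unused_permissions(granted, used)
print(extra)  # ['db:admin']
```

Anything in the result is a candidate for removal the next time the automation's access is reviewed.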

Advice for Beginners

Katherine Druckman: Do you have any advice for people who are just getting into the IaC field? 

Rosemary Wang: Learn the software development practices. You don’t have to be the best software engineer or program in the most up-to-date languages, but recognize what practices people use to make and push software to production. A lot of that maps to the infrastructure and IaC we have today, and those are the foundations that provide a better way to step up your infrastructure automation game.

I also recommend learning to use declarative language. Declarative vs. imperative is a big argument in software in general. Declarative describes the end state: You say, “I want XYZ,” and the system will try to get to XYZ in whichever way it chooses via automation and IaC. Usually we program in an imperative way, but we’re moving toward a declarative configuration in Kubernetes and many other tools. We’re getting to a place where we want someone else to write all the logic and automation for us to get to the right end state. You need to know how a declarative tool works, but you also need to understand the underlying imperative logic because it’s not always going to go right. Declarative logic will only get you so far, and eventually you have to debug further down the stack.
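
To make the relationship between the two styles concrete, here is a minimal sketch of how a declarative tool might turn a desired end state into imperative steps. It is not how Kubernetes or any specific tool is implemented; the plan function and the resource names are hypothetical.

```python
def plan(desired, actual):
    """Return the imperative steps needed to move actual toward desired."""
    steps = []
    for name, config in sorted(desired.items()):
        if name not in actual:
            steps.append(("create", name))
        elif actual[name] != config:
            steps.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            steps.append(("delete", name))
    return steps


# Declarative input: only the end state is described.
desired = {"web": {"replicas": 3}, "cache": {"replicas": 1}}
# What currently exists in the (hypothetical) environment.
actual = {"web": {"replicas": 2}, "old-job": {"replicas": 1}}

steps = plan(desired, actual)
print(steps)
# [('create', 'cache'), ('update', 'web'), ('delete', 'old-job')]
```

When a declarative tool misbehaves, the list of steps it derived is usually where debugging begins, which is why understanding the imperative layer still matters.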

Katherine Druckman: Software is growing more complex, and it seems impossible to see the big picture. Do you have any advice for finding your space in the software life cycle? 

Rosemary Wang: Asking questions and being open-minded across different parts of the software development life cycle can help you get a better end-to-end view. When you’re in infrastructure, it’s easy to say, “I’m going to write this and deploy it in my environment,” but sometimes it’s more important to understand how developers are planning to deploy their applications. You’re designing for other people like developers and security engineers, not deploying infrastructure for the sake of deploying infrastructure. People are using it. If you think about infrastructure as a product, it really helps you recognize where you need to investigate further and how to get a broader sense of the system.

More Tools to Explore

Katherine Druckman: What else are you excited about in the open source world more generally?  

Rosemary Wang: I have too many interests and not enough time to explore them, but one of my interests is endpoint access and the thought process behind local to remote development. When I was a software engineer, I always had an issue with how to develop locally and then push it remotely and make sure it still works. It’s everybody’s dream. There’s also a cost aspect. Many people don’t want their development environment anymore, so they can cut costs, optimize all their resources, and have everybody do everything locally.  

There are some interesting open source toolsets around endpoint access. They’re called privileged access management tools, and I think they’re more dynamic than traditional tools. They help you treat a remote environment like an extension of your local machine. It’s popular in the Kubernetes space. You can spin up Kubernetes locally and do a bunch of stuff with it if you wanted to, but you could also have a remote cluster, access it as if it’s a remote endpoint, and it can behave as if it’s local. You don’t use your resources as a developer locally, and you could also optimize the resources in a development environment without thinking about it.  

There are also cool things people are doing with the Envoy proxy. I highly recommend it to anyone in the networking space. You can also do, let’s say, shadowing of different data, and you can split traffic across different areas. So if you’re a developer who wants to use data from a certain database or something, you can use proxies to split traffic. There are a lot of interesting patterns that no one’s really explored.
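
The two proxy patterns mentioned here, traffic splitting and shadowing, can be illustrated with a toy router. This is plain Python, not Envoy configuration: the backend names, the 90/10 weights, and the route and handle functions are all assumptions made for the sketch.

```python
def route(request_id, weights):
    """Pick a backend by weighted split; weights are percentages summing to 100."""
    bucket = request_id % 100
    threshold = 0
    for backend, weight in weights:
        threshold += weight
        if bucket < threshold:
            return backend
    raise ValueError("weights must sum to 100")


def handle(request_id, weights, shadow_log):
    """Mirror every request to a shadow target, then route the real one."""
    shadow_log.append(request_id)  # shadow copy; its response would be discarded
    return route(request_id, weights)


weights = [("stable", 90), ("canary", 10)]  # hypothetical backends
counts = {"stable": 0, "canary": 0}
shadow_log = []

for request_id in range(1000):
    counts[handle(request_id, weights, shadow_log)] += 1

print(counts)  # {'stable': 900, 'canary': 100}
print(len(shadow_log))  # 1000
```

A real proxy would typically hash a header or pick randomly rather than use a sequential ID, but the split-then-mirror shape is the same.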

Build a Strong IaC Foundation

Katherine Druckman: Will you tell us about your book? 

Rosemary Wang: The book is called Infrastructure as Code, Patterns and Practices. I wrote it because every time I talk to someone who is new to the space or is at an organization that introduces IaC, they’re struggling with how to scale it, how to collaborate on it, or how to implement the minimum foundations you need to succeed. The book explores what you, your team, and your organization can do to implement and scale a successful IaC practice. It also covers the technical aspects, such as clean IaC and what it means to do CI/CD for IaC. It goes all the way to security, cost optimization, and IaC and toolset upgrades. It’s not easy, and it’s always evolving, but hopefully it offers a foundation for you to build on over time.

To hear more of this conversation and others, subscribe to the Open at Intel podcast.

About the Author

Katherine Druckman, Open Source Evangelist, Intel 

Katherine Druckman, an Intel open source evangelist, hosts the podcasts Open at Intel, Reality 2.0, and FLOSS Weekly. A security and privacy advocate, software engineer, and former digital director of Linux Journal, she’s a longtime champion of open source and open standards. 

Rosemary Wang, Developer Advocate, HashiCorp 

As the author of Infrastructure as Code, Patterns and Practices, Rosemary Wang works to bridge the technical and cultural barriers between infrastructure, security, and application development. She has a fascination with solving intractable problems as a contributor, public speaker, writer, and advocate of open source infrastructure tools. When she is not drawing on whiteboards, Rosemary debugs stacks of various infrastructure systems on her laptop while watering her houseplants.