What is configuration data?

Not sure what configuration data is?

To start with, configuration data is everywhere. It runs through every stage of your application lifecycle and across all your environments. But what exactly is it?

A successful software application relies on settings, parameters and security information. Taken together, that's its configuration data.

What's more, people keep it in silos. That means each application, release or environment holds its configuration data separately. On top of that, the network and infrastructure that support the application have their own category of configuration data.

As a result, the application has its own set of configuration data files for every phase of the delivery process.

Most of the time, there's no software in place to automatically navigate these silos and bring all those items together. Therefore, a software engineer doesn't know where in the pipeline a piece of configuration data is coming from or going to.
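As a rough illustration of what bringing those silos together could look like, the sketch below flattens per-environment settings into one view that records where each value lives. The environment names, keys and values are all invented for the example. Real silos would be files, vaults or tool-specific stores rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical silos: each environment keeps its own copy of the settings.
silos = {
    "development": {"db.host": "dev-db.internal", "heap.size": "512m"},
    "test": {"db.host": "test-db.internal", "heap.size": "512m"},
    "production": {
        "db.host": "prod-db.internal",
        "heap.size": "2g",
        "lb.method": "round_robin",
    },
}


def consolidate(silos):
    """Build one view of the form key -> {environment: value}."""
    view = defaultdict(dict)
    for environment, settings in silos.items():
        for key, value in settings.items():
            view[key][environment] = value
    return dict(view)


def missing_keys(view, environments):
    """List, per key, the environments where that key is not defined."""
    return {
        key: sorted(set(environments) - set(values))
        for key, values in view.items()
        if set(values) != set(environments)
    }


view = consolidate(silos)
gaps = missing_keys(view, silos.keys())
```

Here `gaps` shows that the made-up `lb.method` setting exists only in production, which is the kind of difference that stays hidden while each silo is looked at on its own.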

To put it another way, is it going from Development to Production? Or is it coming back from Operations to Development again? The end result of all that confusion is a lot of untracked and unvalidated configuration data.

However, once you put it in a data model, you immediately see what's validated, tracked and secure, as well as what's not. Most importantly, people can then correct it right away.
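To make the idea of a data model a little more concrete, here is a minimal sketch. It assumes each item carries simple flags for validation, tracking and secret handling. The class name, field names and sample values are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass
class ConfigItem:
    key: str
    value: str
    validated: bool = False  # has the value passed a validation rule?
    tracked: bool = False    # is the item under change tracking?
    secret: bool = False     # does the value need protecting?
    encrypted: bool = False  # is a secret value stored encrypted?

    def problems(self):
        """Return the reasons this item still needs attention."""
        issues = []
        if not self.validated:
            issues.append("unvalidated")
        if not self.tracked:
            issues.append("untracked")
        if self.secret and not self.encrypted:
            issues.append("unprotected secret")
        return issues


# Invented sample items for illustration only.
items = [
    ConfigItem("db.url", "jdbc:postgresql://db.example/app", validated=True, tracked=True),
    ConfigItem("api.key", "abc123", validated=True, tracked=True, secret=True),
    ConfigItem("heap.size", "2g"),
]

# Only items with at least one open problem end up in the report.
report = {item.key: item.problems() for item in items if item.problems()}
```

With the sample data, `report` lists the API key as an unprotected secret and the heap size as unvalidated and untracked, while the database URL does not appear at all.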

What is configuration data made of?

ALL SORTS OF CONFIGURATION DATA ITEMS

Firstly, it's made up of a variety of things from across the application estate. Here are a few examples of what configuration data contains:

  • API keys
  • Passwords and usernames
  • Feature toggles
  • Load balancing method
  • Database connections
  • Java heap size
  • Geo regions
  • Host names
  • IP addresses
  • Cluster settings
  • Sudo user lists
  • Patching level
  • Firewall settings
  • URLs
  • Certificates
  • Versions
  • Build numbers
  • Hot fixes

Why should it be stored centrally?

There are lots of reasons why you want to store configuration data in one central repository. Some of these reasons are: 

  • Segregation of duty
  • Separation of roles
  • Ability to track and audit
  • Regulatory requirements
  • Structure and simplification
  • Access controls
  • On demand delivery 
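
Two of the reasons above, access controls and segregation of duty, can be sketched in a few lines. The roles, actions and function names below are assumptions made up for the example. A real repository would take them from its own identity and approval system.

```python
# Hypothetical roles and the actions each may perform on configuration data.
PERMISSIONS = {
    "developer": {"read", "propose"},
    "release_manager": {"read", "approve"},
    "auditor": {"read"},
}


def can(role, action):
    """Return True if the role is allowed to perform the action."""
    return action in PERMISSIONS.get(role, set())


def may_approve(proposer, approver, approver_role):
    """Segregation of duty: approvers need the right role and
    may not approve a change they proposed themselves."""
    if not can(approver_role, "approve"):
        return False
    return approver != proposer
```

For instance, `may_approve("alice", "bob", "release_manager")` passes, whereas the same call with "alice" as both proposer and approver does not.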

 

A CENTRALIZED REPOSITORY

As you can see, there are some key benefits to centralizing your configuration data. A major (and pretty simple) reason is that this data has numerous owners, editors and users. Typically, all of these roles impact each other and don’t even know it. Furthermore, when teams depend on each other, a mistake made by one team is felt by another.

Before we go any further, let's presume that your configuration data is typically:

  • Semi-centralized
  • Inconsistently stored
  • Randomly edited and accessed
  • Full of hidden secrets which aren’t 100% secure

Clearly this isn't ideal, and the following explores three basic elements of what configuration data is and how it behaves.

GETTING VISIBILITY OF THE INFRA STACK

DevOps engineers own configuration data for a particular area (infrastructure, for example), so they want full visibility over an infra stack. That is pretty key, especially when something goes wrong and engineers need to know what changed, when it changed and who changed it. Clearly that's business critical. But interestingly, it’s also incredibly important for engineers to have access to the teams that rely on them or that they impact.
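One way to picture "what changed, when it changed and who changed it" is a store that writes a record for every accepted change. This is only a sketch under assumed names (`AuditedConfig`, `ChangeRecord`, the team names and the `firewall.port` key are all hypothetical), not a description of how any specific tool does it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ChangeRecord:
    key: str
    old_value: Optional[str]  # None when the key is set for the first time
    new_value: str
    changed_by: str
    changed_at: datetime


class AuditedConfig:
    """A small key-value store that logs every change it accepts."""

    def __init__(self):
        self._values = {}
        self.history = []

    def set(self, key, value, changed_by):
        old = self._values.get(key)
        if old == value:
            return  # nothing changed, so nothing to record
        self._values[key] = value
        self.history.append(
            ChangeRecord(key, old, value, changed_by, datetime.now(timezone.utc))
        )

    def changes_for(self, key):
        """Return the recorded changes for one key, oldest first."""
        return [record for record in self.history if record.key == key]


config = AuditedConfig()
config.set("firewall.port", "8080", changed_by="infra-team")
config.set("firewall.port", "8443", changed_by="security-team")
```

After these two calls, `config.changes_for("firewall.port")` holds both changes in order, each with its previous value, its author and a UTC timestamp.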

Too often, engineers don't know the impact of their changes, and that represents a huge risk to business-critical services. Those services are vulnerable to issues because one team isn’t aware of who it affects when it decides to upgrade, deploy or remove something.

THE IMPACT OF CHANGE

Without a doubt, this happens all the time. Owners of infrastructure don’t have full sight of the applications or releases that are being run against it, especially in accelerated development models where time, quality and cost are all big rivals for attention. What's more, it's incredibly hard in real-life situations for engineers to answer a simple question:

“If I change this item, who is impacted?”

Oftentimes, engineers simply have to try, cross their fingers and wait for the phone to ring. Sometimes it rings and sometimes it doesn't. And when it doesn't, that's just as bad as when it does!
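If teams declared which items they consume, the question "If I change this item, who is impacted?" could be answered with a simple lookup instead of a phone call. The registry below is a minimal sketch. Its keys and team names are invented, and keeping such a registry up to date is assumed rather than shown.

```python
# Hypothetical registry: which teams have declared that they consume which items.
consumers = {
    "db.host": {"payments-team", "reporting-team"},
    "lb.method": {"platform-team"},
    "tls.certificate": {"payments-team", "platform-team", "web-team"},
}


def who_is_impacted(key, changed_by=None):
    """Return the teams that consume a key, leaving out the team making the change."""
    teams = consumers.get(key, set())
    return sorted(teams - {changed_by})


impacted = who_is_impacted("tls.certificate", changed_by="platform-team")
```

In this example the platform team would learn that the payments and web teams need to hear about the certificate change before it goes ahead.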

CONFIGURATION DATA DEPENDENCIES

Firstly, imagine that you are an application owner running upgrades that change what a microservice talks to. Or you are running a feature which is dependent on another feature. And that feature is dependent on a database which, in turn, is talking to an endpoint. That's a fairly simple requirement, but it's an intricate set of dependencies to manage.
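The chain described here (a feature that depends on another feature, which depends on a database, which talks to an endpoint) can be modelled as a small graph. The sketch below walks that graph breadth-first to find everything affected by a change, directly or indirectly. The item names are placeholders for the ones in the paragraph above.

```python
from collections import deque

# Hypothetical dependency chain: each item lists what it depends on.
depends_on = {
    "feature-a": ["feature-b"],
    "feature-b": ["database"],
    "database": ["endpoint"],
    "endpoint": [],
}


def affected_by(changed, depends_on):
    """Return everything that depends, directly or indirectly, on `changed`."""
    # Invert the graph so we can look up: item -> things that depend on it.
    dependents = {}
    for item, requirements in depends_on.items():
        for requirement in requirements:
            dependents.setdefault(requirement, []).append(item)

    affected = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                affected.append(dependent)
                queue.append(dependent)
    return affected
```

Calling `affected_by("endpoint", depends_on)` walks back up the chain through the database and both features, whereas a change to `feature-a` affects nothing else in this toy graph.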

Secondly, imagine that you only own a part of that upgrade. You have two, three and sometimes four or more teams upgrading four or more components, all on infrastructure which is co-dependent. With that in mind, if you get a single part wrong, it will be a catastrophe (or at the very least a headache) for someone to deal with. During office hours that's a pain but, more often than not, it happens out of hours. Because that’s Murphy’s Law.

GAINING SINGULAR INSIGHT

Last but not least, teams need a single view of all settings and parameters, including who is working with them and in what area. This shows when things are upgraded, changed or removed, and it means those changes are made with 100% clarity and vision. Above all, teams with conflicting requests must be alerted BEFORE a change happens. If teams disagree before a release kicks off, that's great. Absolutely let them collaborate, resolve the conflict and move forwards (and save them from the blame game).
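Spotting conflicting requests before a change happens can be as simple as grouping pending requests by key and checking whether teams are asking for different values. The request format, team names and values below are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical change requests queued for the same release.
requests = [
    {"team": "payments", "key": "heap.size", "value": "2g"},
    {"team": "platform", "key": "heap.size", "value": "4g"},
    {"team": "web", "key": "lb.method", "value": "round_robin"},
    {"team": "platform", "key": "lb.method", "value": "round_robin"},
]


def find_conflicts(requests):
    """Return the keys for which teams have requested different values."""
    wanted = defaultdict(dict)
    for request in requests:
        wanted[request["key"]][request["team"]] = request["value"]
    return {
        key: by_team
        for key, by_team in wanted.items()
        if len(set(by_team.values())) > 1
    }


conflicts = find_conflicts(requests)
```

Here only `heap.size` is flagged, because the two teams asking for `lb.method` want the same value. The flagged teams can then sort out the disagreement before the release kicks off.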

Now that DevSecOps has so many moving parts, configuration data must be stored centrally and organized around access requirements, secrets and collaboration structures. This keeps your configuration data estate in a sensible, pragmatic state. And, above all, it gives your DevOps teams a chance.

Ask The Expert

Why configuration data matters for security, risk and compliance

Catch up with Sweagle's UKI Sales Director, Tim Reynolds, to find out why the correct management of configuration files is so vital to roles such as Head of Security and Head of Cyber Security.