Digital solutions leader Tim Reed shares his RAAM framework to help keep your hardware, software and infrastructure up and running.
Within today’s world of digital solutions that deliver impressive new capabilities, there remains the less glamorous part of IT we all know, namely ‘keeping the lights on’.
Keeping the Lights On (KTLO) refers to the services that information technology teams provide to meet daily IT operational and performance requirements. Over the last year, more IT organizations seem to be taking steps to recognize exactly what is happening in the background, and acknowledging what it takes to keep the lights on.
As a career technology leader, I’ve experienced production issues and the struggles to get the core of IT support hardened, discovering first-hand what an IT team member had ‘MacGyvered’. Digital DevOps or DevSecOps can be a panacea, but this needs to be coupled with steps to address aging hardware, software and infrastructure.
I have developed a simple four-part framework—Recognize, Assess, Act and Monitor, or ‘RAAM’—that has proven very helpful for minimizing disruptive outages. This allows me and my team to focus on creating more value for the business through technology.
How many times have we faced a production issue that is impacting the business and our customers, and we have no clue what is going on? When faced with this challenge, the immediate reaction is to remediate the issue, then take steps to permanently remove the problem and prevent it from happening again. These are good first steps, but organizations should also make the effort to understand how and why the issue appeared in the first place.
Recently, I witnessed a root cause analysis (RCA) that identified a gap in monitoring. New monitoring was installed, remediating the issue and preventing further similar outages, but the RCA process itself fell short. It did not recognize that internal processes had led to sign-off on faulty architectures with gaps in monitoring and key IT controls. Multiple outages in the same technology stack made it evident that the IT leadership had not addressed the lack of architectural expertise and due diligence.
When was the last time you participated in a technical debt assessment? Without current assessment information, you may approve a minimum viable product that takes shortcuts to rapidly deliver a solution on bad infrastructure. A good assessment can reverse the trend and provide the facts about the current state of the technical stack that is supporting legacy IT solutions as well as new digital products. Knowing the facts about aging hardware, unsupported software and toolsets that are at end of life can be a great help in addressing current gaps and avoiding added risk and future crises.
A recent technical debt assessment I participated in identified that over 50% of the technical stack was at end of life. Hardware upgrades and software patches had not been kept current or funded for several years. The assessment identified that the IT organization had not been diligent and had raised the risk profile of the company significantly.
Even more of an eye-opener was that a thorough technical debt assessment had not been performed in over 10 years. The infrastructure was crumbling and no one had taken the time to assess what was really going on.
My advice is this: if you have not done so recently, perform a formal technical debt assessment. Spend the time to assess the current state—the hardware, software, toolsets, and the skills on the team that is supporting your current IT environment. Doing so will pay off in multiple ways. Not only will you and your team have full knowledge of your current state, but with assessment in hand, you will be in a far better position to seek sponsorship and funding to fix critical problems.
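One small piece of such an assessment, working out what share of the stack is already at end of life, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Component` record, the `end_of_life_share` function, and every inventory entry below are hypothetical, and a real assessment would draw on your own asset inventory and vendor support dates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Component:
    """One item in the technical stack (hypothetical record layout)."""
    name: str
    category: str      # e.g. "hardware", "software" or "toolset"
    end_of_life: date  # assumed vendor end-of-support date


def end_of_life_share(inventory, as_of):
    """Return the fraction of components whose end-of-life date is on or before as_of."""
    if not inventory:
        return 0.0
    expired = [c for c in inventory if c.end_of_life <= as_of]
    return len(expired) / len(inventory)


# Made-up inventory, for illustration only.
inventory = [
    Component("db-server-01", "hardware", date(2019, 6, 30)),
    Component("legacy-os", "software", date(2020, 1, 14)),
    Component("build-tool", "toolset", date(2030, 12, 31)),
    Component("app-runtime", "software", date(2031, 5, 1)),
]

share = end_of_life_share(inventory, as_of=date(2024, 1, 1))
print(f"{share:.0%} of the stack is at end of life")
```

A single figure like this is no substitute for the full assessment, but it gives sponsors a concrete number to react to.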
Few companies like to spend money to maintain old technology. IT portfolio and funding reviews tend to focus on new solutions and capabilities, but companies must consider investing in what it takes to keep the lights on. Flip the dialog: look for those investments that will benefit and strengthen the business model.
As IT leaders, we need to act, explaining to our business partners how critical the infrastructure and toolsets are, and what will happen if they are not maintained. Take the time to find out what your team is doing to keep the lights on each day, recognize their effort and reward the individual or team, but also take action to solve root problems. The great opportunity here is that your actions to strengthen the core infrastructure and toolsets provide the runway to bring in even more new capabilities, growing the value of IT overall.
Ensuring there are processes and controls in place to continually monitor the IT infrastructure and address issues will pay off handsomely. Monitoring helps to avoid the risk of falling into the abyss of technical debt, with an IT team that grows more exhausted and disheartened with each passing day. Monitoring can be as straightforward as conducting a semi-annual review. Most portfolio processes today recognize ‘legal and regulatory’ placeholders for annual funding because they are monitored, often by organizations that are outside of IT. IT organizations have the opportunity to set up similar self-monitoring to identify what is needed to ‘keep the lights on’.
When monitoring identifies that something in your technology stack has been MacGyvered, fund and fix the problem correctly and permanently. Over time, this commitment enables you to focus your team members on new digital solutions with higher business impact, which is what most of them crave.
Dare I say it? Keeping the lights on in today’s digital environment can be fun! Using the RAAM approach, IT leaders can drive towards strong IT solutions, products that leverage a finely tuned architecture and tools that monitor and prevent outages that negatively impact the business and its customers.