Andrew Shelley is the Global Account Director at Tectrade. Here he discusses the danger of IT outages.
Forget cyber-attacks – simple IT outages could be your downfall
The commercial landscape has always been filled with obstacles. However, as technology becomes ever more central to how businesses function, a company’s digital health is quickly becoming a top priority for business leaders. Cyberattacks are regularly touted as the biggest threat to businesses everywhere, and they no doubt give journalists a scandalous story and readers an antagonist – and who doesn’t love a good villain? Cyberattacks are of course a threat to any organisation, but the prevalence of IT outages and failures in the financial sector is reaching near-epidemic proportions. For instance, the top four British banks experienced 64 outages in the second quarter of 2018 alone. Additionally, a separate report published by the FCA found that security issues were responsible for just 119 of 646 operational incidents at financial firms over the course of a year, highlighting the far higher frequency of more mundane threats such as simple IT failures.
Although cybersecurity is an undeniably vital aspect of the IT function, the frequency of outages calls for a rebalancing of priorities: specifically, a focus on digital health with the aim of building more resilience into IT infrastructure. TSB’s well-known migration meltdown should be a cautionary tale to other financial institutions, as the two-week-long outage in 2018 cost the bank £330 million, 12,500 customers and one CEO. This doesn’t just boil down to revenue loss either – an organisation’s reputation is also on the line. Although it is far harder to quantify, incidents like these can cause long-lasting damage to a brand and erode customer trust, and in an industry as critical as financial services the consequences can be even more severe. The reputational damage has the potential to be far worse when you let down your customers as a result of digital mismanagement or short-sightedness than when you are the target of malicious behaviour.
Whilst there may not be a shared cause behind these outages, what they do have in common is that the companies affected all made mistakes: overestimating the capabilities of their IT systems, relying on too many legacy systems or simply not keeping their system health in check. All of this highlights the need for financial institutions to give greater priority to routine IT operations, and specifically to focus on what they can do to minimise downtime.
For many organisations, events like WannaCry live long in the memory and have informed company policy. While prioritising intrusion protection may guard against cyberattacks, the most reliable way to minimise the effects of an outage is to have effective back-up and data recovery protocols in place to revive any systems as quickly as possible. While no method can be truly infallible, a zero-day recovery approach is the best option for a quick recovery with minimal damage. As TSB has shown, it isn’t the outage itself that causes the most severe damage; the length of the downtime is the determining factor, especially when you consider Gartner’s estimate that outages may cost approximately $300k per hour.
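To put that hourly figure in perspective, here is a rough back-of-the-envelope sketch in Python. It simply multiplies the approximate $300k-per-hour estimate quoted above by the length of an outage. The function name and the sample durations are illustrative assumptions, and real costs will vary widely from one organisation to another.

```python
# Approximate hourly cost of an outage, as quoted in the article.
# Treat it as a rough average, not a figure for any particular firm.
APPROX_COST_PER_HOUR_USD = 300_000


def estimated_outage_cost(downtime_hours, cost_per_hour=APPROX_COST_PER_HOUR_USD):
    """Return a rough cost estimate for an outage of the given length."""
    if downtime_hours < 0:
        raise ValueError("downtime_hours must not be negative")
    return downtime_hours * cost_per_hour


if __name__ == "__main__":
    # Illustrative durations only: one hour, a working day, two full days.
    for hours in (1, 8, 48):
        cost = estimated_outage_cost(hours)
        print(f"{hours:>2} hours of downtime: roughly ${cost:,.0f}")
```

Even on this crude estimate, the cost grows in direct proportion to the length of the outage, which is why shortening recovery time matters more than the outage count alone.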
Following the high-profile outages outlined above, pressure has been growing on the financial services industry to improve its response to security incidents and unplanned downtime. In fact, the Financial Conduct Authority last August proposed new rules limiting the maximum downtime of systems to two days, with businesses that exceed it risking a hefty fine. Customers have also become less patient and, consequently, firms need to be wary of the wave of compensatory lawsuits that may follow lengthy periods of downtime. To raise the stakes further, the expectations of the modern consumer will more than likely see this time-frame shortened.
A zero-day recovery architecture allows administrators to quickly bring working code or data back into operation in the event of an outage or system breach, without having to worry about whether the workload has been compromised. An evolution of the 3-2-1 backup rule (three copies of your data, stored on two different media, with one backup kept offsite), zero-day recovery enables an IT department to partner with the cyber team and create a set of policies that define what should happen to data backups stored offsite, usually in the cloud. These policies assign an appropriate storage cost, and therefore recovery time, to each workload according to its strategic value to the business – which in essence means data sets can be prioritised. It could, for example, mean that a particular workload needs to be brought back within 20 minutes while another, less critical, workload can wait a couple of days.
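As a minimal sketch of how such a policy might be expressed, the Python below checks a set of backup copies against the 3-2-1 rule and orders workloads by how quickly each must be recovered. Every name here (BackupCopy, Workload, satisfies_3_2_1, recovery_order, and the sample workloads) is hypothetical and chosen for illustration; it does not describe any particular vendor's product or a real recovery tool.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BackupCopy:
    medium: str    # for example "disk", "tape" or "cloud" (illustrative values)
    offsite: bool  # True if the copy is kept away from the primary site


def satisfies_3_2_1(copies):
    """Check the 3-2-1 rule: three copies, two different media, one offsite."""
    distinct_media = {copy.medium for copy in copies}
    has_offsite = any(copy.offsite for copy in copies)
    return len(copies) >= 3 and len(distinct_media) >= 2 and has_offsite


@dataclass(frozen=True)
class Workload:
    name: str
    recovery_target: timedelta  # how soon the business needs it back


def recovery_order(workloads):
    """Return workloads sorted so the tightest recovery target comes first."""
    return sorted(workloads, key=lambda workload: workload.recovery_target)


if __name__ == "__main__":
    copies = [
        BackupCopy(medium="disk", offsite=False),
        BackupCopy(medium="tape", offsite=False),
        BackupCopy(medium="cloud", offsite=True),
    ]
    print("3-2-1 rule satisfied:", satisfies_3_2_1(copies))

    # Hypothetical workloads mirroring the 20-minute and two-day examples.
    workloads = [
        Workload("reporting-archive", timedelta(days=2)),
        Workload("payments", timedelta(minutes=20)),
    ]
    for workload in recovery_order(workloads):
        print(f"restore {workload.name} within {workload.recovery_target}")
```

In practice the recovery target for each workload would be agreed between the IT and cyber teams and weighed against storage cost; the sketch only shows the prioritisation step.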
Financial institutions need to learn lessons from past failings and aim towards a zero-downtime model. If they don’t, they risk damaging their brand and losing customers, revenue and potentially the whole business. This is all the more pressing for the traditional mainstay banks of the UK, with digitally focussed challenger banks, like Monzo and Starling, waiting to swallow market share with a much better digital track record. All banks must prioritise the minimisation of downtime by having common-sense policies for patch management, full knowledge of their systems gained through disaster testing, and a recovery strategy in place that enables them to get back online as quickly as possible.