Anyone can understand what ‘big data’ means in theory. But amid the think pieces and whitepapers, it can be easy to forget quite how gargantuan ‘big’ truly is. Consider a bank like HSBC. In 2020, it was reported that this single financial institution had a digital footprint worth 240 petabytes. One petabyte, any tech nerd will happily tell you, is worth 1,000 terabytes. But that still doesn’t really make the figure any more comprehensible, so perhaps a more practical analogy might help. According to one common standard, a petabyte is the equivalent of 20 million filing cabinets filled with data. Now take this army of metal and wood: then multiply it by 240. That’s 4.8 billion – or roughly the number of virtual filing cabinets HSBC has stored on servers and in the cloud.

The truth, of course, is that even this physical image is difficult to parse, and that’s precisely the point. Modern banks deal with psychedelic, stupendous quantities of information, quantities that would have melted the brains of even technical wizards just 50 years ago. Simply attempting to fathom it – the customer addresses, the credit ratings, the commodities sales, the first-pet passcodes, the mortgage certificates, the ‘your call matters to us’ voice memos – is liable to leave you curled up raving about jackbooted hordes of office furniture. Yet someone must try all the same, for if modern banking data is bewildering, it also has the power to change the sector forever.

Yusuf Demiral, head of data analytics and customer relationship management, wealth and personal banking, HSBC.

Yusuf Demiral knows these tensions better than most. As the wealth and personal banking head of data analytics and customer relationship management at HSBC, he must grapple with those 4.8 billion filing cabinets for real, all while attempting to derive meaning from a mire of numbers and figures and stats. Perhaps unsurprisingly, Demiral sees his task as one “where science meets art,” requiring what he calls an “iterative process” to separate mere ‘noise’ from genuinely useful insights. As this comment implies, moreover, getting to grips with so much information involves more than pure maths. Rather, it obliges Demiral and his team to carefully manage wildly different data sets – and consider what exactly they want their information to tell them. It goes without saying that none of this is easy. But from client budgeting to call centres, it’s transforming how HSBC does business, with revolutionary consequences for the sector more widely.

Going big

Though HSBC still has a significant footprint of physical branches, it is also investing heavily in digital offerings. Image: ADRIAN3388/

Talking to Demiral, it’s sometimes hard to know where HSBC’s snowdrift of data truly began. Did it grow out of new financial services like credit cards or ATMs? Or were more recent digital technologies the sneeze that began the avalanche? The answer is probably a mixture of the two: but what can’t be in doubt is just how much information the bank now has. “Understandably, we have access to a plethora of data,” is how Demiral laconically puts it, with that headline 240 petabyte figure merely the start. That’s true, for instance, of client demographic data, or else their transaction histories. Individual clickstreams – the specific series of links that a customer clicks on – are a major dataset too. Nor should financial transactions be forgotten; HSBC processes around two trillion of them each year. It goes without saying, meanwhile, that all this information needs to be stored somewhere, and once again, the scale of HSBC’s operation can be hard to grasp. Encompassing data centres across 21 countries, the bank overall maintains 94,000 distinct servers.

Combine this with the rising power of the cloud – which last year represented 27% of the bank’s workflows – and Demiral’s use of “plethora” feels apt. Yet if the scope of HSBC’s data is all-encompassing, it’s ultimately not there to sound impressive. “Each data point acts like a signal,” Demiral emphasises. “I like to think of them as a mine that needs to be carefully refined and thoughtfully converted into useful insights; only then can we use data for commercial purposes, for a value exchange with our customers.” A case in point, the executive continues, involves payment data. By analysing how clients spend their money, for instance, HSBC can offer practical tips on how to manage it, all while cutting frustrations like managing passwords.

For that to happen, however, HSBC’s data analytics team must first deal with all that ‘noise’ – for a bank as large as Demiral’s, a veritable orchestra of sound. In part, his job is made easier thanks to clever data technology developments, as well as the fact that different datasets can be played off against each other. At the same time, HSBC staff are obliged to think carefully about what exactly they want their data to tell them. Once again, payment information offers a good example here. A personal financial management tool, the HSBC HK app uses data from the bank’s Hong Kong customers to help them budget. With that in mind, the app carefully separates everyday expenses from personal investment transfers, making it easier for users to appreciate where their money is going. In a similar vein, the system also categorises costs by type – dining is one, shopping is another – ensuring clients can understand exactly how much their lunches are setting them back.
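
For readers who want to see what that kind of sorting might look like in practice, here is a minimal sketch in Python. It is purely illustrative: the keyword rules, category names, the `Transaction` type and the sample figures are all invented for this example, and nothing here is drawn from how the HSBC HK app actually works. It simply keeps investment transfers apart from everyday spending, then totals the everyday spending by type.

```python
from collections import defaultdict
from dataclasses import dataclass

# Everything below is hypothetical: the categories, keyword rules and
# sample data are illustrative assumptions, not the HSBC HK app's logic.
INVESTMENT = "investment transfer"

KEYWORD_TO_CATEGORY = {
    "restaurant": "dining",
    "cafe": "dining",
    "store": "shopping",
    "brokerage": INVESTMENT,
}


@dataclass(frozen=True)
class Transaction:
    description: str
    amount: float  # money leaving the account


def categorise(tx):
    """Return the first category whose keyword appears in the description."""
    text = tx.description.lower()
    for keyword, category in KEYWORD_TO_CATEGORY.items():
        if keyword in text:
            return category
    return "other"


def summarise(transactions):
    """Total everyday spending by category, keeping investments separate."""
    everyday = defaultdict(float)
    invested = 0.0
    for tx in transactions:
        category = categorise(tx)
        if category == INVESTMENT:
            invested += tx.amount
        else:
            everyday[category] += tx.amount
    return dict(everyday), invested


sample = [
    Transaction("Harbour Cafe", 68.0),
    Transaction("Noodle Restaurant", 112.5),
    Transaction("Department Store", 450.0),
    Transaction("Brokerage account top-up", 5000.0),
]
everyday, invested = summarise(sample)
print(everyday)
print(invested)
```

A real system would presumably lean on far richer signals than keywords in a description, but the principle is the same: decide first what question the data should answer (here, "what did lunch cost me?"), then filter out whatever would muddy it.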

Positive chains

If customers in Hong Kong are dramatically exploiting their data – the service now has 1.8 million users – they’re far from alone. Teaming up with Google, for example, HSBC has set up an ‘Automated Quality Monitoring’ (AQM) system to track the bank’s call centres. Leaning on a sophisticated AI solution, it checks that HSBC staff explain the terms and conditions of particular products correctly.

Traditionally, Demiral notes, this process has been manual, with overworked personnel obliged to listen in on certain conversations. But by relying on a tireless robotic alternative, Demiral says the bank’s been able to expand coverage to 100% of calls, along the way eliminating “human effort” and errors. This approach obviously bolsters the average punter’s user experience: the moment the machine spots an error, a call agent can immediately contact them to resolve the problem. But as Demiral’s reference to “human effort” implies, the thoughtful use of data can also make life easier inside HSBC.

For one thing, the AI’s vigilance means that muddled staff can immediately be coached on the correct terminology. For another, it saves HSBC from spot checking calls at random. That’s of a piece, Demiral argues, with how data can prove useful to staff more widely. As he puts it: “It’s a chain of positive events – in simple terms: better customer experience and services makes more happy customers, which equals better commercial outcomes, as the bank gains trust and positive recommendations from customers.” In a broader sense, Demiral suggests removing manual drudgery can promote staff morale – perhaps an obvious suggestion to anyone who’s listened to the average customer service call.

Not that data can – or should – sweep all before it. As always when it comes to customer information, and especially sensitive financial data, HSBC must ensure that it’s employed ethically. “Banking,” Demiral stresses, “is the business of trust.” Given this, it’s no wonder that the bank has developed a detailed, publicly available guide, explaining precisely how its billions of filing cabinets can be used. That’s shadowed by more practical work. In 2020, for instance, HSBC liaised with the Monetary Authority of Singapore to understand how various statistical methods can lead to implicit bias. That’s echoed by a partnership with the Alan Turing Institute, which among other things pondered how the bank could be open about its data while also keeping it secure.

Generation generative

Demiral is clearly proud of this work – as he says, HSBC was among the first banks to publicly state its ethical principles towards big data and AI. But if it’s not the only financial institution to suffer here, it’s also true that HSBC’s record of data protection is far from perfect. In late 2018, staff admitted that the private information of some US customers had been compromised. More recently, bankers at HSBC, alongside colleagues at Goldman Sachs and JPMorgan, have been accused of using WhatsApp inappropriately.

To put it simply, American officials are worried that WhatsApp’s end-to-end encryption could make it difficult to recover pertinent messages in the event of insider trading or fraud. At the time of writing, HSBC was in the process of finalising a settlement with regulators – and given penalties across the sector have so far hit $2bn, any deal is unlikely to be cheap.

These challenges aside, however, Demiral seems buoyant about how data can transform life on both sides of HSBC’s threshold. “I believe the bank of the future will embrace the best of technology and humans coming together for better service and experiences – the perfect combination of face-to-face services combined with data intelligence fuelled digital interactions.” By way of example, Demiral highlights generative AI. A fair point: typified by tools like ChatGPT, it could soon be training automated models everywhere from stress-testing to fraud detection, with Demiral arguing it has the potential to support HSBC front-end staff and offer “hyperpersonalisation” to their customers. All the while, that horde of virtual filing cabinets just grows and grows and grows.