“Lack of skilled workers is biggest big data challenge”

by Darinia Khongwir    Feb 12, 2013

Tony Young
The market for big data solutions in India is expected to touch $153.1 million by 2014, growing at a CAGR of 37.8 percent over 2011-2014, according to IDC, with IT services and software contributing the major chunk of the overall market. Looking at the growth and adoption of big data from an enterprise perspective in India, the study found that, with information and data management becoming the most important goal of Indian enterprises, 40 percent of organizations already use over 100 TB of data. The key verticals in this area include BFSI, media and entertainment, telecommunications, and government. CXOToday caught up with Tony Young, CIO, Informatica, to understand how big data impacts CIOs.

How do CIOs perceive big data? Are they taking a holistic view?
The vendor community is shaping the definition of big data based on its own agenda, so it varies greatly from person to person. For some, big data is Hadoop, and if you ask the Internet-based companies, they will have their own paradigm of what it is.
But if you take a step back and pull back the lens, Informatica takes an all-encompassing view, just as many industry analysts do. Big data is a confluence of three distinct things. The first is big transaction data. We know that transaction volumes are growing enormously, so your ERPs are growing too. People are also putting specialized appliances around them, such as Greenplum and DATAllegro. For some it is SAP HANA.
Then we have the world of big data around social and mobile device data. These are just some of the things that are generating big volumes of data, the traditional examples being Twitter, Facebook, etc. But, depending on the industry, a lot of people will also point to genomic data or device-sensor data. All of these create vast volumes of data, and below that we have Hadoop, which is an open source way of storing and managing this type of information.

What challenges does big data pose to CIOs, and how should they deal with them?
I think that the challenge for a lot of CIOs is the ability to access this information: how do you correlate it, relate it and process it? This is where big data comes in. So as a CIO, that is the challenge ahead.
I think the issue that a lot of us have as CIOs is that we have come to realize that if we don’t do something now, if we don’t start figuring out how to deal with all this data, our successors will. It will mean the end of my job. But it is not just about keeping my job; it is also about the importance data has for enterprises and how we compete as a company. This is one of the rare cases in IT where you can create a sustainable competitive edge. Implementing a CRM system does not in itself create a sustainable competitive edge. But if I can use data in unique ways that my competition can’t, I can create a sustainable competitive edge. I can create markets that my competition cannot. I can operate at a different velocity than my competitors.
So, there is no generic big data solution that we all buy. Instead, it coalesces around industry. For example, I worked for a large semiconductor manufacturing company. They had six different big data pilots going on. Even in ideal conditions, they would test their chips. Think of all the data that gets created from the test equipment. What they are trying to do is analyse all of this test data from the equipment so they can reduce the error rates in the chips. If there are fewer errors, it means a lot more money. And they know that if it increases their yield and profitability, they can charge less and squeeze out the competition. In the end, it creates new business opportunities. That is why CIOs are feeling very challenged and are trying to understand what the right thing to do is.

How can organizations maximize their return on data by leveraging technology innovations to increase the business value of data, while also deploying smarter ways to lower the cost of managing data?
I worked for 10 years at HP. The data landscape at HP is called the ‘spaghetti diagram’. It is all interconnected. It is the same for most IT companies. If I make a change to any part of the system, I have to trace all the data and see where it goes. In some places the language used is PLC code, in some Java, and in others different languages again. Most people go WOW! This is what IT enterprises look like. But this is just a small picture. In reality, it is massive and ugly. So I invariably get asked about the challenges for IT spending. I spend about 20 percent of my budget on innovation and 80 percent on run rate, keeping the lights on. If you cut that in half, there are the guys who do apps and the guys who do ops. Just managing the interfaces takes over 40 percent of the overheads on my business. So every day when someone comes to work, if I make a change, I have to fix the interface and I have to coordinate. It is a mess in most companies.
So, if you want to compete on big data, you will have a major issue. The first thing to do if you want to save money is to reduce labour, software and consultant costs. So how do we do that? We take those apps and look at them as an integrated, top-to-bottom set of links. Here is where Informatica comes in: we connect everything and call it integration. And the way we connect is that we don’t write code anymore. We just point and click: we want this to go here and this to go there. It is all automated. So now, all of a sudden, we’ve taken that 40 percent and made your business go faster. Every time you want to make a change in the future, it will be faster as well.

How should businesses ensure they have quality data at hand?
If I want to buy a company, I will bring their systems here, integrate them with ours and then kill the old systems. It could be cloud, big data or Hadoop, SAP, EMC Greenplum, Oracle, Salesforce.com – whatever it is – we connect it and we automate it. But the problem is that we also have to apply data quality, because when I give you data, it has to be authoritative, trustworthy, timely, relevant and actionable. Most CEOs will ask: ‘I want to understand, if I buy this company, what customers do they have and what are ours? What is similar and dissimilar? Because we want to up-sell and cross-sell depending on the information received.’ At Informatica, we can have that kind of information ready for the CEO in a matter of hours. This data is actionable, timely and relevant. We not only fix the data by ensuring quality, we also remaster it. So if you have customer data coming from a website, from marketing, sales and ERP, who is the customer? I don’t know, because we all touch it somehow. How do we know to treat you as a premium customer because you have bought over $100 million worth of products from us, and not $10 million, if I don’t know who you are? So we come in and give a single view of the customer and a single view of the product. This is what creates the return for the customer. So we can reduce the cost while increasing the return, whether the data comes from big data, cloud or social sources like Facebook, LinkedIn and Twitter. That’s how we work.

What kind of challenges will big data pose for enterprises in the future?
The biggest challenge big data will pose for every CIO, and the one they will all talk about, is the shortage of skilled workers. It is just like the semiconductor industry, where there are thousands of programmers but merely five fully skilled personnel who can do the job. It will be the same for big data. For 2013, that is going to be a huge challenge for CIOs, but it will be a huge opportunity for Informatica.