Are you ready for AI?

Introduction

Data analytics is changing our world in unprecedented ways, driving new products, business models, and breakthroughs in every industry. Machine learning helps companies act strategically on the data they have, home in on the insights that matter, and give executives what they need to make better-informed decisions. One recent study found that companies taking this kind of data-centric approach were 26 percent more profitable than their industry competitors, generated 9 percent more revenue through their employees and physical assets, and enjoyed 12 percent higher market valuation ratios.

However, data at its core is not organized or clean, and it certainly doesn’t come already structured in a database. It’s a messy swamp of details about what we buy, where we drive, what we browse online, and what we “like”. It’s transactions, chats on social networks, cellphone conversations, tweets, texts, photos, video, web searches and browsing patterns. Every minute of every day, each of us is generating data.

A hyper-focused data science approach can give companies of any size a competitive advantage, but unfortunately, data alone has no intrinsic value. You have to make sense of it before you can uncover the magic behind the curtain. Recognizing that is the first step; working out what to do next can be confusing.

Building a solid data foundation gives you an environment where data-centric decisions can be made, but do not start a data science project unless you know why you’re doing it and what success should look like. Think very hard about the goals. Are you looking to increase conversion rates? Marketing ROI? Market share? Customer lifetime value? Measure where you are now, where you think you could be, and how much revenue the difference translates into. As a wise man once said, “if you can’t measure it, you can’t manage it.”
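To make "how much revenue that translates into" concrete, here is a minimal Python sketch for a conversion-rate goal. The function name, the parameters and all of the numbers are hypothetical and only illustrate the arithmetic; plug in your own measurements.

```python
def revenue_uplift(visitors, current_rate, target_rate, revenue_per_conversion):
    """Extra revenue per period if the conversion rate moves from current to target."""
    current_revenue = visitors * current_rate * revenue_per_conversion
    target_revenue = visitors * target_rate * revenue_per_conversion
    return target_revenue - current_revenue


# Hypothetical example: 100,000 monthly visitors, conversion rate going
# from 2% to 2.5%, and 40 units of revenue per conversion.
uplift = revenue_uplift(
    visitors=100_000,
    current_rate=0.02,
    target_rate=0.025,
    revenue_per_conversion=40.0,
)
print(f"Estimated monthly uplift: {uplift:,.0f}")
```

A number like this is what you later hold the data science investment against.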

Unfortunately, because of the business lifecycle of most companies, management expects an almost immediate return on new initiatives. With AI/ML, the infrastructure has to be developed before a company can optimize its business or build data products more intelligently. One recipe for disaster is for a startup to hire, as its first data science contributor, someone who specializes only in cutting-edge modelling but has little experience building the strong initial foundation that is the prerequisite for everything else. The messy swamp of data needs to be collected and made into a nice, clean lake. Only then can patterns, insights and predictions begin.

As with most things worth doing, making a data science program effective can take substantial effort and (like everything in tech) will require several iterations before the AI/ML is structured in an effective way. Don’t give up.

The Pyramid

Imagine your company’s data journey as a climb. At the top is true power: predictions that separate you from the competition and give you an advantage. The climb is slippery and full of challenges, but when you do it right, your data insights team will have an easy time coming up with ideas and testing them in production systems.

Clients often read blog posts about AI/ML adoption and come to us with various requests without first questioning the state of their infrastructure. Is your data stored in a centralized place? Can you pull it out easily? Can you visualize it easily? Can you explain trends, anomalies and causality?

Most companies think they are already at the top and only need to apply the same solutions as the competition for their business to flourish. The reality is a bit different. Data is all over the place and not well structured, important data points are missing, the data science team cannot access data easily, there is no way to visualize data or test hypotheses, and the infrastructure cannot support even simple A/B testing.
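For readers who have not run one, a simple A/B test comes down to comparing two conversion rates and asking whether the difference could be noise. The sketch below uses a standard two-proportion z-test with only the Python standard library. The function name and the sample counts are hypothetical; in practice the counts would come from your own tracking data.

```python
import math


def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: variant A converts 200 of 10,000 visitors,
# variant B converts 260 of 10,000.
z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

If the infrastructure cannot even produce these four counts per variant, that is a sign the foundation still needs work.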

The data journey should start at the bottom, in the Collect phase. Most existing systems are old, were built before this information revolution, and are not capable of storing huge amounts of data. Startups, on the other hand, rush to build something presentable and choose common tools (relational databases) which often cannot cope with these amounts of data. Forward-thinking companies are deciding to move to the cloud (for the elasticity to expand their data infrastructure) and build data lakes, where data is stored in a centralized place and is easily accessible to data science teams. Easy access, with all the important data points stored in one place, is a major prerequisite for the Analyse phase, which unlocks the true potential of your data.

The Analyse phase is an interesting one because it brings technical and business people to the same table. The first step is to talk about pain points: what prevents the business from doing its job as efficiently as possible? Do you struggle with analytics? Do you know how many resources a given task needs to be done most efficiently? Where do you plan to be financially next quarter? How will you prevent your customers from churning to the competition? We have created business insights as a service specifically for this phase. We need two things here: the pain points of business owners and a portion of data to analyse. What comes out of this workshop is an idea document listing potential projects that can ease those pain points and that are viable based on the data analysis we perform. We also concentrate on ROI: because estimates and timelines are given, you can calculate the investment and compare it to the potential return.

Last but not least is the Predict phase, where we leverage the true power of the data. We have project ideas with ROI analysis and can decide which one to tackle first. Pick the one that looks best from an ROI perspective, and make sure all the roles are covered in your team. The data scientist who builds the algorithm is just one small part of the team; you also need data engineers to choose the right tools for the job, as well as infrastructure engineers familiar with data problems to integrate the new project with existing infrastructure.
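Choosing the first project by ROI can be as simple as the sketch below. It assumes each idea comes with a cost estimate and an expected annual gain, and it uses the common definition ROI = (gain - cost) / cost. The class, the project names and the figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProjectIdea:
    name: str
    estimated_cost: float
    estimated_annual_gain: float

    @property
    def roi(self):
        # Return on investment as a fraction of the cost (1.0 means 100%).
        return (self.estimated_annual_gain - self.estimated_cost) / self.estimated_cost


def rank_by_roi(ideas):
    """Sort project ideas from highest to lowest estimated ROI."""
    return sorted(ideas, key=lambda idea: idea.roi, reverse=True)


# Hypothetical ideas and estimates.
ideas = [
    ProjectIdea("churn prediction", 50_000, 120_000),
    ProjectIdea("demand forecasting", 80_000, 100_000),
    ProjectIdea("lead scoring", 20_000, 30_000),
]

ranked = rank_by_roi(ideas)
for idea in ranked:
    print(f"{idea.name}: ROI {idea.roi:.0%}")
```

The ranking is only as good as the estimates behind it, so treat it as a starting point for the discussion rather than a verdict.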

The Healthcheck

Our Data Science healthcheck is a short list of questions which you should consider the minimum framework of needs. It covers the basic mindset of anyone thinking about Deep Learning and AI, and it helps answer one simple question: are you data science ready?

Having the right architecture and foundation for your data science functions is as important as having bricks and mortar for a physical office. Why, you ask? Because, not surprisingly, data scientists need data. Many techniques require a minimum of tens of thousands, if not hundreds of thousands or even millions, of data points to build a useful model.

If you are a startup and you have not launched yet, you do not need a full-time data scientist. Without a basic understanding of what drives the organization, it becomes very difficult to make use of modeling techniques. For example, a data scientist can use machine learning to predict which users will churn or become highly active. However, if you don’t have a definition for “churned” or “highly active”, you are setting up the data science program for failure: predictions are difficult to validate when you don't have sufficient metrics to evaluate them against.
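A definition of churn does not need to be sophisticated, but it does need to be written down. As a minimal sketch, assume a user counts as churned after 30 days without activity. The 30-day window, the function names and the sample data are assumptions for illustration only; the right window depends on your product.

```python
from datetime import date, timedelta

# Assumption: a user who has been inactive for more than 30 days has churned.
CHURN_WINDOW_DAYS = 30


def is_churned(last_active, today, window_days=CHURN_WINDOW_DAYS):
    """True if the user's last activity is older than the churn window."""
    return (today - last_active) > timedelta(days=window_days)


def churn_rate(last_active_by_user, today):
    """Fraction of users considered churned as of the given day."""
    if not last_active_by_user:
        return 0.0
    churned = sum(is_churned(d, today) for d in last_active_by_user.values())
    return churned / len(last_active_by_user)


# Hypothetical last-activity dates per user.
last_seen = {
    "user_a": date(2024, 1, 1),
    "user_b": date(2024, 3, 1),
    "user_c": date(2024, 3, 10),
}
today = date(2024, 3, 15)
rate = churn_rate(last_seen, today)
print(f"Churn rate: {rate:.0%}")
```

Once a definition like this exists, a churn prediction has something concrete to be validated against.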

Check out our Healthcheck on this link and let us know what you think.

Conclusion

You do not need to be an expert in data science to hire one. However, you should have a good idea of what is and isn't possible so that you don't set unrealistic expectations. Data science isn't magic, and it's not even a traditional science. It's as much an art as it is a science, which means the variability in skills and ability is substantial. Don't expect magic from your data strategy on day 1.

If you hire a data scientist without first understanding where you are in terms of AI readiness, you will soon find that your data scientist cannot work efficiently with your data. They will complain that the infrastructure is not ready, that they do not have all the information, that they cannot test their hypotheses, and that they cannot pull and visualize data.


The second most important thing is not to do data science for the sake of data science. Read what your competition is doing, listen to the market and try to pick up on things you think will work for your business. Create a clear data strategy and define goals so you can evaluate investments in data science. Then decide whether you want to build your data team slowly or kick-start an internal data team with the help of a company which has already implemented a couple of projects.

Nenad Bozic

Co-founder & CEO

Software engineer with more than 10 years of experience, currently focused on data-intensive systems. Certified Cassandra developer and Datastax MVP for Apache Cassandra for 2016/2017. Strong believer in balance between good technical skills and soft skills. Striving for knowledge is his main drive, which is why he enjoys learning new tools and languages, blogging, working on open source, and presenting.