This blog post is part of the Big Data Week Speaker interviews series. In this article, Gerard Toonstra shares his thoughts on the adoption of big data technologies in retail and the challenges encountered when trying to leverage data.
- Some verticals have been quicker than others to adopt BI and analytics solutions. Why do you think that is and what industries or fields might bring the general public the most benefits through the accelerated adoption of big data technologies?
BI is all about the optimisation of business processes. These business processes are measured by recording and combining events that are generated within the business processes themselves, for example, the sale and scan of a product at a cash register, or scanning a pallet of goods arriving at a warehouse. Less competitive verticals like energy, where you typically have less bonding and interaction with customers, lag in the adoption of newer technologies. Other verticals like retail, where customer interaction is extremely important, typically adopt them more quickly. So in markets where customer loyalty is more volatile, you’ll see more evolved BI solutions being deployed.
Big data technologies would benefit the general public across all industries when you look at opportunities where these contribute to the common good. For example, IoT sensors that track energy and water waste, irrigation optimisation in countries with low water availability and generally solutions aimed at making our day-to-day lives more sustainable.
- How have businesses adapted so far to the impact of big data?
Businesses should at least have heard of big data by now and they have probably explored common technologies to extract some value out of this data. Deep learning is starting to influence certain processes that were once considered relatively opaque in nature, like analysing audio from phone calls or analysing and optimising website photos. There is still a significant struggle, however, with the basics of moving and surfacing data and with simplifying and generalising day-to-day ETL work.
- Why is it important for organisations to become data-driven?
Being data-driven means that you rely on facts to tell you what’s going on in and around the business, rather than relying on gut feelings. This helps to optimise processes and allows us to learn much more from data, if we are humble enough to allow our hypotheses to be wrong. Being data-driven is as much a paradigm that seeks to improve decision making as it is about discovering new knowledge specific to your business and marketplace.
- What are the challenges encountered when trying to leverage data?
Getting the right people together to connect the dots. Data itself is there, but not everyone knows how to make the most out of it. This requires a mix of domain knowledge, business analytics, knowing what other companies do, and data science, in order to assist in evaluating the relevance and applicability of the discoveries. Around this, the challenge is to find the engineering talent and software to be able to quickly draw data for such analysis and put new ideas into production in a reliable way, compatible with the architecture and guidelines.
- How do you see the industry evolving over the next few years?
The retail industry is most certainly going through a time of change and rediscovery right now, at least in the Netherlands. Some companies here that have been around for years are in rough weather or have had to close their doors because they were unable to digitise their retail chain. I find it too difficult to predict how this will develop over the next few years. I mostly wonder what the shopping streets will look like very soon, without the clothing and shoe shops lining them. What’s going to take their place? Are we going to have more social places like coffee houses, activity centres or casinos instead?
- How do you think the new automation wave and self-teaching AIs will impact the world?
Kaggle hosts competitions where deep learning networks are able to recognise and predict diseases with more accuracy than humans, which implies that for specific, highly skilled jobs, AIs can be more effective than us. I’m not necessarily concerned about AIs taking over or becoming malicious, since it would imply that they have feelings of resentment or arrogance. From my point of view there is no danger in AI, unless we take into account the possibility that an AI may develop self-awareness and self-consciousness, and then feelings of disgust, temptation, etc.
- Tell us a bit more about your topic at the BDW2017 London Conference. Why did you choose this particular subject?
I chose the “Data Orchestration with Apache Airflow” topic because it continues to be a struggle to do regular ETL work. Airflow has some great abstractions that allow you to generalise across different approaches, so that over time things become a lot easier. It is also horizontally scalable, so it scales along with whatever your company is doing at the moment.
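As a rough illustration of what orchestrating ETL work means, the sketch below runs a small extract, transform and load pipeline in dependency order. It uses only Python's standard library and is not Airflow's API: the `run_pipeline` helper, the task names and the sample sale events are all hypothetical, chosen only to show the general idea of declaring tasks and their dependencies separately from the code that runs them.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, dependencies):
    """Run every task once, after all of its upstream tasks have finished.

    tasks: maps a task name to a function taking a dict of upstream results.
    dependencies: maps a task name to the set of task names it depends on.
    """
    order = []
    results = {}
    # static_order() yields each task only after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](upstream)
        order.append(name)
    return order, results


# Hypothetical tasks for illustration only.
def extract(upstream):
    # Pretend these are (product, quantity) events scanned at a cash register.
    return [("apple", 2), ("pear", 1), ("apple", 3)]


def transform(upstream):
    # Sum the quantities per product.
    totals = {}
    for product, quantity in upstream["extract"]:
        totals[product] = totals.get(product, 0) + quantity
    return totals


def load(upstream):
    # Stand-in for writing to a warehouse: return the rows in a stable order.
    return sorted(upstream["transform"].items())


TASKS = {"extract": extract, "transform": transform, "load": load}
DEPENDENCIES = {"transform": {"extract"}, "load": {"transform"}}

order, results = run_pipeline(TASKS, DEPENDENCIES)
print(order)
print(results["load"])
```

A real orchestrator adds scheduling, retries, logging and distributed execution on top of this ordering step, which is where a dedicated tool earns its place.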
- Who do you think should attend your talk at Big Data Week? Why?
I think solution architects, product owners and data engineers in particular can take some things away from my presentation. Most of it is technical, but I do explain the overarching reasons why Apache Airflow is, for example, a superior solution to many other ETL tools, and how it fits very well in a modern data-driven organisation.
Gerard is a designer, architect and developer with 20 years of IT experience, ranging from telecom billing systems to drone control systems to business intelligence. In his spare time, he maintains a best practices site about Apache Airflow, thinks and writes about code complexity and competes in Kaggle competitions. Gerard now works as a senior data platform engineer at Coolblue B.V. in the Netherlands.
Don’t miss Gerard’s talk at the upcoming Big Data Week London Conference, on October 13. Get your Early Bird ticket today!