Last night I spoke at the Melbourne R User Group (MelbuRn) about data science from a strategic business perspective. It was great to see so many people attending.
My presentation outlined the strategy that I developed and am implementing for my employer Coliban Water. This strategy is based on a common-sense approach that leverages our existing strengths. This strategy was also outlined in an article for the Source journal.
Water utilities are, pardon the pun, awash with data. For decades we have been using a rudimentary version of the Internet of Things called SCADA (Supervisory Control and Data Aquisition). This system controls our assets and provides operators and analysts with the needed data. All this data is used to control our systems and stored for future reference.
There is no such thing as ‘dark data’. All data has been used for its designated purpose when it was created. My job at Coliban Water is to create value from this information.
In this presentation, I explained how Coliban Water is creating more value from data by following a systematic strategy,
Data Science Strategy Presentation
The Data Science Venn Diagram (Conway, 2010).
In my job as manager data science for a medium-sized water utility in Australia, I have developed a strategy to increased the amount of value we extract from data.
Many businesses that seek the promised benefits of Big Data don’t achieve those because they don’t start with the basics.
The most important data science strategy advice is to spend a lot of time getting to know and to improve data quality.
Good data science needs to comply with these four basic principles:
- Utility: The analysis needs to be able to improve reality, otherwise we end with ‘analysis-paralysis‘. Although we speak of data science, it is really data engineering because we are not seeking the truth, we seek improvement of reality.
- Soundness: The analysis needs to be scientifically valid so that managers can make reliable decisions.
- Aesthetics: Visualisations need to be pleasing to the eye, not as a beautification but to ensure users draw correct conclusions.
- Reproducibility: Analysts need to be able to repeat the work of other people to ensure quality control. This is where the science comes into data analytics.
I have recently published a paper about data science strategy for water utilities to share some of my thoughts on this topic.
Data Science Strategy for Water Utilities
Abstract: Big Data promises future benefits by using smart algorithms to improve the customer experience. Many organisations struggle leveraging the benefits of the data revolution. This paper summarises how water utilities can use the emerging field of data science to create value from information. The paper explains the principles of data science and illustrates these using examples from water utilities. This paper closes with recommendations on how to implement data science projects to maximise value from data. These benefits are realised using existing investments in information technology infrastructure and existing competencies.
You can read an extract of the paper on the Australian Water Association website. The full version is behind their paywall.
Furthermore, I am interested in creating an alliance with other water utility professionals that write code in R. Feel free to comment below to discuss any thoughts you might have on this issue.