4 reasons why most data science projects fail
The following is a guest article from James Roberts, chief data scientist at Quisitive.
Experts have called 2017 the year of data literacy and digital transformation. While data is a key component that drives true digital transformation, too often companies approach data and analytics projects the wrong way. In fact, a mere 13% of data and analytics projects reach completion, and for those that do, only 8% of company leaders report being completely satisfied with the outcome.
Why are data science project results so dismal?
Most failures can be traced back to four major pitfalls: starting with the wrong questions; using faulty data; weak stakeholder buy-in; and lack of diverse expertise.
Recognizing these common hazards up front puts CIOs and IT directors in a much better position to lead data science projects that drive valuable insights and contribute to the organization’s overall digital transformation.
1. Starting with the wrong questions.
All too often data science projects begin by analyzing data with the expectation that an interesting insight will reveal itself and become the basis for the business case that justifies the transformation.
This "exploratory analysis" approach often generates dozens of potential data projects that could yield compelling results. But do any of them produce a strong business case that can, for example, reduce costs, encourage repeat customers or retain employees? The scope of an "exploratory" project is simply too broad to drive useful analysis, and the effort ends up wasting IT resources.
The better approach is to initiate the project with an established goal that maps directly to creating business value. Projects that follow the "hypothesis-testing" approach begin with a specific set of clearly defined questions that indicate which data should be analyzed.
This targeted approach streamlines the data mining and analysis process by pairing business justification with business action, thereby directing IT resources to the information most likely to produce credible and meaningful findings. Starting with the right question sets the stage for a successful data science project through increased accuracy and efficiency, resulting in purposeful insight.
2. Using faulty data.
Using accurate data is fundamental to a project’s success, but bad data is the most underestimated cause of failure. Oftentimes companies simply do not spend enough time cleansing data. Because clean data is so critical, a good guideline is to allocate 80% of the projected timeline to data clean-up.
While this may seem excessive, doing a thorough job becomes the project’s most significant time saver, since working with clean data expedites all subsequent steps. Consider that even a simple error can produce a faulty insight with the potential to sink an entire project and cause leadership to withdraw support for future digital transformation initiatives.
Modern cloud and data ingestion tools facilitate the consolidation of unstructured data, which can then be pulled out, mined and correlated in different ways, making data easier to manage while reducing time, infrastructure and errors down the line.
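As a rough illustration of what a first data clean-up pass can look like, here is a minimal Python sketch that counts a few common problems (blank required values, duplicate records, and numbers that are unparseable or out of range) before any analysis begins. The function name audit_rows, the field names customer_id and order_total, and the sample records are all hypothetical and chosen only for illustration; a real project would tailor the checks to its own data.

```python
from collections import Counter


def audit_rows(rows, required_fields, numeric_ranges):
    """Count common data-quality problems in a list of dict records.

    All names here are illustrative assumptions, not part of any
    particular tool or library.
    """
    issues = Counter()
    seen = set()
    for row in rows:
        # Required fields that are absent or blank.
        for field in required_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                issues[f"missing:{field}"] += 1

        # Records that exactly repeat an earlier record.
        key = tuple(sorted((k, str(v)) for k, v in row.items()))
        if key in seen:
            issues["duplicate_row"] += 1
        seen.add(key)

        # Numeric fields that do not parse or fall outside the allowed range.
        for field, (low, high) in numeric_ranges.items():
            value = row.get(field)
            if value is None or str(value).strip() == "":
                continue  # blanks are handled by the required-field check
            try:
                number = float(value)
            except (TypeError, ValueError):
                issues[f"not_numeric:{field}"] += 1
                continue
            if not (low <= number <= high):
                issues[f"out_of_range:{field}"] += 1
    return dict(issues)


# Hypothetical sample records, made up for this example.
sample_rows = [
    {"customer_id": "C001", "order_total": "25.00"},
    {"customer_id": "C001", "order_total": "25.00"},  # exact duplicate
    {"customer_id": "", "order_total": "40.50"},      # blank customer id
    {"customer_id": "C003", "order_total": "-5"},     # negative total
    {"customer_id": "C004", "order_total": "n/a"},    # not a number
]

report = audit_rows(
    sample_rows,
    required_fields=["customer_id", "order_total"],
    numeric_ranges={"order_total": (0, 100000)},
)
print(report)
```

A report like this does not fix anything by itself, but it gives the team a concrete list of problems to resolve, and a way to confirm they are resolved, before insights are drawn from the data.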
3. Weak stakeholder buy-in.
Data science projects may impact business leaders across the company. Without stakeholder support and a commitment to implement changes, projects can stall or fail.
The best way to ensure business alignment across the organization is to produce a solid data strategy and roadmap to keep everyone on track.
Stakeholders must believe in the project’s purpose and commit to following through on implementation in their department when the time comes. When stakeholders see the value in an initiative supported by a solid business case, the likelihood of project failure caused by a stakeholder barrier greatly diminishes.
4. Lack of diverse expertise.
It’s a common misconception that any project involving data should be the sole responsibility of the IT department. This short-sighted view couldn’t be more dangerous to a project’s success. Using the right data analysis tools is important, but on their own the tools’ output falls short of offering meaningful and useful insight. Involving the right human talent in the project is essential, irrespective of the department they belong to.
The most successful data science projects employ team members from across departments with diverse skillsets, including quantitative research, statistics and subject matter expertise, depending on the initiative’s question and focus.
Together, they can bring different perspectives, proficiencies and experience to mold the project’s objectives and direction as it progresses.
Data projects also benefit from having someone on the team who understands the internal workings of the business to ensure the project remains aligned with the original business goals. More eyes on the project increase the chances that mistakes will be caught, while leveraging the collective knowledge and talents of the entire team.