Five ways to approach data science governance

Pressured by global competition and disruptive technologies, a growing number of corporates are turning to innovation partnerships to stay ahead. But while these relationships usually start on a high note with some early wins, frustration often sets in. According to Boston Consulting Group research in Europe, around half of both corporates and startups end up dissatisfied with the results once the initial honeymoon period ends.

No enterprise leader wants to gamble on results, and the success of data collaboration hinges on one thing: accountability. Data leaders need to deliver measurable impact on a growing share of their enterprise’s KPIs, while ensuring appropriate governance and privacy protections are in place to safely scale data prototypes.

Here are five ways leading organizations are taking a holistic approach to data science governance across workflows and technology to build competitive advantage.


Choose the right data governance method

In today’s data-driven world, appropriate governance for data science projects is critical. A centralized approach, where all data is consolidated in a single environment, might be simpler to manage but hits a wall when you’re trying to gain insights from disparate sources. Alternatively, a federated approach enables you to derive insights from data stored across different locations, without having to physically move it.

This method of taking the algorithm to the data – rather than vice versa – has the added benefit of reducing the risk of data leakage while eliminating data transport costs. A third option is a hybrid approach such as a center of excellence, which allows you to define and prioritize data science use cases based on business case, risk profile or other factors.
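To make the federated pattern concrete, here is a minimal sketch of taking the computation to the data: each site runs the same summary locally, and only aggregates – never raw records – leave the site. The site names and figures are purely illustrative.

```python
# Illustrative data held at three separate sites; in a real federated
# setup these records would never leave their home environments.
sites = {
    "site_a": [120.0, 95.5, 133.2],
    "site_b": [88.0, 101.3],
    "site_c": [140.1, 76.4, 99.9, 110.0],
}

def local_summary(records):
    # Runs inside each site's environment; only this summary is shared.
    return {"sum": sum(records), "count": len(records)}

# The coordinator combines the summaries without ever seeing raw data.
summaries = [local_summary(data) for data in sites.values()]
total = sum(s["sum"] for s in summaries)
count = sum(s["count"] for s in summaries)
global_mean = total / count
print(round(global_mean, 2))
```

The same pattern extends beyond simple means – model updates or other aggregates can be exchanged in place of raw rows, which is what keeps transport costs and leakage risk down.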

Swapping and pooling insights, instead of raw data, across the data-sharing ecosystem allows data scientists to gain more value from their data, while ensuring privacy and protecting IP. Using Data Republic’s data governance and discovery capabilities, it’s now possible for any organization to run both centralized and federated data analytics at scale with full governance.


Focus on solving real problems

Many organizations start with the data and hope it will yield something ‘interesting’ on its own. More typically, however, the most effective enterprise data science starts by building a deep understanding of an existing business process or customer problem, and pinpointing parts of the decision tree that can be augmented or automated.

Aligning on a common vision is key to success when collaborating on such projects. All parties need to agree on the allocation of resources, expertise, and data sharing rules. But it can also raise important trust and privacy concerns, such as when organizations need to match customer datasets to enhance joint customer experience. Privacy-preserving technology such as Data Republic’s Senate Matching can enable data collaboration at scale without compromising regulatory compliance or consumer privacy.
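One common privacy-preserving pattern for matching customer datasets – shown here as a simplified sketch, not the actual Senate Matching algorithm – is for both parties to compare salted hashes of a shared identifier rather than the identifiers themselves. The salt value and email addresses below are illustrative.

```python
import hashlib

# Salt agreed between the parties out-of-band (illustrative value).
SHARED_SALT = b"agreed-out-of-band"

def tokenize(email: str) -> str:
    # Normalize before hashing so trivial formatting differences
    # (case, whitespace) don't prevent a match.
    normalized = email.strip().lower().encode("utf-8")
    return hashlib.sha256(SHARED_SALT + normalized).hexdigest()

party_a = {"alice@example.com", "bob@example.com"}
party_b = {"Bob@Example.com ", "carol@example.com"}

# Each party shares only tokens, never the underlying identifiers.
tokens_a = {tokenize(e) for e in party_a}
tokens_b = {tokenize(e) for e in party_b}

overlap = tokens_a & tokens_b
print(len(overlap))  # one customer appears in both datasets
```

Production-grade matching systems add further protections (for example, keeping the salt out of both parties’ hands), but the principle is the same: match on derived tokens, not raw personal data.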


Prepare to govern data at a scale that matters

Many organizations optimize for data scale, but become overwhelmed by the concurrent growth of the data science team and its business stakeholders. Team throughput grinds to a crawl as information loss compounds across the interactions in a single project – much less a portfolio of 50 to 100. And then there are the complex legal, data security and privacy barriers to data sharing that can make DIY agreements difficult, if not impossible, to define, let alone implement.

A tool like Data Republic’s Senate provides an end-to-end software solution for data leaders to govern data sharing projects across teams, subsidiaries, and external partners. Its unique licensing and privacy workflows make it easy for organizations to define, govern, and enact data sharing terms – reducing time-to-value from months to weeks.


Measure everything, continuously

Data scientists live in the world of measurement, but aren’t always able to keep an eye on the dials. It’s not uncommon for a data scientist to train a model, deploy it to production and move on to the next project, leaving the critical work of monitoring the model’s ongoing performance to someone else. Tracking patterns in your workflows allows you to create modular templates for future data projects – including ongoing monitoring and measurement – and guides investment in the internal tooling and people that can alleviate future bottlenecks.
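A monitoring template can start very simply. The sketch below – with illustrative numbers and an arbitrary threshold, not any specific product’s workflow – compares a production feature’s distribution against its training baseline and flags a drift alert when the mean shifts too far.

```python
import statistics

# Illustrative feature values: the training-time baseline and a
# recent window of production inputs.
baseline = [10.2, 9.8, 10.1, 10.0, 9.9, 10.3]
production = [12.5, 12.1, 11.9, 12.4, 12.2]

def drift_alert(baseline, live, z_threshold=3.0):
    # Alert when the live mean sits more than z_threshold baseline
    # standard deviations away from the baseline mean.
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    z = abs(statistics.mean(live) - mu) / sigma
    return z > z_threshold

print(drift_alert(baseline, production))  # True: the input has drifted
```

Wiring a check like this into a scheduled job, alongside accuracy and engagement metrics, is what turns “measure everything” from a slogan into a repeatable template.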


Risk and change management aren’t just buzzwords

Enterprise data science doesn’t usually fail because of the algorithms, but because of the humans who use them. This makes it critical that the organization provides data governance training that addresses topics such as:

  • What is the business’s purpose for collecting data?
  • What are the goals for the use of data?
  • How does the use of data align with the organization’s overarching goals and mission?
  • What policies will be in place for how data is collected and used?

Finally, be sure to provide employees and customers with feedback channels, and continually measure usage and engagement to ensure success.

Want to learn more about effective data science governance across the enterprise? Download our whitepaper ‘The Data Governance Effect: How to Structure Your Business for Data Liquidity’.