How to build (and execute) a real data strategy


When I think of strategy, I imagine playing a game of chess, coming up with patterns that put me in a better position to win: “If I can get my knight and queen connected in the center of the board, I can force my opponent’s king into a trap in the corner.”

But when I see recommendations for data strategy, they tend to read more like a set of rules than patterns for winning a game. I often read about data governance policies, security constraints or documentation procedures. These are necessary considerations, no doubt, but I think they should be handled by systems owners, and I don’t think of them as strategic. Knowing that “the D2 pawn can move to D3 unless there is an obstruction at D3 or the move exposes the king to direct attack” is necessary to play the game, but it doesn’t help me pick a winning move.

A colleague, Simon Wardley, taught me that to create a successful business strategy, you need a map. A map helps you figure out where to focus, what you need, how to proceed and why. In Wardley (Value Chain) mapping, you work out the customer’s needs, then find ways to generate competitive advantage by disrupting the value chain or driving different parts of the chain to commodity.

I wondered if this same basic method could be used to build winning data strategies.

Instead of creating a value chain of the customer needs, I created a value chain of the most important business insights needed. The chain was mapped based on the business value of the question and how long it currently takes to get a high-quality answer to the question. With this map, I focused on finding ways to shorten the time it takes to get the kinds of insights the company needs to be more competitive.

 


The first step in the development of a data strategy. Each circle is a business question. The questions are arranged in a map based on the business value of the question and how long it currently takes the organization to arrive at high-quality answers.
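As a rough sketch of what such a map could look like as data, each question can carry a value score and a current time-to-answer. The `Question` class, the 1-to-10 value scale and every number below are assumptions made up for illustration; they are not taken from the actual map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    business_value: int    # hypothetical 1-10 score, higher is more valuable
    days_to_answer: float  # current time to reach a high-quality answer


# Illustrative entries only; the scores and durations are invented.
questions = [
    Question("What can we deliver today?", 4, 60),
    Question("What is our competitive position in a given market?", 9, 90),
    Question("Which customer needs are changing?", 7, 30),
]


def build_map(items):
    """Order questions top-down as on the map: most valuable first,
    and among equally valuable questions, slowest first."""
    return sorted(items, key=lambda q: (-q.business_value, -q.days_to_answer))


for q in build_map(questions):
    print(f"value={q.business_value:>2}  days={q.days_to_answer:>4.0f}  {q.text}")
```

Sorting is only a stand-in for the two axes of the picture; a plotting library could place the same records on a value versus time-to-answer chart.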

I can break the business questions into groups that correspond to real data systems. This makes it easy to identify what is needed to execute on the data strategy. For example, I can identify legacy enterprise data systems, market and competitive intelligence data and customer profile data as the primary systems of interest for my current data strategy.

We can only move as fast as our slowest parts, so the map also tells us where we should focus our strategic efforts: Target the lower-value questions that are slowing the time-to-insight for the higher-value questions.

The “why” is always to improve time-to-insight for business questions higher in the value chain. For example, we work to shorten the time it takes to understand our delivery capabilities so that we can get faster insights about our competitive position in a given market.
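One way to make "slowest part" concrete is to record which lower-level questions each question depends on and then look for the dependency that adds the most waiting time. The sketch below assumes dependencies can be worked on in parallel, so a question waits only for its slowest dependency. The question names, scores and day counts are hypothetical.

```python
# Hypothetical map: each question has a value score, the days it takes to
# answer once its inputs are ready, and the questions it depends on.
questions = {
    "competitive_position": {
        "value": 9, "days": 10,
        "needs": ["delivery_capability", "customer_needs"],
    },
    "delivery_capability": {"value": 4, "days": 60, "needs": []},
    "customer_needs": {"value": 6, "days": 20, "needs": []},
}


def time_to_insight(name, qmap):
    """Days until `name` is answered: its own work plus its slowest dependency."""
    q = qmap[name]
    slowest = max((time_to_insight(d, qmap) for d in q["needs"]), default=0)
    return q["days"] + slowest


def bottleneck(name, qmap):
    """Return the direct dependency that contributes the longest wait, or None."""
    needs = qmap[name]["needs"]
    if not needs:
        return None
    return max(needs, key=lambda d: time_to_insight(d, qmap))


print(time_to_insight("competitive_position", questions))  # 70
print(bottleneck("competitive_position", questions))       # delivery_capability
```

In this made-up example the lower-value delivery question is what holds back the higher-value competitive question, so that is where the effort would go first.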

The remaining question is how to reshape the map from its current landscape into the one we want. We need patterns of action that have proven effective at enabling the kinds of changes we need. For business strategy, Wardley calls these actions patterns of gameplay. For data strategy, CSC’s Big Data and Analytics group has collected patterns of successful data gameplay into a concept that we call Industrial Machine Learning (IML).

 


A refinement of the data strategy. We have identified groups of related questions, as well as areas of the landscape we would like to somehow reshape. The strategy is missing the specific actions we need to take.

Industrial Machine Learning is a framework for ingesting data, building algorithms, deploying them into production and generating continuous insights into ongoing business problems. It’s a modern take on a very old idea: the scientific method.

Start with a hypothesis and collect data that you think can give you answers. Then generate a model and use it to explain the data. Evaluate the credibility of the model based on how well it explains the data observed so far and how well it explains new data that you collect in the future. The modern twist is that this process is done on an enterprise scale using digital infrastructure.

The evidence is a continuous pipeline of data that we collect. The models are business algorithms running in production, and the experiments are done in very short sprints that force us to focus on discovering insights in small, meaningful chunks.
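The loop of fitting a model, checking it against data it has not seen and then refitting can be sketched in a few lines. This is a toy under stated assumptions, not the IML framework itself: the "model" is just a running average, each inner list stands in for one sprint's worth of data, and all of the numbers are invented.

```python
from statistics import mean


def fit_mean_model(observations):
    """Simplest possible model: always predict the average seen so far."""
    avg = mean(observations)
    return lambda: avg


def mean_abs_error(model, observations):
    """Average absolute gap between the model's prediction and new data."""
    return mean(abs(x - model()) for x in observations)


# Made-up values of one metric arriving in sprint-sized batches.
batches = [[10, 12, 11], [11, 13, 12], [20, 22, 21]]

seen = list(batches[0])
model = fit_mean_model(seen)
errors = []
for batch in batches[1:]:
    # Judge the current model on data it has not seen yet...
    errors.append(mean_abs_error(model, batch))
    # ...then fold the new evidence in and refit.
    seen.extend(batch)
    model = fit_mean_model(seen)

print(errors)
```

A jump in error from one batch to the next, as with the last batch here, is the signal that the model no longer explains the data and the hypothesis needs revisiting.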

 


CSC’s Industrial Machine Learning is a framework for executing on data strategies and generating continuous streams of data insights.

With patterns from IML, we can start work on my favorite part of data strategy: planning ways to reshape the landscape.

Building legacy data pipelines and mapping the data to common concepts, for example, will increase visibility into enterprise resources and decrease the time it takes to determine if we have the ability to effectively sell in a given market. Deploying customer intelligence algorithms will provide a more nuanced understanding of changing customer needs and decrease the time it takes to recognize emerging markets.

 


Completed data strategy, including actions (based on Industrial Machine Learning) that will reshape the data landscape.

This approach to data strategy can have a big impact when applied in specific industries.

Natural Resources

In mining, for example, smart equipment management can save a lot of money and improve operations. Thousands of factors affect the performance of complex machines, but by using a data strategy based on IML, we can monitor operations and predict machine problems before they happen.

In our study, we found that the main cause of unscheduled maintenance was time spent waiting on other processes, crew meetings and training. The mining crew tends to spend its downtime maintaining the equipment, so it helps to create a new maintenance category that tracks and gives credit for these opportunistic maintenance events. We also found that equipment damage tends to follow long periods spent waiting on mine-blasting delays: the blast creates dust and, over time, the dust accumulates in the machinery. We can prevent some failures just by alerting operations when a machine logs too many hours waiting for blast delays.
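The alerting rule described above could be as simple as a running total per machine. The sketch below assumes a log of (machine, delay reason, hours) records and a 40-hour threshold; the record layout, the reason labels, the machine names and the threshold are all hypothetical and would need to come from the mine's own data.

```python
from collections import defaultdict

BLAST_DELAY_ALERT_HOURS = 40.0  # hypothetical threshold, not from the study


def machines_to_alert(log, threshold=BLAST_DELAY_ALERT_HOURS):
    """Return machines whose accumulated blast-delay hours exceed the threshold.

    `log` is an iterable of (machine_id, delay_reason, hours) records.
    """
    totals = defaultdict(float)
    for machine_id, reason, hours in log:
        if reason == "blast":
            totals[machine_id] += hours
    return sorted(m for m, h in totals.items() if h > threshold)


# Invented log entries for illustration.
log = [
    ("shovel-1", "blast", 25.0),
    ("shovel-1", "training", 8.0),
    ("shovel-1", "blast", 20.0),
    ("truck-7", "blast", 12.0),
]

print(machines_to_alert(log))  # ['shovel-1']
```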


We used Industrial Machine Learning to generate insights about coal mining operations. We found, among other things, drivers of equipment damage incidents.

Healthcare

In healthcare, insights about hospital procedures can help improve patient care and hospital outcomes.

We used a data strategy based on IML to supplement hospital administrative data with rich information from the healthcare provider, including electronic patient records and other routinely collected data. We looked for the features that were most important for predicting lengths of stay for patients undergoing hip or knee replacements. We found key leading indicators (such as the patient’s age, the patient’s core healthcare providers and the secondary diagnosis) for predicting lengths of stay. We built a regression model using the leading indicators, which allowed us to predict a patient’s length of stay.

Those predictions became the basis of operational dashboards that alerted hospitals to future costs and helped identify patients who may experience problems in recovery.
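To give a feel for the kind of regression involved, here is a minimal least-squares fit of length of stay against a single indicator, age. This is a simplification for illustration: the model described above used several leading indicators, its exact form is not given here, and the ages and stays below are invented, not patient data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented example data: patient age and length of stay in days.
ages = [55, 62, 70, 78, 85]
stays = [3.0, 3.5, 4.5, 5.0, 6.0]

slope, intercept = fit_line(ages, stays)


def predict_stay(age):
    """Predicted length of stay in days for a patient of the given age."""
    return intercept + slope * age


print(f"Predicted stay at age 74: {predict_stay(74):.1f} days")
```

Categorical indicators such as provider or secondary diagnosis would need to be encoded and fitted with a multi-variable regression, which a statistics library handles better than hand-written code.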


We used Industrial Machine Learning to predict future hospital costs and identify patients likely to experience problems during recovery.

Building a winning data strategy is much more than listing out the rules for security or data governance. It takes a map of the required business insights and frameworks like Industrial Machine Learning that prescribe ways to produce those insights at enterprise scale.

Building and executing data strategies is transformative for just about any industry.

 


Examples of the ways data strategy and proper execution can transform an enterprise.

Data has become a corporate priority, with more than half of Fortune 1000 firms reporting big data initiatives in production across the enterprise.

The true value of big data initiatives is the ability to generate continuous insights at enterprise scale. This takes a clear data strategy and industrial-strength execution.

We’re at the beginning of a new phase of big data, one that has less to do with massive data capture and storage and much more to do with producing impactful, scalable insights. Organizations that adapt and learn to put data to good use will consistently outperform their peers.



Jerry Overton is head of advanced analytics research in CSC’s ResearchNetwork and founder of CSC’s FutureTense competency, which includes the Predictive Modeling Research Group, Advanced Analytics Lab and Predictive Modeling School. Connect with him on Twitter.

 


Comments

  1. geosupergirl says:

    Having written many a data strategy (for GIS) over the years, this brings back memories, but it also provides an extremely refreshing approach to doing data strategy differently and using machine learning to make the most of your data and your strategy, which should (I believe) be a continually updated document that sits alongside your data to help keep it in check.
