According to Data Science Central, some 80% of data warehouse projects fail to achieve their aims. They go on to prescribe a “start small and involve business users” approach to ensure success. While I agree in principle with this approach, there is a fundamental flaw in the way most data warehouse projects are tackled. Let me start this conversation from the beginning.
My background is in finance and operations. I have tackled many IT projects in my career from improving supply chains to creating more accurate cash flow projections. I now spend most of my time advising companies on how to update the systems that they started with a decade or more ago.
The most common problem statement is a variation of: My operations are inefficient because my CRM, order management, accounting, inventory, and PO systems don’t talk to each other.
The two options these companies are usually weighing are a data warehouse project and replacing their legacy systems with a new Enterprise Resource Planning (ERP) system. Both options are daunting in terms of cost, time, and project risk. This discussion will focus just on data warehouses. I am writing another post discussing ERP strategies.
Single source of truth
The problem that a data warehouse is attempting to solve is to provide a single source of truth for all of the data needed to make a particular business decision.
For example, a customer in a remote location needs to repair one of my machines.
I need to check my spare parts inventory, order materials if needed, send a quote, get a deposit, assemble the parts, ship them, schedule a technician to be on-site, and invoice for the completed job.
Since none of my systems talk to each other, this logistics process is handled manually and is very error-prone. In theory, a data warehouse can fix this problem by collecting all of the data in one place and allowing me to build a process workflow to manage the steps.
Layering complexity on top of complexity
One of the main challenges to achieving a single source of truth is the reality of getting accurate data from distributed systems. Each of the systems that I need to talk to has its own relational database. Some of the systems are on-premises, and some are in the cloud.
If anything goes wrong with a query, which happens all the time, my data warehouse will break. In the real world, that means, for example, that a technician arrives on site before the spare parts do. The usual fix for this distributed systems problem is to add more checks and balances to the data warehouse, which creates more complexity.
In the end, the system is likely to be more expensive to build and maintain, and to take longer to build, than expected. Even worse, there might be only one person on the planet who actually understands how the system works. This last factor will limit how far your data warehouse can be extended to tackle other business processes.
An Enterprise Business Framework solves this problem
An Enterprise Business Framework comprises three components: Event Adaptors, a Business Event Database, and Business Logic Managers. It does not attempt to recreate and synchronize all of the data from all of the systems.
In our remote repair example above, the solution would look something like the diagram below:
An Event Adaptor listens to each system for relevant events, for example, parts inventory changes.
It broadcasts each event to the Business Event Database, which stores a ledger of all of the events created by the legacy systems. The adaptor can also talk back to its system, for example, to tell the inventory system to hold a part for order #xyz.
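To make the two-way role of an Event Adaptor more concrete, here is a minimal sketch. It assumes an in-memory, append-only list stands in for the Business Event Database. The class names, the event kind "inventory_changed", the "hold_part" command, and the send_command hook are all hypothetical names chosen for illustration, not part of any particular product.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class BusinessEvent:
    source: str              # legacy system that produced the event
    kind: str                # hypothetical event name, e.g. "inventory_changed"
    payload: Dict[str, Any]  # details of what changed


@dataclass
class BusinessEventDatabase:
    """Append-only ledger: events are added, never updated in place."""
    ledger: List[BusinessEvent] = field(default_factory=list)

    def append(self, event: BusinessEvent) -> int:
        self.ledger.append(event)
        return len(self.ledger) - 1  # position of the event in the ledger


class EventAdaptor:
    """Bridges one legacy system and the ledger, in both directions."""

    def __init__(self, source: str, db: BusinessEventDatabase,
                 send_command: Callable[[str, Dict[str, Any]], None]) -> None:
        self.source = source
        self.db = db
        self.send_command = send_command  # assumed hook into the legacy system

    def on_system_event(self, kind: str, payload: Dict[str, Any]) -> int:
        # Record what the legacy system reported.
        return self.db.append(BusinessEvent(self.source, kind, payload))

    def hold_part(self, part_id: str, order_id: str) -> None:
        # Talk back to the legacy system.
        self.send_command("hold_part", {"part_id": part_id, "order_id": order_id})


# Usage: a list stands in for the real inventory system's command interface.
sent: List[Tuple[str, Dict[str, Any]]] = []
db = BusinessEventDatabase()
inventory = EventAdaptor("inventory", db, lambda name, args: sent.append((name, args)))
inventory.on_system_event("inventory_changed", {"part_id": "P-100", "delta": -1})
inventory.hold_part("P-100", "xyz")
```

The point of the sketch is that the adaptor copies only the events that matter, rather than mirroring the whole inventory database.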
The Business Logic Manager is used to configure the process and deliver the report
Now that we have all of the required data in one place, we can implement the business processes.
In our example, it would be along the lines of:
- Listen for a repair order
- Prepare a parts list
- Check inventory
- Send an estimate
- Send a deposit invoice
- Ship parts
- Check courier for parts delivery
- Send technician
Multiple process managers can run across the same Business Event Database.
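The steps above can be sketched as a small piece of configuration plus one function that reads the ledger. This is only an illustration: it assumes each step is considered done when a matching event appears in the ledger, and the event kinds (such as "parts_delivered") and the dictionary layout of a ledger entry are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (step from the list above, hypothetical event kind that marks it complete)
REPAIR_PROCESS: List[Tuple[str, str]] = [
    ("Prepare a parts list", "parts_list_prepared"),
    ("Check inventory", "inventory_checked"),
    ("Send an estimate", "estimate_sent"),
    ("Send a deposit invoice", "deposit_invoice_sent"),
    ("Ship parts", "parts_shipped"),
    ("Check courier for parts delivery", "parts_delivered"),
    ("Send technician", "technician_dispatched"),
]


def next_step(ledger: List[Dict[str, str]], order_id: str) -> Optional[str]:
    """Return the first step not yet completed for this order.

    Returns None when no repair order has been heard or every step is done.
    """
    seen = {e["kind"] for e in ledger if e["order_id"] == order_id}
    if "repair_order_received" not in seen:
        return None  # "Listen for a repair order" has not fired yet
    for step, done_kind in REPAIR_PROCESS:
        if done_kind not in seen:
            return step
    return None


# Usage: parts have shipped, but the courier has not confirmed delivery.
ledger = [
    {"order_id": "xyz", "kind": "repair_order_received"},
    {"order_id": "xyz", "kind": "parts_list_prepared"},
    {"order_id": "xyz", "kind": "inventory_checked"},
    {"order_id": "xyz", "kind": "estimate_sent"},
    {"order_id": "xyz", "kind": "deposit_invoice_sent"},
    {"order_id": "xyz", "kind": "parts_shipped"},
]
print(next_step(ledger, "xyz"))  # Check courier for parts delivery
```

Because "Send technician" comes after the delivery check, the technician is not dispatched until the ledger shows the parts have arrived. A second process manager could define its own step list and read the same ledger.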
Accuracy, extensibility, and simplicity
Because the database is ledger-based, it always has the latest transaction and can provide accurate data for any variable.
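As a rough illustration of what "ledger-based" buys you, the sketch below derives the current value of one variable, spare parts on hand, by replaying the ledger. The tuple layout (part ID, quantity change) and the part IDs are assumptions made for this example only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def stock_on_hand(ledger: List[Tuple[str, int]]) -> Dict[str, int]:
    """Replay every inventory event in order and sum the quantity changes."""
    totals: Dict[str, int] = defaultdict(int)
    for part_id, delta in ledger:
        totals[part_id] += delta
    return dict(totals)


# Each entry: (part ID, change in quantity). Entries are only ever appended.
ledger = [("P-100", +5), ("P-200", +2), ("P-100", -1), ("P-100", -1)]

print(stock_on_hand(ledger))      # {'P-100': 3, 'P-200': 2}
print(stock_on_hand(ledger[:2]))  # {'P-100': 5, 'P-200': 2}
```

The second call shows a side benefit: since nothing is overwritten, the same replay over a shorter prefix of the ledger gives the value as of an earlier point in time.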
The Event Adaptors, Business Event Database, and Business Logic Managers can be extended to other business processes easily, inexpensively, and quickly. So far we have skipped over one of the main points of failure for a data warehouse project: the risk of not fully understanding the business requirements and goals.
Without a clear understanding of what the end users need from the data, the project may not meet expectations and may fail to provide value to the organization. An Enterprise Business Framework mitigates this risk by design, since the Business Logic Manager is written in plain language by the business owner, not the programmer.
Time and money
Just how much easier and more efficient is an Enterprise Business Framework approach than a data warehouse?
A data warehouse project is typically a 1 to 2 year project costing $1M or more. The initial setup of the Enterprise Business Framework takes about 6 months and costs a fraction of what a data warehouse does.
All of the Event Adaptors, the Business Event Database, and the Business Logic Managers can be expanded easily and quickly without breaking the system. The maintenance and hosting costs of the framework are a fraction of those of a data warehouse due to its simplicity, reduced queries, and compact data footprint.
In conclusion, the failure rate of data warehouse projects is high, with around 80% of such projects failing to achieve their aims. The recommended approach to ensure success is to start small and involve business users, but there is a fundamental flaw in the way most data warehouse projects are tackled.
The main challenge is getting accurate data from distributed systems, which leads to layering complexity on top of complexity. An Enterprise Business Framework comprising Event Adaptors, a Business Event Database, and Business Logic Managers can solve this problem by providing a single source of truth for all the data needed to make a particular business decision.
By using a ledger-based database, accurate data can be provided for any variable. Moreover, the framework is easily extensible to other business processes and is inexpensive and quick to set up compared to a data warehouse. Additionally, the Business Logic Manager is written in plain language by the business owner, reducing the risk of not fully understanding the business requirements and goals.
In summary, the Enterprise Business Framework approach can mitigate the risk of data warehouse projects failing while being cost-effective, efficient, and easily extensible.