Operational BI can only be successful if data meets quality expectations and is understood by the business community
Operational business intelligence shares many characteristics with traditional BI, but it also differs in many ways, the most dramatic of which is the timeliness of the data acquisition and integration process. Traditional BI can often rely on overnight or intraday batch processing to collect and process the data. To meet operational BI needs, the update cycles frequently demand near-continuous processing of the data and leave no room for a batch window. This has several implications for ensuring data quality, two of which are governance/data stewardship and source data quality.
Governance and Data Stewardship
Best practices for a BI project dictate effective governance structures as well as a robust data stewardship program. While this may be the best practice, many companies have BI programs that deliver value yet do not have adequate governance or stewardship. (I don't condone that approach.) To understand how this can happen, we need to examine the impact of governance and stewardship on both the project and the result. The first of these two impacts applies equally to traditional and operational BI initiatives. The second is more problematic for operational BI projects.
•Absence of effective leadership impacts the project by lengthening the time required to reach an understanding of the data definitions, business rules and quality expectations. While this is painful to the project team, once the agreement is reached, appropriate logic can be developed to correctly bring the data into the data warehouse.
•Even without effective governance and stewardship, once the business rules for migrating data are established, (batch) extract, transform and load processes can be developed to address data quality deficiencies. This is not always the case for operational BI. If the data needs to be loaded on a near real-time basis, error correction logic often cannot be incorporated into the data movement code, as sketched below. There simply isn't time to do the error correction, and often the data required to perform the correction (e.g., reference data) is not available at the same time that the transactions are being processed. To alleviate this problem, the source systems and business processes must be adjusted to prevent the errors from occurring in the data. Such changes are well beyond the scope and authority of the data warehouse team. Strong leadership (i.e., governance and stewardship) is required to determine, implement and enforce whatever changes are needed. Without strong support, the data sources will not be adjusted and the data quality deficiencies will propagate into the operational BI environment.
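To make the distinction concrete, here is a minimal sketch, assuming a hypothetical region-code attribute and in-memory reference data (all names are illustrative, not from any specific system). The batch path can repair a bad value against reference data; the near real-time path has no such opportunity and can only reject the record back toward the source:

```python
from typing import Optional

# Reference data loaded during the nightly batch window (illustrative).
REFERENCE_REGIONS = {"NE": "Northeast", "SW": "Southwest"}

def batch_transform(record: dict) -> dict:
    # Batch ETL: reference data is on hand, so a bad value can be repaired.
    code = str(record.get("region_code", "")).strip().upper()
    record["region"] = REFERENCE_REGIONS.get(code, "UNKNOWN")
    return record

def realtime_ingest(record: dict, reject_queue: list) -> Optional[dict]:
    # Near real time: no time (and possibly no reference data) to repair.
    # The only option is to flag the defect and push it back to the source.
    code = str(record.get("region_code", "")).strip().upper()
    if code not in REFERENCE_REGIONS:
        reject_queue.append(record)  # the source system/process must be fixed
        return None
    record["region"] = REFERENCE_REGIONS[code]
    return record
```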
Source Data Quality
As previously explained, errors in the source data must be addressed at the source for operational BI to succeed. But how do we know the condition of the data?
The condition of the source data is analyzed using data profiling (a.k.a. source data analysis). Data profiling provides a systematic way of examining the source data to identify quality deficiencies, which would either impede the data acquisition and aggregation processes or generate erroneous or misleading BI results. Both strategic and operational BI development methodologies include data profiling. The difference lies in the options that can be pursued.
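As a rough illustration of what data profiling examines, the following sketch computes a few common per-column measures: null rate, distinct count, most frequent values and, optionally, violations of an expected pattern. The column values and the postal-code pattern are assumptions made for the example:

```python
import re
from collections import Counter

def profile_column(values, pattern=None):
    """Minimal column profile: null rate, distinct count, top values,
    and (optionally) how many values violate an expected pattern."""
    total = len(values)
    nulls = sum(1 for v in values if v in (None, ""))
    non_null = [v for v in values if v not in (None, "")]
    profile = {
        "rows": total,
        "null_rate": nulls / total if total else 0.0,
        "distinct": len(set(non_null)),
        "top_values": Counter(non_null).most_common(3),
    }
    if pattern:
        rx = re.compile(pattern)
        profile["pattern_violations"] = sum(
            1 for v in non_null if not rx.fullmatch(str(v))
        )
    return profile

# Example: profiling a postal-code column against a 5-digit expectation.
print(profile_column(["02134", "ABCDE", None, "02134"], pattern=r"\d{5}"))
```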
With traditional BI, errors found in the data can be corrected as part of the ETL process. This is possible due to the nature of the ETL jobs (batch) and their frequency (often daily): for any error detected during data profiling, the project team can opt to correct the data within the ETL process.
With operational BI, there may not be an ETL process at all. Depending on the required data latency, the amount of cleansing logic that can run within the data capture and integration processes is limited. For these applications, at least some of the errors detected during data profiling need to be addressed within the source system environment, and the source system may need to be enhanced to prevent erroneous data from being stored. This requires the data profiling process to include thorough root-cause analysis.
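One way a source system might be enhanced, shown here as a hedged sketch (the order-entry function and the customer-id check are hypothetical), is to reject invalid data at capture time so the defect never reaches the operational BI feed:

```python
class ValidationError(Exception):
    pass

def enter_order(order: dict, valid_customer_ids: set) -> dict:
    # Prevent the defect at the source rather than repairing it downstream:
    # the operator must supply a valid customer id, instead of the data
    # warehouse guessing one later.
    if order.get("customer_id") not in valid_customer_ids:
        raise ValidationError(
            f"unknown customer_id: {order.get('customer_id')!r}"
        )
    return order  # in a real system this record would now be persisted
```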
Success in an operational BI environment requires people to trust the results they receive, and that is only accomplished if the data meets quality expectations and is understood by the business community.
Two of the ways the operational BI environment differs from the traditional BI environment are the heightened criticality of effective governance and data stewardship, and the heightened criticality of the data profiling work.
Business intelligence
Business intelligence (BI) has been described as the process of making better decisions through the use of people, processes, data and related tools and methodologies. The roots of business intelligence are found in relational databases, data warehouses and data marts, which organize historical information so that business analysts can generate reporting that informs executives and senior departmental managers of strategic and tactical trends and opportunities.
In recent years, business intelligence has also come to rely on near real-time operational data found in systems such as enterprise resource planning (ERP), customer relationship management (CRM), supply chain, marketing and other databases. “Operational” BI is meant to equip many more functions in the organization with role-specific dashboards and scorecards and is increasingly tied to the topics of performance management and business process management. Inherent to any form of BI is the notion of data quality: consistent and dependable data, and the processes involved in its creation and maintenance.
Data Quality
If there is a single pitfall that undermines any given data management initiative, it is most likely to be found in the realm of data quality, a requirement for sound decision-making. Whether in combination or by themselves, databases are almost certain to contain entry errors, duplicate entries and other redundancies that inevitably lead to incorrect or incomplete identification of customers, products and locations. Data quality is thus a critical prerequisite to any BI initiative; left unaddressed, these defects skew or obfuscate meaning in the reporting and analytic outputs of databases, reporting tools, dashboards and scorecards.
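For instance, here is a minimal sketch of how naive name normalization can surface duplicate customer entries that reporting would otherwise count as distinct customers (the company names and the suffix list are invented for the example):

```python
import re

# Corporate suffixes to strip during matching (illustrative list).
SUFFIXES = {"corp", "corporation", "inc", "llc", "co"}

def normalize_name(name: str) -> str:
    # Lowercase, drop punctuation, and remove common corporate suffixes.
    tokens = re.sub(r"[^\w\s]", "", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)

customers = ["Acme Corp.", "ACME Corporation", "Globex Inc"]
seen = {}
for c in customers:
    seen.setdefault(normalize_name(c), []).append(c)

duplicates = {k: v for k, v in seen.items() if len(v) > 1}
print(duplicates)  # {'acme': ['Acme Corp.', 'ACME Corporation']}
```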