Data Warehouse Elements to Evaluate
David Heise
14-Oct-98

This document lists some of the elements to consider in a Data Warehouse project. Some are components of the data warehouse itself; others are part of the broader project. They all cost money, so each one needs to be evaluated to decide whether we need it, whether we can afford it, and what the alternatives are. There are suites that include many of the components of a data warehouse, but they are not listed separately in this report. A suite should be evaluated on the strengths of its components, and integration and single-vendor aspects should be included in the overall evaluation.

Contents

1. Conferences
2. Consultants
3. Methodologies
4. Design Tools
5. Metadata Repositories
6. Databases
7. Replication – transfer, cleanse, transform, schedule
8. Ad hoc queries, reports
9. OLAP/Multidimensional data analysis, decision support
10. Data mining/Statistics
11. Decision Analysis
12. Data Mart Suites

Visit Larry Greenfield’s Data Warehousing Information Center
http://www.dwinfocenter.org/
(used to be: http://pwp.starnetinc.com/larryg/index.html)

1. Conferences

    http://www.dwinfocenter.org/confer.html

2. Consultants

    http://www.dwinfocenter.org/consults.html

    Our consultant list

3. Methodologies

    All the wisdom says your chances of success without a methodology are slim. Larry’s page does not have this listed separately. You would either teach yourself a methodology from a book, learn one (superficially) at a seminar, or buy one by hiring a consultant.

4. Design Tools

http://www.dwinfocenter.org/datamod.html

Our design tool list

Features

  1. Must have dimensional modeling intelligence
  2. Must be able to share the metadata repository
  3. Must be able to generate DDL for building the data warehouse
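
To illustrate the third feature, here is a minimal sketch of generating DDL from a dimensional model. The model layout, the table and column names, and the `generate_ddl` helper are all hypothetical and are not taken from any particular design tool. A real tool would also handle indexes, constraints, and vendor-specific syntax.

```python
# Hypothetical dimensional model: two dimensions and one fact table.
# All table and column names are invented for illustration.
MODEL = {
    "dimensions": {
        "dim_date": {"date_key": "INTEGER", "calendar_date": "TEXT", "term": "TEXT"},
        "dim_student": {"student_key": "INTEGER", "name": "TEXT", "major": "TEXT"},
    },
    "fact": {
        "name": "fact_enrollment",
        "measures": {"credit_hours": "REAL", "tuition": "REAL"},
    },
}


def generate_ddl(model):
    """Return CREATE TABLE statements for a star schema, dimensions first."""
    statements = []
    fact_columns = []
    for table, columns in model["dimensions"].items():
        # Assumption: the first column of each dimension is its surrogate key.
        key = next(iter(columns))
        cols = [
            f"{name} {sqltype}" + (" PRIMARY KEY" if name == key else "")
            for name, sqltype in columns.items()
        ]
        statements.append(f"CREATE TABLE {table} ({', '.join(cols)});")
        # The fact table carries one foreign key per dimension.
        fact_columns.append(f"{key} INTEGER REFERENCES {table}({key})")
    fact = model["fact"]
    fact_columns += [f"{name} {sqltype}" for name, sqltype in fact["measures"].items()]
    statements.append(f"CREATE TABLE {fact['name']} ({', '.join(fact_columns)});")
    return statements


if __name__ == "__main__":
    for statement in generate_ddl(MODEL):
        print(statement)
```

The point of the sketch is that the model, not hand-written SQL, is the source of truth, so a change to a dimension is made once and the DDL is regenerated.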

5. Metadata Repositories

http://www.dwinfocenter.org/catalog.html

Features

  1. Should facilitate and accommodate change management
  2. Should be a single, integrated repository, or at least be able to be shared across components
  3. Should be designed for end users, being easy to navigate, and giving them information that is useful and relevant

6. Databases

http://www.dwinfocenter.org/database.html

Features

  1. The engine should have features, such as bit-mapped indexes and a query optimizer, that support star joins, multi-table joins over large tables, etc.
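
As a rough illustration of why bit-mapped indexes suit warehouse queries, the sketch below builds one bitmap per distinct column value and combines two filters with a single bitwise AND. The sample rows and the helper names are hypothetical. Real engines use compressed bitmaps and integrate them with the optimizer, so this shows only the idea.

```python
def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an int whose set bits mark matching row positions."""
    index = {}
    for position, row in enumerate(rows):
        value = row[column]
        index[value] = index.get(value, 0) | (1 << position)
    return index


def matching_rows(rows, bitmap):
    """Return the rows whose position bit is set in `bitmap`."""
    return [row for position, row in enumerate(rows) if (bitmap >> position) & 1]


# Hypothetical fact rows, invented for illustration.
FACTS = [
    {"region": "East", "year": 1997, "sales": 100},
    {"region": "West", "year": 1997, "sales": 150},
    {"region": "East", "year": 1998, "sales": 120},
    {"region": "West", "year": 1998, "sales": 90},
]

region_index = build_bitmap_index(FACTS, "region")
year_index = build_bitmap_index(FACTS, "year")

# Two filters are combined with one bitwise AND, without scanning the rows.
east_1998 = region_index["East"] & year_index[1998]

if __name__ == "__main__":
    print(matching_rows(FACTS, east_1998))
```

Bitmaps work best on low-cardinality columns such as region or year, which is typical of dimension attributes used to filter a star join.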

7. Replication – transfer, cleanse, transform, schedule

http://www.dwinfocenter.org/clean.html

Features

  1. Should be easy to use, yet powerful and flexible, to make efficient and effective use of support staff time
  2. Should include tools for easing the burden of making changes
  3. Should be self-documenting

This is a "buy, don’t build" category. Bill Inmon estimates that 80% of the effort of building a data warehouse is spent on these tasks. With hand-built code, the point is all too soon reached where the developers become maintenance programmers, and further development slows down and ultimately ceases.
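
To give a sense of the work this category covers, here is a toy sketch of a hand-written cleanse-and-transform step. The extract layout, field names, and date formats are hypothetical. Even this small case needs rules for whitespace, case, mixed date formats, and rejects, which hints at why the maintenance burden grows so quickly.

```python
import csv
from datetime import datetime

# Hypothetical extract from a source system, invented for illustration.
RAW_EXTRACT = [
    "id,name,enrolled,state",
    "1, Alice Smith ,14/10/1998,mi",
    "2,BOB JONES,1998-10-15,MI",
    "3,,15/10/1998,in",
]

# Assumption: the source uses one of these two date formats.
DATE_FORMATS = ("%d/%m/%Y", "%Y-%m-%d")


def parse_date(text):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def cleanse(record):
    """Normalize one record; return None if it fails validation."""
    name = record["name"].strip().title()
    enrolled = parse_date(record["enrolled"])
    if not name or enrolled is None:
        return None
    return {
        "id": int(record["id"]),
        "name": name,
        "enrolled": enrolled,
        "state": record["state"].strip().upper(),
    }


def run(lines):
    """Split the extract into rows ready to load and rows rejected for review."""
    loaded, rejected = [], []
    for record in csv.DictReader(lines):
        clean = cleanse(record)
        if clean is None:
            rejected.append(dict(record))
        else:
            loaded.append(clean)
    return loaded, rejected


if __name__ == "__main__":
    loaded, rejected = run(RAW_EXTRACT)
    print(loaded)
    print(rejected)
```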

8. Ad hoc queries, reports

http://www.dwinfocenter.org/query.html

Features

  1. Support staff should be able to group functionally related items together in catalogs.
  2. The tool should support a range of sophistication and requirements in users.
  3. There should be a web interface
  4. Report definitions should be able to be saved for later re-use
  5. Report definitions should be able to be shared easily, including those developed by IT for use by others
  6. There should be administration features that facilitate the deployment of updated catalogs and report definitions

9. OLAP/Multidimensional data analysis, decision support

http://www.dwinfocenter.org/olap.html

Features

  1. All of the features of reporting tools also apply here
  2. Data analysis models, such as for retention, market segment analysis, etc., can be developed by IT, and set up with automated extraction/refresh and distribution/notification.
  3. Has an intuitive user interface that facilitates data visualization and presents the data in business model terms rather than transaction-oriented terms.
  4. Supports a full range of data analysis approaches (drill-down, drill-through, slice and dice, pivot, what-if, etc.). Allows the user to search for answers to the "why" question.
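
The analysis operations named in item 4 can be sketched in a few lines. The sample rows, dimension names, and helper functions below are hypothetical. An OLAP tool performs the same operations interactively and over far larger data.

```python
from collections import defaultdict

# Hypothetical fact rows, invented for illustration.
SALES = [
    {"year": 1997, "region": "East", "product": "Books", "amount": 100},
    {"year": 1997, "region": "West", "product": "Books", "amount": 150},
    {"year": 1998, "region": "East", "product": "Books", "amount": 120},
    {"year": 1998, "region": "East", "product": "Music", "amount": 80},
    {"year": 1998, "region": "West", "product": "Music", "amount": 90},
]


def slice_rows(rows, **fixed):
    """Slice: keep only rows where each named dimension has the given value."""
    return [r for r in rows if all(r[d] == v for d, v in fixed.items())]


def roll_up(rows, *dimensions, measure="amount"):
    """Total the measure by the given dimensions; adding a dimension drills down."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dimensions)] += r[measure]
    return dict(totals)


def pivot(rows, row_dim, col_dim, measure="amount"):
    """Pivot: arrange totals as a table with one dimension down and one across."""
    table = defaultdict(dict)
    for (row_value, col_value), total in roll_up(rows, row_dim, col_dim, measure=measure).items():
        table[row_value][col_value] = total
    return dict(table)


if __name__ == "__main__":
    print(roll_up(SALES, "year"))                 # summary level
    print(roll_up(SALES, "year", "region"))       # drill down by region
    print(slice_rows(SALES, product="Music"))     # slice on one product
    print(pivot(SALES, "region", "year"))         # regions down, years across
```

Starting from the yearly totals and drilling down by region is the kind of step that leads a user from "what happened" toward "why".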

10. Data mining/Statistics

http://www.dwinfocenter.org/datamine.html

Larry Greenfield lists data mining tools by category on the page above.

He also has a separate listing of statistical tools:
http://pwp.starnetinc.com/larryg/statisti.html

11. Decision Analysis

http://pwp.starnetinc.com/larryg/decision.html

"Someone working with a decision analysis tool decomposes a problem into multiple series of decisions that could be made. Ultimately, the aim of these tools is to tell the user the probability of different outcomes and the expected cost of the outcomes."

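The calculation described in the quotation can be sketched as an expected-cost comparison. The options, probabilities, and costs below are made up purely for illustration and do not reflect any real assessment. Decision analysis tools handle multi-stage decision trees, while this sketch covers a single decision.

```python
# Hypothetical decision: each option leads to outcomes given as (probability, cost).
# All figures are invented for illustration.
OPTIONS = {
    "option A": [(0.5, 200_000), (0.5, 500_000)],
    "option B": [(0.75, 300_000), (0.25, 400_000)],
}


def expected_cost(outcomes):
    """Return the probability-weighted cost of one option."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * cost for p, cost in outcomes)


def best_option(options):
    """Return the name of the option with the lowest expected cost."""
    return min(options, key=lambda name: expected_cost(options[name]))


if __name__ == "__main__":
    for name, outcomes in OPTIONS.items():
        print(name, expected_cost(outcomes))
    print("lowest expected cost:", best_option(OPTIONS))
```
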
12. Data Mart Suites

Our list of suites

Suites include all the basic components for building and using a data mart.  There may or may not be an adequate level of integration between the various components.  In particular, a single integrated metadata repository is still not available.


Last Modified: Monday, September 2, 2002 11:52 AM