A data mart is a
simple form of a data warehouse that is focused on a single subject (or
functional area), such as Sales, Finance, or Marketing. Data marts are
often built and controlled by a single department within an
organization. Given their single-subject focus, data marts usually draw
data from only a few sources. The sources could be internal operational
systems, a central data warehouse, or external data.
Another definition of a data mart:
A data mart is a subject-oriented block of data in the data
warehouse that serves a single business line, such as production, sales,
or marketing. There are two kinds of data marts: independent and
dependent.
What is Metadata?
Metadata is information about the data. For a data mart, metadata includes:
01. A description of the data in business terms
02. Format and definition of the data in system terms
03. Data sources and frequency of refreshing data
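As a rough illustration, the three kinds of metadata listed above could be recorded per column in a small structure. This is only a sketch: the class name, field names, and the example `sales_amount` entry are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    """Metadata for one data mart column (all field names are illustrative)."""

    business_description: str  # 01. what the data means in business terms
    data_type: str             # 02. format and definition in system terms
    source_system: str         # 03. where the data comes from
    refresh_frequency: str     # 03. how often the data is refreshed


# Hypothetical entry for a column in a Sales data mart.
sales_amount = ColumnMetadata(
    business_description="Net value of an order line after discounts",
    data_type="DECIMAL(12,2)",
    source_system="order_entry",
    refresh_frequency="daily",
)


def describe(name: str, meta: ColumnMetadata) -> str:
    """Render one metadata entry as a single readable line."""
    return (
        f"{name}: {meta.business_description} "
        f"[{meta.data_type}; source={meta.source_system}; "
        f"refreshed {meta.refresh_frequency}]"
    )


print(describe("sales_amount", sales_amount))
```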
How Is It Different from a Data Warehouse?
A
data warehouse, unlike a data mart, deals with multiple subject areas
and is typically implemented and controlled by a central organizational
unit such as the corporate Information Technology (IT) group. Often, it
is called a central or enterprise data warehouse. Typically, a data
warehouse assembles data from multiple source systems.
Dependent and Independent Data Marts:
There
are two basic types of data marts: dependent and independent. The
categorization is based primarily on the data source that feeds the data
mart. Dependent data marts draw data from a central data warehouse that
has already been created. Independent data marts, in contrast, are
standalone systems built by drawing data directly from operational or
external sources of data, or both.
The
main difference between independent and dependent data marts is how you
populate the data mart; that is, how you get data out of the sources
and into the data mart. This step, called the
Extraction, Transformation, and Loading (ETL) process, involves moving
data from operational systems, filtering it, and loading it into the
data mart.
With dependent
data marts, this process is somewhat simplified because formatted and
summarized (clean) data has already been loaded into the central data
warehouse. The ETL process for dependent data marts is mostly a process
of identifying the right subset of data relevant to the chosen data mart
subject and moving a copy of it, perhaps in a summarized form.
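The dependent case can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, assuming a single in-memory database stands in for both the warehouse and the data mart; the table names `warehouse_orders` and `sales_mart_by_region`, the columns, and the sample rows are all made up for the example.

```python
import sqlite3

# One in-memory database stands in for both warehouse and data mart (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table that already holds clean, formatted data.
conn.execute(
    "CREATE TABLE warehouse_orders "
    "(order_id INTEGER, business_line TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_orders VALUES (?, ?, ?, ?)",
    [
        (1, "sales", "east", 100.0),
        (2, "sales", "east", 50.0),
        (3, "sales", "west", 75.0),
        (4, "finance", "east", 999.0),  # not relevant to the Sales data mart
    ],
)

# Dependent-mart ETL: select the relevant subset and copy it in summarized form.
conn.execute(
    """
    CREATE TABLE sales_mart_by_region AS
    SELECT region, SUM(amount) AS total_amount, COUNT(*) AS order_count
    FROM warehouse_orders
    WHERE business_line = 'sales'
    GROUP BY region
    """
)

rows = conn.execute(
    "SELECT region, total_amount, order_count "
    "FROM sales_mart_by_region ORDER BY region"
).fetchall()
print(rows)
```

No cleansing logic appears here because, in the dependent case, that work was already done when the warehouse was loaded.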
With
independent data marts, however, you must deal with all aspects of the
ETL process, much as you do with a central data warehouse. The data
mart is likely to have fewer sources and less data than the warehouse,
though, given its focus on a single subject.
Steps for Implementing a Data Mart:
The steps are:
Designing
Constructing
Populating
Accessing
Managing
Designing:
The
design step is first in the data mart process. This step covers all of
the tasks from initiating the request for a data mart through gathering
information about the requirements, and developing the logical and
physical design of the data mart.
The design step involves the following tasks:
01. Gathering the business and technical requirements.
02. Identifying data sources.
03. Selecting the appropriate subset of data.
04. Designing the logical and physical structure of the data mart.
Constructing:
This step includes creating the physical database and the logical structures associated with the data mart to provide fast and efficient access to the data.
This step involves the following tasks:
01. Creating the physical database and storage structures, such as tablespaces, associated with the data mart.
02. Creating the schema objects, such as tables and indexes defined in the design step.
03. Determining how best to set up the tables and the access structures.
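The construction tasks above might look like the following sketch, again using Python's built-in `sqlite3` module. SQLite has no tablespaces, so an in-memory database stands in for the physical storage step. The table and index names (`dim_product`, `fact_sales`, `idx_fact_sales_product`) are hypothetical and assume a simple fact/dimension layout was chosen during design.

```python
import sqlite3

# 01. Create the physical database (SQLite has no tablespaces; this is a stand-in).
conn = sqlite3.connect(":memory:")

# 02. Create the schema objects defined in the design step (names are illustrative).
conn.executescript(
    """
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        sale_date  TEXT NOT NULL,
        amount     REAL NOT NULL
    );
    -- 03. Access structure: index the column that queries will join and filter on.
    CREATE INDEX idx_fact_sales_product ON fact_sales(product_id);
    """
)

# List what was created by reading SQLite's catalog table.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master "
    "WHERE name NOT LIKE 'sqlite_%' ORDER BY name"
).fetchall()
print(objects)
```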
Populating:
The
populating step covers all of the tasks related to getting the data
from the source, cleaning it up, modifying it to the right format and
level of detail, and moving it into the data mart.
The populating step involves the following tasks:
01. Mapping data sources to target data structures.
02. Extracting data.
03. Cleansing and transforming the data.
04. Loading data into the data mart.
05. Creating and storing metadata.
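The five populating tasks can be sketched end to end in a few lines of Python. This is an illustration under stated assumptions, not a production ETL job: the source rows, the `FIELD_MAP` mapping, the `transform` helper, the `sales` table, and the `order_entry` source name are all hypothetical.

```python
import sqlite3

# 02. Extract: hypothetical rows pulled from an operational system.
source_rows = [
    {"cust": " alice ", "amt": "100.50", "dt": "2024-01-05"},
    {"cust": "Bob", "amt": "not-a-number", "dt": "2024-01-06"},  # will be rejected
    {"cust": "carol", "amt": "20", "dt": "2024-01-07"},
]

# 01. Map source fields to target columns.
FIELD_MAP = {"cust": "customer_name", "amt": "amount", "dt": "sale_date"}


def transform(row):
    """03. Cleanse and transform one row; return None if it is unusable."""
    mapped = {FIELD_MAP[key]: value for key, value in row.items()}
    try:
        mapped["amount"] = float(mapped["amount"])
    except ValueError:
        return None  # reject rows whose amount is not numeric
    mapped["customer_name"] = mapped["customer_name"].strip().title()
    return mapped


clean_rows = [r for r in map(transform, source_rows) if r is not None]

# 04. Load the cleaned rows into the data mart table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_name TEXT, amount REAL, sale_date TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (:customer_name, :amount, :sale_date)", clean_rows
)

# 05. Create metadata describing this load.
load_metadata = {
    "source": "order_entry",
    "rows_read": len(source_rows),
    "rows_loaded": len(clean_rows),
    "rows_rejected": len(source_rows) - len(clean_rows),
}

loaded = conn.execute(
    "SELECT customer_name, amount FROM sales ORDER BY customer_name"
).fetchall()
print(loaded, load_metadata)
```

Rejecting bad rows outright is only one possible policy; a real load might route them to an error table for review instead.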
Accessing:
The
accessing step involves putting the data to use: querying the data,
analyzing it, creating reports, charts, and graphs, and publishing
these. Typically, the end user uses a graphical front-end tool to submit
queries to the database and display the results of the queries.
The accessing step requires that you perform the following tasks:
01. Set up an intermediate layer for the front-end tool to use. This layer, the metalayer, translates database structures and object names into business terms, so that the end user can interact with the data mart using terms that relate to the business function.
02. Maintain and manage these business interfaces.
03. Set up and manage database structures, such as summary tables, that help queries submitted through the front-end tool execute quickly and efficiently.
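To make the metalayer idea concrete, here is a minimal sketch in which a dictionary translates business terms into physical column names before a query is run. The `METALAYER` mapping, the `build_query` helper, and the cryptic names `sls_fact`, `rgn_cd`, and `sls_amt` are invented for the example; real front-end tools implement this layer in their own ways.

```python
import sqlite3

# Hypothetical metalayer: business terms mapped to physical SQL expressions.
METALAYER = {
    "Region": "rgn_cd",
    "Total Sales": "SUM(sls_amt)",
}


def build_query(dimension, measure):
    """Translate two business terms into SQL against the sls_fact table.

    Terms are looked up in the fixed METALAYER dictionary, so no
    user-supplied text is interpolated into the SQL string.
    """
    dim_col = METALAYER[dimension]
    measure_expr = METALAYER[measure]
    return (
        f'SELECT {dim_col} AS "{dimension}", {measure_expr} AS "{measure}" '
        f"FROM sls_fact GROUP BY {dim_col} ORDER BY {dim_col}"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sls_fact (rgn_cd TEXT, sls_amt REAL)")
conn.executemany(
    "INSERT INTO sls_fact VALUES (?, ?)",
    [("east", 10.0), ("east", 5.0), ("west", 7.0)],
)

cursor = conn.execute(build_query("Region", "Total Sales"))
headers = [col[0] for col in cursor.description]  # business terms, not column codes
results = cursor.fetchall()
print(headers, results)
```

The end user sees only "Region" and "Total Sales"; the physical names stay hidden behind the mapping.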
Managing:
This step involves managing the data mart over its lifetime. In this step, you perform management tasks such as the following:
01. Providing secure access to the data.
02. Managing the growth of the data.
03. Optimizing the system for better performance.
04. Ensuring the availability of data even with system failures.