ETL (Extract, Transform, Load) is the process of collecting and preparing data from operational sources for a Data Warehouse. It consists of extracting the data, transforming it, and loading it, along with any other processing performed before the data is published in the Data Warehouse. The purpose of ETL is to collect, filter, manipulate, and combine the relevant data from various sources for storage in the Data Warehouse. ETL can also be used to integrate data with existing systems.
The result of the ETL process is data that meets the criteria of a Data Warehouse: it is historical, integrated, summarized, static, and structured for the purposes of analysis.
Stages of the ETL process
The ETL process consists of three stages: extract, transform, and load.
Extract
The first step of the ETL process is extraction: pulling data from one or more operational systems that serve as data sources. The data can come from the organization's OLTP systems, but it can also come from sources outside the system's database. Most Data Warehouse projects combine data from several different sources. In essence, extraction also involves parsing and cleaning the extracted data so that it has the desired pattern or structure.
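The extraction step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `customers` table, its columns, the CSV text, and the helper names `extract_from_oltp` and `extract_from_csv` are all hypothetical, and an in-memory SQLite database stands in for the operational system.

```python
import csv
import io
import sqlite3


def extract_from_oltp(conn):
    """Pull customer rows from a (hypothetical) operational database."""
    cur = conn.execute("SELECT id, name, gender FROM customers ORDER BY id")
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def extract_from_csv(text):
    """Pull rows from a source outside the database, here CSV text."""
    return list(csv.DictReader(io.StringIO(text)))


# Stand-in for the operational (OLTP) system.
oltp = sqlite3.connect(":memory:")
oltp.execute("CREATE TABLE customers (id INTEGER, name TEXT, gender INTEGER)")
oltp.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ana", 2), (2, "Budi", 1)],
)

# Stand-in for a data source outside the system database.
external_csv = "id,name,gender\n3,Citra,2\n"

# Combine rows from both sources into one extracted data set.
extracted = extract_from_oltp(oltp) + extract_from_csv(external_csv)
```

Note that the CSV source yields every value as a string while the database yields integers. Differences like this between sources are what the later stages have to reconcile.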
Transform
The transform stage cleans up the data taken in the extract stage so that it matches the structure of the Data Warehouse or Data Mart. Operations commonly performed at the transformation stage include the following.
- Select only certain columns to load into the Data Warehouse.
- Translate coded values. For example, the source database stores 1 for male and 2 for female, but the Data Warehouse stores M for male and F for female. This is called automated data cleansing; no manual cleanup takes place during the ETL process.
- Encode free-form values into standard codes (for example, mapping "Male" to M).
- Calculate new values.
- Combine data from different sources.
- Summarize sets of data rows.
The difficulty in the transformation process is that data merged from several separate systems must be cleaned so that it is consistent, and should be aggregated to speed up analysis.
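Several of the transformations listed above (selecting columns, translating coded values, calculating new values, and summarizing rows) can be sketched in Python. The field names (`quantity`, `unit_price`, `note`), the sample rows, and the function names are assumptions made for illustration; only the 1/2 to M/F mapping comes from the text.

```python
from collections import Counter

# Source system code -> Data Warehouse code (1 = male, 2 = female).
GENDER_CODES = {"1": "M", "2": "F"}


def transform(rows):
    """Select columns, translate coded values, and calculate a new value."""
    cleaned = []
    for row in rows:
        cleaned.append({
            # Keep only the columns the warehouse needs; "note" is dropped.
            "id": int(row["id"]),
            # str() lets the lookup accept both 1 and "1" from mixed sources.
            "gender": GENDER_CODES[str(row["gender"])],
            # Calculated value that does not exist in the source.
            "total": row["quantity"] * row["unit_price"],
        })
    return cleaned


def summarize_by_gender(rows):
    """Aggregate a set of rows into one total per gender code."""
    totals = Counter()
    for row in rows:
        totals[row["gender"]] += row["total"]
    return dict(totals)


source_rows = [
    {"id": 1, "gender": 1, "quantity": 2, "unit_price": 10, "note": "x"},
    {"id": "2", "gender": "2", "quantity": 1, "unit_price": 5, "note": "y"},
    {"id": 3, "gender": 2, "quantity": 3, "unit_price": 5, "note": "z"},
]

transformed = transform(source_rows)
summary = summarize_by_gender(transformed)
```

An unknown gender code raises a `KeyError` in this sketch. A real pipeline would more likely route such rows to a reject list for review.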
Load
The load phase of ETL inserts the data into the end target, i.e. the Data Warehouse. How often and to what extent existing data is replaced or supplemented depends on the design of the Data Warehouse, which is decided when the information needs are analyzed. Because the load phase interacts with a database, the constraints defined in the database schema, as well as any triggers activated when the data is loaded, also apply (for example integrity, uniqueness, and mandatory fields). These checks contribute to the overall quality of the data produced by the ETL process.
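To illustrate how schema constraints act during the load phase, here is a minimal sketch using Python's built-in `sqlite3` module. The `dim_customer` table, its columns, and the `load` function are hypothetical, and an in-memory SQLite database stands in for the Data Warehouse.

```python
import sqlite3

# Stand-in for the Data Warehouse, with constraints declared in the schema.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("""
    CREATE TABLE dim_customer (
        id     INTEGER PRIMARY KEY,                        -- uniqueness
        name   TEXT NOT NULL,                              -- mandatory field
        gender TEXT NOT NULL CHECK (gender IN ('M', 'F'))  -- integrity
    )
""")


def load(conn, rows):
    """Insert rows one by one; collect the rows the schema rejects."""
    loaded, rejected = 0, []
    for row in rows:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO dim_customer (id, name, gender) "
                    "VALUES (:id, :name, :gender)",
                    row,
                )
            loaded += 1
        except sqlite3.IntegrityError as exc:
            rejected.append((row, str(exc)))
    return loaded, rejected


rows = [
    {"id": 1, "name": "Ana", "gender": "F"},
    {"id": 1, "name": "Ana again", "gender": "F"},  # duplicate key
    {"id": 2, "name": None, "gender": "M"},         # missing mandatory field
    {"id": 3, "name": "Budi", "gender": "M"},
]

loaded, rejected = load(warehouse, rows)
```

Here the database itself refuses the rows that break uniqueness or a mandatory field, so the loaded table stays consistent even if the transform stage let a bad row through.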