Lesson 1: Creating the Project and Basic Package
In this lesson, you will create a simple ETL package that extracts data from a single flat file source, transforms the data using two lookup transformation components, and writes that data to the FactCurrencyRate fact table in AdventureWorksDW. As part of this lesson, you will learn how to create new packages, add and configure data source and destination connections, and work with new control flow and data flow components.
Important
This tutorial requires the AdventureWorksDW sample database. For more information on installing and deploying AdventureWorksDW, see Getting Started with SQL Server Samples and Sample Databases.
Understanding the Package Requirements
Before creating a package, you need a good understanding of the formatting used in both the source data and the destination. Once you understand both of these data formats, you will be ready to define the transformations necessary to map the source data to the destination.
Looking at the Source
For this tutorial, the source data is a set of historical currency data contained in the flat file, SampleCurrencyData.txt. The source data has the following four columns: the average rate of the currency, a currency key, a date key, and the end-of-day rate.
Here is an example of the source data contained in the SampleCurrencyData.txt file:
1.00010001 ARS 9/3/2001 0:00 0.99960016
1.00010001 ARS 9/4/2001 0:00 1.001001001
1.00020004 ARS 9/5/2001 0:00 0.99990001
1.00020004 ARS 9/6/2001 0:00 1.00040016
1.00050025 ARS 9/7/2001 0:00 0.99990001
1.00050025 ARS 9/8/2001 0:00 1.001001001
1.00050025 ARS 9/9/2001 0:00 1
1.00010001 ARS 9/10/2001 0:00 1.00040016
1.00020004 ARS 9/11/2001 0:00 0.99990001
1.00020004 ARS 9/12/2001 0:00 1.001101211
When working with flat file source data, it is important to understand how the Flat File connection manager interprets the flat file data. If the flat file source is Unicode, the Flat File connection manager defines all columns as [DT_WSTR] with a default column width of 50. If the flat file source is ANSI-encoded, the columns are defined as [DT_STR] with a column width of 50. You will probably have to change these defaults to make the column types more appropriate for your data. To do this, look at the data types of the destination columns to which the data will be written, and then choose the matching types within the Flat File connection manager.
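As a rough illustration of why the string defaults need changing, the following Python sketch (outside of SSIS, and not part of the tutorial's package) reads two of the sample rows as plain strings and converts each column to a more specific type. It assumes the file is tab-delimited and that dates are in month/day/year order; the names `SAMPLE` and `parse_row` are hypothetical.

```python
from datetime import datetime

# Two rows from the sample data, assumed here to be tab-delimited.
SAMPLE = (
    "1.00010001\tARS\t9/3/2001 0:00\t0.99960016\n"
    "1.00010001\tARS\t9/4/2001 0:00\t1.001001001\n"
)


def parse_row(line):
    """Split one line into four strings, then convert each to a typed value."""
    average_rate, currency_id, date_text, end_of_day_rate = line.split("\t")
    return (
        float(average_rate),                             # destination type: float
        currency_id,                                     # three-character currency code
        datetime.strptime(date_text, "%m/%d/%Y %H:%M"),  # assumed month/day/year order
        float(end_of_day_rate),                          # destination type: float
    )


rows = [parse_row(line) for line in SAMPLE.splitlines()]
```

Every value starts out as text, which is the same position the Flat File connection manager is in before you adjust its column types.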
Looking at the Destination
The ultimate destination for the source data is the FactCurrencyRate fact table in AdventureWorksDW. The FactCurrencyRate fact table has four columns, and has relationships to two dimension tables, as shown in the following table.
Column Name | Data Type | Lookup Table | Lookup Column |
---|---|---|---|
AverageRate | float | None | None |
CurrencyKey | int (FK) | DimCurrency | CurrencyKey (PK) |
TimeKey | int (FK) | DimTime | TimeKey (PK) |
EndOfDayRate | float | None | None |
Mapping Source Data to be Compatible with the Destination
Analysis of the source and destination data formats indicates that lookups will be necessary for the CurrencyKey and TimeKey values. The transformations that perform these lookups obtain the CurrencyKey and TimeKey values by matching the source data against the alternate keys in the DimCurrency and DimTime dimension tables. The following table shows how each flat file column maps to a table and column in the database.
Flat File Column | Table Name | Column Name | Data Type |
---|---|---|---|
0 | FactCurrencyRate | AverageRate | float |
1 | DimCurrency | CurrencyAlternateKey | nchar (3) |
2 | DimTime | FullDateAlternateKey | datetime |
3 | FactCurrencyRate | EndOfDayRate | float |
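The idea behind the two lookups can be sketched in a few lines of Python. This is only a conceptual model, not how SSIS implements the Lookup transformation: the dictionaries stand in for the DimCurrency and DimTime tables, and the integer key values in them are invented for illustration rather than taken from AdventureWorksDW. The names `dim_currency`, `dim_time`, and `to_fact_row` are hypothetical.

```python
from datetime import datetime

# Hypothetical stand-ins for the dimension tables: alternate key -> surrogate key.
# The integer keys are made up for this example.
dim_currency = {"ARS": 3}                 # CurrencyAlternateKey -> CurrencyKey
dim_time = {datetime(2001, 9, 3): 793}    # FullDateAlternateKey -> TimeKey


def to_fact_row(average_rate, currency_alt_key, full_date, end_of_day_rate):
    """Swap the two alternate keys for surrogate keys; pass the rates through."""
    return {
        "AverageRate": average_rate,
        "CurrencyKey": dim_currency[currency_alt_key],
        "TimeKey": dim_time[full_date],
        "EndOfDayRate": end_of_day_rate,
    }


fact_row = to_fact_row(1.00010001, "ARS", datetime(2001, 9, 3), 0.99960016)
```

In this sketch, a source value with no matching dimension row raises a `KeyError`, so the data types and values of the source columns must line up with the alternate key columns for the lookups to succeed.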
Lesson Tasks
This lesson contains the following tasks: