Thursday, September 26, 2013

Start/Stop BI Services in Unix


Below are the Steps to Start and Stop BI Services in Linux..

Stop BI Services:

1.
cd /obi/MiddlewareHome/instances/instance1/bin

./opmnctl stopall

2.
cd /obi/MiddlewareHome/user_projects/domains/bifoundation_domain/bin

./stopManagedWebLogic.sh bi_server1 t3://tsgvm00046.newbreed.com:7001 < weblogic_username>  < weblogic_password>

3.
cd /obi/MiddlewareHome/user_projects/domains/bifoundation_domain/bin

./stopWebLogic.sh

4.
ps –ef |grep node

Kill –9 <NodeManager ID>

5.
ps –ef |grep node

Kill -9 <weblogic ID



Start BI Services:

1.
cd /obi/MiddlewareHome/wlserver_10.3/server/bin

Nohup sh ./startNodeManager.sh &

Note: NodeManager must be running in 9556 port

2.
cd /obi/MiddlewareHome/user_projects/domains/bifoundation_domain/bin

nohup sh ./startWebLogic.sh -Dweblogic.management.username=<weblogic_username> -Dweblogic.management.password=<weblogic_password> > wls_start.log &

3.
cd /obi/MiddlewareHome/user_projects/domains/bifoundation_domain/bin

./startManagedWebLogic.sh bi_server1 http:// tsgvm00046.newbreed.com:7001

4.
cd /obi/MiddlewareHome/instances/instance1/bin

./opmnctl startall

Sunday, September 22, 2013

Slowly Changing Dimension


Slowly Changing Dimension (SCD)

This applies to cases where the attribute for a record varies over time. We give an example below:
Christina is a customer with ABC Inc. She first lived in Chicago, Illinois. So, the original entry in the customer lookup table has the following record:
Customer Key
Name
State
1001
Christina
Illinois
At a later date, she moved to Los Angeles, California on January, 2003. How should ABC Inc. now modify its customer table to reflect this change? This is the "Slowly Changing Dimension" problem.
There are in general three ways to solve this type of problem, and they are categorized as follows:
Type 1: The new record replaces the original record. No trace of the old record exists.
Type 2: A new record is added into the customer dimension table. Therefore, the customer is treated essentially as two people.
Type 3: The original record is modified to reflect the change.
We next take a look at each of the scenarios and how the data model and the data looks like for each of them. Finally, we compare and contrast among the three alternatives.

Type 1 Slowly Changing Dimension
In Type 1 Slowly Changing Dimension, the new information simply overwrites the original information. In other words, no history is kept.
In our example, recall we originally have the following table:
Customer Key
Name
State
1001
Christina
Illinois
After Christina moved from Illinois to California, the new information replaces the new record, and we have the following table:
Customer Key
Name
State
1001
Christina
California
Advantages:
- This is the easiest way to handle the Slowly Changing Dimension problem, since there is no need to keep track of the old information.
Disadvantages:
- All history is lost. By applying this methodology, it is not possible to trace back in history. For example, in this case, the company would not be able to know that Christina lived in Illinois before.
Usage:
About 50% of the time.
When to use Type 1:
Type 1 slowly changing dimension should be used when it is not necessary for the data warehouse to keep track of historical changes.

Type 2 Slowly Changing Dimension
In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key.
In our example, recall we originally have the following table:
Customer Key
Name
State
1001
Christina
Illinois
After Christina moved from Illinois to California, we add the new information as a new row into the table:
Customer Key
Name
State
1001
Christina
Illinois
1005
Christina
California
Advantages:
- This allows us to accurately keep all historical information.
Disadvantages:
- This will cause the size of the table to grow fast. In cases where the number of rows for the table is very high to start with, storage and performance can become a concern.
- This necessarily complicates the ETL process.
Usage:
About 50% of the time.
When to use Type 2:
Type 2 slowly changing dimension should be used when it is necessary for the data warehouse to track historical changes.

Type 3 Slowly Changing Dimension
In Type 3 Slowly Changing Dimension, there will be two columns to indicate the particular attribute of interest, one indicating the original value, and one indicating the current value. There will also be a column that indicates when the current value becomes active.
In our example, recall we originally have the following table:
Customer Key
Name
State
1001
Christina
Illinois
To accommodate Type 3 Slowly Changing Dimension, we will now have the following columns:
·         Customer  Key
·         Name
·         Original State
·         Current State
·         Effective Date
After Christina moved from Illinois to California, the original information gets updated, and we have the following table (assuming the effective date of change is January 15, 2003):
Customer Key
Name
Original State
Current State
Effective Date
1001
Christina
Illinois
California
15-JAN-2003
Advantages:
- This does not increase the size of the table, since new information is updated.
- This allows us to keep some part of history.
Disadvantages:
- Type 3 will not be able to keep all history where an attribute is changed more than once. For example, if Christina later moves to Texas on December 15, 2003, the California information will be lost.
Usage:
Type 3 is rarely used in actual practice.
When to use Type 3:

Type III slowly changing dimension should only be used when it is necessary for the data warehouse to track historical changes, and when such changes will only occur for a finite number of time. 

Monday, September 16, 2013

Surrogate Key


Surrogate key is a substitution for the natural primary key. It is just a unique identifier or number for each row that can be used for the primary key to the table. The only requirement for a surrogate primary key is that it is unique for each row in the table. 

Data warehouses typically use a surrogate, (also known as artificial or identity key), key for the dimension tables primary keys. They can use sequence generator, or Oracle sequence, or SQL Server Identity values for the surrogate key.

Reasons for choosing Surrogate key (SK) are:

•If we replace the Natural Key (NK) with a single Integer, it should be able to save a substantial amount of storage space. The SKs of different Dimensions would be stored as Foreign Keys (FK) in the Fact tables to maintain Referential Integrity (RI), and here instead of storing of those big or huge NKs, storing of concise SKs would result in less amount of space needed. The UNIQUE indexes built on the SK will take less space than the UNIQUE index built on the NK which may be alphanumeric.

•Replacing big, ugly NKs and composite keys with beautiful, tight integer SKs is bound to improve join performance, since joining two Integer columns works faster. So, it provides an extra edge in the ETL performance by fastening data retrieval and lookup.

•Advantage of a four-byte integer key is that it can represent more than 2 billion different values, which would be enough for any dimension and SK would not run out of values, not even for the Big or Monster Dimension.

•SK is usually independent of the data contained in the record, we cannot understand anything about the data in a record simply by seeing only the SK. Hence it provides Data Abstraction.
Historical Data: SK helps in maintaining historical data. Suppose there us a department table in OLTP system, which has dept_id and bu_id as NK (natural primary key or composite key) and lets sat there is department number as primary key in OLTP.

If we do not create SK when this department data is populated through ETL into DW and department number is used in DW also as primary key. Then let’s say if a particular department is moved from one BU1 to another BU2 in OLTP system then the records in fact tables in DW will show as they are in department which is in BU2 even historical records will appear as if they were always in BU2.

But if we maintain a SK in DW then when department belonged to BU1 then leys say the SK assigned was Dept_SK1. And fact table also has Dept_SK1 as foreign key; now when department is assigned to BU2 in OLTP and when this comes to DW this is treated as new department and new SK is created, like Dept_SK2 (for same department and different BU i.e. BU2); And fact table will have new foreign key Dept_SK2. So fact table will have records for Dept_SK1 and Dept_SK2.