Wednesday, May 2, 2018

DBSync Cloud Replication Tool for Salesforce






The DBSync Cloud Replication for Salesforce tool synchronizes data between Salesforce and a database. Salesforce is in the cloud; the database can be in the cloud or on premises.




Some of the main applications of this tool are
·         Building a data warehouse for reporting or data mining
·         Data migration
·         Offline data cleansing, with optional subsequent synchronization (we can bring the data from Salesforce to an offsite database, cleanse it and/or perform manipulations, and load it back to Salesforce)

DBSync Cloud Replication can be invoked from
·         Web interface
·         Command line
·         Shell
·         Windows Scheduler
·         Any ETL tool




DBSync Features:

The application creates objects in the target database that do not already exist.
Objects can be replicated in batch or in real time.
Objects in Salesforce can be updated from the data warehouse (bi-directional).


Monday, February 5, 2018

Hive


I am a newbie in Hive and Big Data. Here are some basic Hive commands for quick reference-



--================Show Databases===============
hive> show databases;
--================Create Database==============
hive> create database hive_db;
--================Drop Database================
hive> drop database hive_db;

--================Create Table=================
hive> create table emp_hive (empid int, empname string);

hive> show tables;
OK
emp_hive
Time taken: 0.115 seconds, Fetched: 1 row(s)
--================Rename A Table===============

hive> ALTER table emp_hive RENAME to emp_hive_New;

hive> show tables;
OK
emp_hive_New
--================Drop table======================
hive> drop table emp_hive_New;

--================Table types and usage=================
Hive has two types of tables, internal (managed) and external, depending on how the data is loaded and how the schema is designed.

Internal Tables:
- First we create the table, then load the data.
- This is "data on schema".
- Dropping this table removes both the data and the schema.
- The table is stored under /user/hive/warehouse.

When to choose an Internal Table ->
- If the processing data is available in the local file system
- If we want Hive to manage the complete lifecycle of the data, including deletion

Create Internal Table:
hive> CREATE TABLE hive_internal_table (id INT, name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';

Load the data into the internal table
hive> LOAD DATA INPATH '/user/arora_hive/data.txt' INTO TABLE hive_internal_table;
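
For reference, data.txt should be tab-separated to match the FIELDS TERMINATED BY '\t' clause above; a couple of made-up rows (columns separated by a tab):

1	Alice
2	Bob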

Display the content of the table

hive> select * from hive_internal_table;

Drop the internal table
hive> DROP TABLE hive_internal_table;

When we drop the table, its data is also deleted by Hive.
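
We can confirm this from the Hive shell itself; the dfs command runs HDFS commands without leaving Hive:

hive> dfs -ls /user/hive/warehouse;
-- hive_internal_table no longer appears in the listing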


External Tables:
- The data is already available in HDFS; the table is created on top of that HDFS data.
- This is "schema on data".
- When we drop the table, we drop only the schema. The data remains in HDFS.

When to Choose an External Table:
- If the processing data is already available in HDFS.
- When the files are also being used outside of Hive.

Create External Table:
hive> CREATE EXTERNAL TABLE hive_external_table (id INT, name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/user/arora_hive/hive_external_table';

If we do not specify the location at the time of table creation, we can load the data manually-

hive> LOAD DATA INPATH '/user/arora_hive/data.txt' INTO TABLE hive_external_table;

Display the content of the table
hive> select * from hive_external_table;

To drop the external table-
hive> DROP TABLE hive_external_table;
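
Unlike the internal table, the data survives the drop; we can verify with the same dfs trick:

hive> dfs -ls /user/arora_hive/hive_external_table;
-- the data files are still listed in HDFS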

--=========Difference in Internal and External Tables

Feature             Internal                    External
Schema              Data on schema              Schema on data
Storage Location    /user/hive/warehouse        HDFS location given in DDL
Data Availability   Within local file system    Within HDFS
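
To check which type an existing table is, DESCRIBE FORMATTED reports it:

hive> DESCRIBE FORMATTED hive_external_table;
-- look for the "Table Type:" line in the output:
-- MANAGED_TABLE for internal tables, EXTERNAL_TABLE for external tables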



What is the difference between ORDER BY and SORT BY? DISTRIBUTE BY and CLUSTER BY?
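
In short: ORDER BY gives a total ordering through a single reducer, SORT BY sorts within each reducer only, DISTRIBUTE BY controls which reducer a row goes to, and CLUSTER BY is DISTRIBUTE BY plus SORT BY on the same column. A quick sketch, using an emp_hive table like the one created above:

hive> SELECT * FROM emp_hive ORDER BY empid;      -- total order, one reducer
hive> SELECT * FROM emp_hive SORT BY empid;       -- sorted within each reducer
hive> SELECT * FROM emp_hive DISTRIBUTE BY empid; -- same empid -> same reducer
hive> SELECT * FROM emp_hive CLUSTER BY empid;    -- distribute + sort on empid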

Friday, January 13, 2017

Row Level Security in Tableau (Simple and Org Hierarchy Based)



Row Level Security requires setup both in Tableau workbooks and in the database.


To be secure and perform well, it is recommended to use a live connection to the database. Extracts may be too large if we join a security table that is one-to-many with the data table.


Let's consider the Excel sheet that comes with Tableau -> Sample - Superstore Sales (Excel).


The data is in the "Orders" worksheet.


Now let's say we want each user to see only the data pertaining to their own region.

For example, User-A will see data for Nunavut
User-B will see data for Atlantic

Create a worksheet that maps each Region to its respective User. (If we had an RDBMS, we would have a similar mapping table.)



Now open the workbook -> pull in the tables (worksheets, in this case); make a join between Orders and User Mapping.






Open the worksheet where we want to create our report. Create a calculated field as shown below.
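
The calculated field is in the screenshot; as a minimal sketch, assuming the mapping column is named [User], it compares the signed-in user with the mapped user:

// RegionSecurityFilter (name assumed) -- TRUE only for rows whose
// mapped user matches the person signed in to Tableau Server
USERNAME() = [User]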




Place the calculated field on the Filters shelf and select True.




Now when User-A logs in, they will see only the data for the Nunavut region; User-B will see only Atlantic.

--=========================================================---
We can also have the following tables as security tables -

Users Table
Roles Table
Users-Roles-Mapping Table

And then the data table (or fact table) will carry the roles instead of individual users, and that Role column can be joined to the Users-Roles-Mapping table, as sketched below.
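
A rough SQL sketch of that join (table and column names assumed for illustration):

-- each fact row is tagged with a role; a user sees only the rows whose
-- role is mapped to them in the users-roles mapping table
SELECT f.*
FROM fact_table f
JOIN users_roles_mapping urm ON urm.role_id = f.role_id
JOIN users u ON u.user_id = urm.user_id
WHERE u.login_id = :signed_in_user; -- supplied by the BI tool, e.g. USERNAME()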

--===================Row Level Security Based on Organization Hierarchy========


(Refer to workbook here-> 
https://public.tableau.com/profile/vikas.arora.bidw#!/vizhome/RegionalSalesProfit-RowLevelSecurityonHierarchy/Dashboard1

)

A more practical, real-life scenario is implementing row-level security based on an organizational hierarchy, where a boss can see all the records created by themselves or by their subordinates.

Let's consider a scenario->







Let's add an Excel worksheet with the following data (employee details and their org path)-






This is the kind of employee table that might be maintained in an HR database. Notice the Org Path Desc column; that is the column we will use to implement security based on the org hierarchy. The Login ID column holds the LDAP ID.


Now let's look at the Emp ID column that has been added to the Orders worksheet; it indicates which employee created the row.


As per the data in Emp ID, the following is the mapping ->
Atlantic-> All records by Emp ID 1010 (Karen)
Northwest Territories-> All records by Emp ID 1010 (Karen)
Nunavut-> All records by Emp ID 1006  (varsha)
Ontario-> All records by Emp ID 1008  (monika)
Prarie ->  All records by Emp ID 1008 (monika)
Quebec ->All records by Emp ID 1003 (vaurora)
West->  All records by Emp ID 1003 (vaurora)
Yukon->All records by Emp ID 1009  (david)

Let's see the workbook->

Create a Calculated Field -> RowBasedSecurityFilter (sketched below)
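
The field itself is in the screenshot; a minimal sketch, assuming [Org Path Desc] stores the chain of login IDs from the top of the hierarchy down to the record creator (e.g. vaurora/monika/varsha):

// RowBasedSecurityFilter -- TRUE when the signed-in user's login ID
// appears in the record creator's org path, i.e. the user is the
// creator or one of the creator's bosses
CONTAINS([Org Path Desc], USERNAME())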


Place the field on the Filters shelf and select True.


So now records for the different Regions appear based on the org hierarchy.

Monday, January 9, 2017

ODI 12c Standalone Agent Configuration


This involves three steps:

1) Creating a Domain for Standalone Agent
2) Creating Physical Agent in ODI Studio
3) Starting the Agent

1) Creating a Domain for Standalone Agent



Start the FMW Configuration Wizard:
On the Unix machine, navigate to

/u01/Middleware/ODI12c/oracle_common/bin




Execute ./config.sh





If you do not yet have a WLS domain, the first one created will be called base_domain by default. You can rename it to something such as odi_agent or mydomain99; just make sure it remains under $FMW_HOME/user_projects/domains/. If you already have a domain named base_domain, you cannot "create" another one; you would have to name the new one base_domain2 or something else, or update the existing domain.

Click Next




Select Oracle Data Integrator - Standalone Agent. It is standalone in the sense that it does not use WLS.




Select as shown, Click Next.




Select Manual Configuration. Click Next





Enter the parameters as shown.
(Change hostname to the server where you installed the Master and Work Repository.)
(Mention passwords for schemas PROD_STB and PROD_ODI_REPO; these schemas are in the database where you installed the Master and Work Repository)

Click Next.





Click Next.






Click Next.






Leave these at their defaults.
This is the password of the ODI user that we use to connect from ODI Studio.






Note down the PORT and password; we will need these when starting the agent.
Click Next.






Enter the password as odi_agent
Click Next




Click Next.





Click Next.




Click Finish.


2) Create a Physical Agent in ODI Studio

Open ODI Studio







Enter Wallet Password.





Enter ODI Username and Password

In the Topology Manager, right-click Agents and select New Agent

Enter the details as specified during domain configuration



The physical agent OracleDIAgent1 appears under the Agents tree.


Save.


3) Start the Agent

On the Unix machine, navigate to
/u01/Middleware/ODI12c/user_projects/domains/base_domain/bin




Execute ./agent.sh -NAME=OracleDIAgent1 -PORT=20910








The agent is started.
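
As a quick check that the agent is listening (assuming the default agent application name, oraclediagent):

curl http://localhost:20910/oraclediagent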

Reference: 
http://www.oracle.com/webfolder/technetwork/tutorials/obe/fmw/odi/odi_12c/odi12c_agent/odi12c_agent.html#section1s1

Friday, January 6, 2017

Planning your Tableau Deployment



Planning your Tableau deployment->

1) How will you license your installation?
2) How will users authenticate to Tableau Server?
3) How will Tableau Server access data sources?
4) What hardware will you need?

5) How will you provide high availability / failover?

Friday, September 16, 2016

OBIEE 12c Installation on Windows



Pre-requisites

a) Oracle Database installation (version 11.2.0.4 or higher)
b) JDK installation


Steps Involved

a) Install Weblogic Server 12c
b) Install OBIEE 12c
c) Create BI Schemas using RCU
d) Set Environment Variables
e) Create BI Domain






Download the OBIEE 12c installation files (and the WebLogic Server file) from OTN

Unzip the downloaded files




Download WebLogic Server and unzip

Unzip ->


The .jar file below is inside the folder

Install Weblogic Server 12c

Open command prompt as Administrator




Navigate to the location where the above .jar file exists,
i.e. cd C:\vikasarora\Software\OBIEE12C\fmw_12.2.1.0.0_infrastructure_Disk1_1of1

Execute as below:
C:\vikasarora\Software\OBIEE12C\fmw_12.2.1.0.0_infrastructure_Disk1_1of1>"C:\Java\jdk1.8.0_121\bin\java.exe" -jar fmw_12.2.1.0.0_infrastructure.jar

Click Finish.




Install OBIEE 12c

We download two zip files from OTN




Unzip both the zip files and place them in a separate folder-




If you do not place the files in a separate folder you may get the error
"Could not extract the archive since it is corrupted. Error code=1"


Run the application as Administrator

Specify Oracle Home.

Click Install.

Click Finish.

Create BI Schemas using the RCU utility




Right-click RCU.bat and select Run as Administrator.

(I got this warning; my database is version 12c. I ignored it.)



Change the selections as shown above.

Set Environment Variables for BI






Create a BI Domain




Go to the <MW_HOME>\bi\bin folder and run config.cmd

Select Use Existing Schemas