Tuesday, January 24, 2012

Slowly changing dimensions in Hive

If you keep your data warehouse in Hive, how do you implement slowly changing dimensions (SCDs)?


This is not a solution post; actually, I'm looking for an effective way to implement SCDs in Hive.
Every SCD type relies on an update one way or another. Type 2, for example, would be a good candidate, but it requires updating the end date of the history record. The only option I can think of is to insert the new rows into the dimension and query on the max surrogate key or max effective date to get the latest dimension version at the analytics layer (which seems like an overhead on MapReduce). A sketch of that query follows.
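For illustration, here is a minimal HiveQL sketch of that insert-only approach. The table customer_dim and its columns customer_id, name, and eff_date are hypothetical names, not from any real schema: the subquery finds each customer's latest effective date, and the join keeps only that version.

    -- Hypothetical dimension: every change is appended as a new row.
    -- Keep only the latest version of each customer at query time.
    SELECT d.customer_id, d.name, d.eff_date
    FROM customer_dim d
    JOIN (
        SELECT customer_id, MAX(eff_date) AS max_eff_date
        FROM customer_dim
        GROUP BY customer_id
    ) latest
      ON d.customer_id = latest.customer_id
     AND d.eff_date = latest.max_eff_date;

The extra self-join is exactly the overhead mentioned above: every query against the dimension pays for an additional MapReduce job.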

I posted this question to the Cloudera professionals group on LinkedIn, and Jasper suggested an approach that does not use Hive queries; he prefers MapReduce scripts for handling SCD situations. I modified the flow to suit Hive tables.

The idea is this (a sketch follows below):
1. Open the underlying HDFS file, or select all rows using HiveQL
2. Pass the data through a mapper as key/value pairs
3. In the script, end-date the old record and leave the new record with a NULL end_date (for Type 2)
4. Overwrite the Hive table

But the process could get tricky if the data is spread across multiple partitions.
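Here is a minimal HiveQL sketch of that flow, using the same hypothetical customer_dim table (now with columns customer_id, name, start_date, end_date) and a hypothetical streaming script scd2_script.py standing in for the MapReduce logic. One detail the steps above gloss over: end-dating needs every version of a key in one place, so the rows are distributed and sorted by key before the script runs, which makes it effectively a reduce step rather than a plain map.

    -- scd2_script.py is hypothetical: it reads rows grouped by customer_id
    -- and sorted by start_date, end-dates each superseded version, and
    -- leaves the newest version's end_date as NULL (step 3).
    ADD FILE scd2_script.py;

    -- Steps 1-2: read every row and bring all versions of a business key
    -- to the same script instance, ordered by start_date.
    FROM (
        SELECT customer_id, name, start_date, end_date
        FROM customer_dim
        DISTRIBUTE BY customer_id
        SORT BY customer_id, start_date
    ) versions
    -- Step 4: overwrite the table with the script's output; Hive stages
    -- the result in a scratch directory before replacing the table data.
    INSERT OVERWRITE TABLE customer_dim
    SELECT TRANSFORM (customer_id, name, start_date, end_date)
        USING 'python scd2_script.py'
        AS (customer_id, name, start_date, end_date);

For a partitioned table, the INSERT OVERWRITE would need a partition specification (or dynamic partitioning), which is where the trickiness mentioned above comes in.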


4 comments:

  1. That may help:
    http://afancy-lxf.blogspot.com/2012/03/process-type-1-scd-in-hive.html

  2. Hi, could you provide a link to Jasper's M/R scripts idea?
    We are currently trying to implement type 1 updates in Hive and are seeing huge performance problems with big fact tables, so hopefully this can help.

    Thanks!

  3. Please refer to the following link, which explains, with examples, the schema-on-read concept for implementing Hive-based SCD Type 2 functionality:
    http://amintor.com/1/post/2014/07/implement-scd-type-2-in-hadoop-using-hive-transforms.html
