Thursday, March 29, 2012

Data science -- the cool scientist without white gowns

"Scientist" was a cool word during my school days. I wanted to become one, but I didn't know in what field. All the scientists I saw wore white gowns and had colorful liquids around them. Later, during my college and working days, scientists seemed like boring people with no personal life, gathered in educational institutes with college kids helping out.

Recently a profile called "data scientist" has been showing up all over the BI market, and it appears in every article that mentions Hadoop or big data. It must be some part of what a BI/DW person already does, with some specialization. Yes, and here is the formula, as I see it:

BI + big-data + statistics + scripting + visualization = Data scientist

OK, so can you be a scientist and still work for a corporation, or invent something new to your name? Possibly...
But it seems like a lot to cover, learn, and experience at work. Maybe not really, if we are in the right job. I'm just listing a very high-level outline (and it is not limited to this). My intention is not to oversimplify, but certainly to simplify the puzzle.

BI
- DW work like ETL, databases, and SQL, with exposure to an enterprise setup. Collecting data from heterogeneous data sources. Log analysis. Dimensional modelling, DW architecture. DB performance.

Big-data
Hadoop is the first thing that comes to mind for big data, but it is good to know about NoSQL databases too.
MapReduce - shared-nothing architecture - the need for MR - use cases - tools available - pros and cons
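
To make the MapReduce idea a bit more concrete, here is a minimal single-machine sketch of a word count in plain Python. It only imitates the three stages (map, shuffle, reduce) and is not Hadoop code; the function names map_phase, shuffle, reduce_phase, and word_count are made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle: group all values by key, the step a framework
    # would perform between the map and reduce stages.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce: collapse the list of counts for one word into a total.
    return key, sum(values)


def word_count(lines):
    # Run the stages in order. On a real cluster, each line (or block
    # of lines) could be mapped on a different node, because no map
    # call shares state with another.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big deal"]))
```

Because each map call touches only its own line, and each reduce call only its own key, the work can be spread across machines that share nothing. That is roughly why the shared-nothing architecture and MapReduce go together.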

Statistics
Basics - applying statistics to real-world problems - R programming

Scripting
Perl, Python, Java

Visualization
Reporting (I like Tableau), complex SQL, and the ability to tell a story with data - in whatever way you can deliver it effectively.

I would like to list some of the coolest learning materials available for the above topics..

I believe a thirst for discovery, and admiration for the secrets hidden in a boring pile of data, is what makes a data scientist.








