Mind Map: DataScience

1. 1. Business Problem / Business Objective

2. Project Charter

2.1. High-level details of the project

3. 2. Data Collection

3.1. Primary Research

3.1.1. Organizational Documents

3.2. Secondary Research

3.2.1. Searching the Internet

3.3. Experimentation / Survey

4. Data Types

4.1. Structured

4.1.1. Numerical (Quantitative). Continuous: Interval, Ratio. Discrete: counts, which cannot be represented with decimals.

4.1.2. Categorical. Binary: only two values, e.g. True or False, Right or Wrong, Default or Not, Yes or No. More than two categories: Ordinal (ordered categories) or Nominal (unordered categories).
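
The structured data types above can be illustrated with a rough heuristic. This is a minimal sketch, not a definitive rule: the function name `classify_column` and its labels are hypothetical, and it assumes a column is a plain Python list of values.

```python
from numbers import Real

def classify_column(values):
    """Heuristically label a column using the structured data types in the outline."""
    distinct = set(values)
    if len(distinct) < 2:
        return "constant"
    # bool is a subclass of int, so check for it before the numeric branch
    if all(isinstance(v, bool) for v in values):
        return "categorical: binary"
    if all(isinstance(v, Real) for v in values):
        # Assumption: whole-number values are treated as counts (discrete)
        if all(float(v).is_integer() for v in values):
            return "numerical: discrete"
        return "numerical: continuous"
    if len(distinct) == 2:
        return "categorical: binary"
    return "categorical: more than two categories"

print(classify_column([1.5, 2.25, 3.0]))
print(classify_column([3, 0, 7]))
print(classify_column(["Yes", "No", "Yes"]))
print(classify_column(["low", "mid", "high"]))
```

Whether a multi-category column is ordinal or nominal cannot be inferred from the values alone; that depends on whether the categories have a meaningful order.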

4.2. Unstructured

4.2.1. Multimedia files and similar data that have no predefined structure.

4.2.2. This data must be given a structure before it can be analyzed.

5. EDA (Exploratory Data Analysis)

5.1. Four Moments of Business Decisions

5.1.1. 1st Moment: Measures of Central Tendency (Mean, Median, Mode)

5.1.2. 2nd Moment: Dispersion of Data (Variance, Standard Deviation, Range)

5.1.3. 3rd Moment: Skewness, the asymmetry of a probability distribution. Positive / right skewed: longer tail on the right side. Negative / left skewed: longer tail on the left side. If both tails are equal, the distribution is symmetric (skewness = 0), as in a normal distribution.

5.1.4. 4th Moment: Kurtosis. Sharper peak / heavier tails: positive kurtosis. Broader peak / lighter tails: negative kurtosis.
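
The four moments above can be computed with the Python standard library. This is a minimal sketch: the function name `four_moments` and the sample values are made up for illustration, and it assumes population (divide-by-n) formulas and reports kurtosis as excess kurtosis (normal distribution = 0), which matches the positive/negative convention in the outline.

```python
import statistics

def four_moments(data):
    """Summarise a numeric sample by the four moments listed in the outline."""
    n = len(data)
    mean = statistics.fmean(data)
    # Population central moments (divide by n); m2 is zero for constant data,
    # in which case skewness and kurtosis are undefined
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        # 1st moment: central tendency
        "mean": mean,
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        # 2nd moment: dispersion
        "variance": m2,
        "std_dev": m2 ** 0.5,
        "range": max(data) - min(data),
        # 3rd moment: > 0 right skewed, < 0 left skewed, 0 symmetric
        "skewness": m3 / m2 ** 1.5,
        # 4th moment, shifted so that a normal distribution scores 0
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }

sample = [2, 3, 3, 4, 5, 6, 9, 14]  # illustrative values with a long right tail
for name, value in four_moments(sample).items():
    print(f"{name}: {value:.3f}")
```

For this sample the skewness comes out positive, consistent with the longer tail on the right side.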

5.2. Graphical Representations

5.2.1. Barplot: no business inferences can be drawn.

5.2.2. Histogram: shows the shape of the probability distribution. Normal: a perfect bell-shaped curve, symmetric on both sides of the central tendency.

5.2.3. Boxplot: identifies outliers. Lower Extreme: min value after removing the outliers. Lower Quartile: Q1. Median. Upper Quartile: Q3. Upper Extreme: max value after removing the outliers. IQR = Q3 - Q1, the middle 50% of the data. Upper Fence = Q3 + 1.5(IQR), Lower Fence = Q1 - 1.5(IQR). Outer fences, for extreme outliers: UF = Q3 + 3(IQR), LF = Q1 - 3(IQR).
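
The boxplot quantities above can be worked through numerically. This is a minimal sketch: `boxplot_summary` and the sample data are hypothetical, and the quartiles use the "inclusive" method of `statistics.quantiles`, which is one of several quartile conventions, so other tools may give slightly different Q1 and Q3.

```python
import statistics

def boxplot_summary(data):
    """Compute the boxplot statistics and fences described in the outline."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # spread of the middle 50% of the data
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    lower_outer, upper_outer = q1 - 3 * iqr, q3 + 3 * iqr
    # Values within the 1.5 * IQR fences; the extremes are taken from these
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "lower_extreme": min(inside),
        "upper_extreme": max(inside),
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
        "extreme_outliers": [x for x in data if x < lower_outer or x > upper_outer],
    }

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]  # illustrative values; 40 is far from the rest
summary = boxplot_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these values, Q1 = 3.25 and Q3 = 7.75, so IQR = 4.5 and the upper fence is 14.5; the value 40 lies beyond both the 1.5(IQR) and the 3(IQR) fences.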

6. Random Variable

6.1. If each possible outcome of a variable is associated with a probability, it is a random variable (RV).

6.2. Random variables are conventionally denoted by capital letters such as 'X' and 'Y'.

7. Probability

8. Probability Distribution

8.1. Plot of a random variable's values against their corresponding probabilities

8.2. Continuous PD

8.2.1. Smooth curve

8.3. Discrete PD

8.3.1. Bars
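
A discrete probability distribution can be sketched by pairing each value of a random variable with its probability. The example below is an illustration only: it assumes X is the sum of two fair six-sided dice (a scenario not taken from the outline), and the function name `two_dice_distribution` is hypothetical. The text bars stand in for the bar plot of a discrete distribution.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_distribution():
    """Map each value of X (sum of two fair dice) to its exact probability."""
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    total = sum(counts.values())  # 36 equally likely outcomes
    return {x: Fraction(c, total) for x, c in sorted(counts.items())}

dist = two_dice_distribution()
for x, p in dist.items():
    # One '#' per outcome out of 36, so bar length is proportional to probability
    print(f"{x:>2} {'#' * int(p * 36):<6} {p}")
```

Because every possible value of X has a probability and the probabilities sum to 1, X qualifies as a random variable in the sense described above.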