Building the Dataset

  • Data Collection

    Does the data you’ve collected match the machine learning task and problem you have defined?

    • Find and collect data related to the problem.
  • Data Inspection
    • Identify outliers.
    • Is there any missing or incomplete data?
    • Is any transformation needed?
  • Summary Statistics
    • Most statistical tools can calculate measures such as the mean, interquartile range (IQR), and standard deviation.
    • These statistics help you identify trends, scale, and the shape of the data.
  • Data Visualization
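
The inspection and summary-statistics steps above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name summarize, the sample list ages, and the choice of the common 1.5 × IQR rule for flagging outliers are assumptions made for the example, not something these notes prescribe.

```python
import statistics

def summarize(values):
    """Summarize a list of numbers; None entries are treated as missing."""
    present = [v for v in values if v is not None]
    # Quartiles via the standard library (default "exclusive" method).
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    # Assumed outlier rule: more than 1.5 * IQR outside the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),
        "iqr": iqr,
        "outliers": [v for v in present if v < low or v > high],
    }

# Hypothetical sample: one missing entry and one suspicious value.
ages = [23, 25, 31, None, 28, 27, 240, 30]
summary = summarize(ages)
print(summary)
```

On this sample the sketch reports one missing value and flags 240 as an outlier. In practice a dataframe library would usually do the same job, but the idea is identical: count what is missing, compute the spread, and flag what falls far outside it.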

Imputation: techniques that estimate replacement values for missing entries in your dataset. The same techniques can also be used to replace outlier values.
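
As a minimal sketch of one common approach, the snippet below fills missing entries with the median of the observed values. The function name impute_median and the sample list readings are hypothetical, and median imputation is just one possible strategy (mean or model-based imputation are alternatives); the notes do not name a specific method.

```python
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    present = [v for v in values if v is not None]
    fill = statistics.median(present)
    return [fill if v is None else v for v in values]

# Hypothetical sample with two missing readings.
readings = [4.0, None, 6.0, 5.0, None]
filled = impute_median(readings)
print(filled)
```

The median is used here because, unlike the mean, it is not pulled toward extreme values, which matters when the data also contains outliers.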