Follow This Process for a Data Analytics Project!

Introduction

So far, we have covered an introduction to Business and Data Analytics and seen why they matter. We also looked at how to select a suitable BI tool and the factors that go into that choice.

This article marks an inflection point: we will now learn how to solve business problems using data analysis. Solving an analytics problem involves multiple steps such as data cleaning, preparation, modelling and model evaluation. The structure used for solving an analytics problem is the CRISP-DM framework, which stands for Cross-Industry Standard Process for Data Mining.

As a data analytics professional, you will face many challenges, ranging from understanding various business problems to choosing the best technique to solve them. To avoid getting lost, data professionals developed a robust process, the CRISP-DM framework, that can be applied to analytics problems in almost any industry.

The flow of the framework is shown in the figure below.

It involves the following series of steps:

  1. Business Understanding
  2. Data Understanding
  3. Data Preparation
  4. Data Modelling
  5. Model Evaluation
  6. Model Deployment

Let's look at each step in turn.

Business Understanding

We now have a framework for solving various problems, but where exactly do we start? Do we go straight to the data? Or do we first ask some fundamental questions to understand the problem better?

Imagine you are going on a picnic and your car suddenly stops. You have your toolbox and you want to repair the car. To do so, you first need to know what exactly has gone wrong.

For a data professional, understanding the business and its specific problem is the most important step. If you understand the business problem clearly, you can convert it into a well-defined analytics problem, and only then can you lay out a sound strategy to solve it.

If you don’t understand the business and jump directly to solving the problem, your strategy is likely to go wrong.

To understand the business problem, you need to take the following steps:

  • Determine your business objectives clearly.
  • Determine the goal of data analysis.

Data Understanding

After business understanding, the next important step is data understanding. When you get your hands on the data for the first time, you will want to know its structure (number of files, rows, columns, etc.), understand how the files are related to each other, and check whether anything looks odd, such as negative values or outliers. This step is crucial because when you understand your data properly, you can perform the later steps more effectively.

Data understanding may involve looking at the following:

  • The types of datasets that are available for analysis.
  • The information you can get from the datasets.
  • Exploring your data and understanding its depth.
  • Performing quality checks on the datasets.
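
The checks above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the sample data, the column names (`order_id`, `customer`, `amount`) and the `profile` helper are all made up for this example and are not part of CRISP-DM itself.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a real file; the columns are invented.
RAW = """order_id,customer,amount
1,alice,120.5
2,bob,-40
3,alice,
4,carol,99999
"""


def profile(rows):
    """Summarise the structure of the data and flag basic quality issues."""
    columns = list(rows[0].keys())
    missing = Counter()
    negative_ids = []
    for row in rows:
        # Count empty cells per column.
        for col in columns:
            if row[col] == "":
                missing[col] += 1
        # Flag negative amounts, which look suspicious for an order value.
        if row["amount"] != "" and float(row["amount"]) < 0:
            negative_ids.append(row["order_id"])
    return {
        "n_rows": len(rows),
        "columns": columns,
        "missing": dict(missing),
        "negative_amount_ids": negative_ids,
    }


rows = list(csv.DictReader(io.StringIO(RAW)))
report = profile(rows)
print(report)
```

In a real project you would read the actual files instead of the inline sample and extend the checks, for example to catch extreme values like the 99999 in the last row.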

Data Preparation

Across data analytics projects, analysts often spend 50% to 80% of their time on data cleaning and preparation, which makes data preparation one of the most crucial steps.

Data is vast and is usually spread across many files. Collecting all the required data from those files and selecting the required columns and rows, based on your business understanding, is a major part of data preparation. After collecting the data, you have to deal with missing values and outliers. Outliers can heavily affect the data, and if they are not treated they can also affect your insights. Data preparation is considered the most important step because the model will be built on the datasets created in this step.

Data preparation includes the following steps:

  • Select relevant data
  • Integrate data
  • Clean data
  • Construct data: derive new features
  • Format data
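
To make these five steps concrete, here is a small sketch in plain Python. Everything in it is an assumption made for illustration: the `orders` and `customers` records, the `prepare` helper, the choice of the median to fill missing values, and the threshold of 100 for a "large" order.

```python
from statistics import median

# Hypothetical inputs: two small "files" already loaded into memory.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 120.0, "note": "gift"},
    {"order_id": 2, "customer_id": "c2", "amount": None, "note": ""},
    {"order_id": 3, "customer_id": "c1", "amount": 80.0, "note": ""},
]
customers = {"c1": {"city": "Pune"}, "c2": {"city": "Mumbai"}}


def prepare(orders, customers):
    # Select: keep only the columns the business question needs.
    keep = ("order_id", "customer_id", "amount")
    rows = [{k: o[k] for k in keep} for o in orders]

    # Integrate: attach the customer's city from the second source.
    for row in rows:
        row["city"] = customers[row["customer_id"]]["city"]

    # Clean: fill missing amounts with the median of the known amounts
    # (one possible treatment; the right one depends on the problem).
    known = [r["amount"] for r in rows if r["amount"] is not None]
    fill_value = median(known)
    for row in rows:
        if row["amount"] is None:
            row["amount"] = fill_value

    # Construct: derive a new feature from an existing column.
    for row in rows:
        row["is_large_order"] = row["amount"] >= 100

    # Format: round amounts to two decimals for a consistent layout.
    for row in rows:
        row["amount"] = round(row["amount"], 2)
    return rows


prepared = prepare(orders, customers)
for row in prepared:
    print(row)
```

Each comment in `prepare` maps to one bullet in the list above, so you can see how the steps build on each other.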

Data Modelling

Data modelling is often called the “heart of data analysis”. You can think of a model as a magic box that takes relevant data as input and gives you the output you are interested in.

In data modelling, various Machine Learning and Deep Learning algorithms are used to build models that answer your question.

We will study data modelling in more detail in later articles.
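
As a rough illustration of the "box" idea, the sketch below fits a straight line by ordinary least squares, one of the simplest models there is. It stands in for the more advanced algorithms mentioned above; the `fit_line` and `predict` helpers and the spend/sales numbers are invented for this example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """The 'box': takes an input value and returns the model's output."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: advertising spend (input) and sales (output).
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

model = fit_line(spend, sales)
forecast = predict(model, 5.0)
print(model, forecast)
```

Once fitted, the model is just a pair of numbers, and `predict` turns any new input into an output, which is exactly the input-to-output behaviour described above.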

Model Evaluation and Deployment

In data analytics, evaluation is where you put everything you have done to the litmus test. If the results of the evaluation are not satisfactory, you go back and repeat the earlier steps. If the model performs well and gives you accurate results, your data modelling process has been successful.

Evaluation is necessary to ensure that your model is robust and effective. Once the evaluation is successful, you can deploy your model on various platforms such as the cloud, local machines, software applications, etc.
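
One common way to run such a test is to compare the model's predictions with held-out data it has not seen and then decide whether the error is acceptable. The sketch below assumes mean absolute error as the metric; the `evaluate` helper, the `toy_model` function, the test values and the `max_error` threshold are all hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def evaluate(predict_fn, test_inputs, test_targets, max_error):
    """Score a model on held-out data and say whether it is good enough."""
    predictions = [predict_fn(x) for x in test_inputs]
    mae = mean_absolute_error(test_targets, predictions)
    return {"mae": mae, "acceptable": mae <= max_error}


# Hypothetical model standing in for whatever the modelling step produced.
def toy_model(x):
    return 2 * x + 1


# Hypothetical held-out inputs and the values actually observed for them.
test_inputs = [5.0, 6.0]
test_targets = [11.5, 12.5]

result = evaluate(toy_model, test_inputs, test_targets, max_error=1.0)
strict_result = evaluate(toy_model, test_inputs, test_targets, max_error=0.1)
print(result, strict_result)
```

If `acceptable` comes back as `False`, as it does with the stricter threshold, that is the signal to return to the earlier steps rather than deploy.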

Conclusion

One interesting feature of the CRISP-DM framework is that the whole process is iterative in nature: what you find at one step can send you back to an earlier one. This completes the typical life cycle of a data analytics project.
