Each year, more data is collected than the year before. So what?
The amount of data gathered through questionnaires, surveys, statistical methods, online tracking and the internet in general is astronomical. But most of it just sits there, unused. An abundance of data does not automatically translate into an abundance of information.
Data analytics is the process of turning data into a format that can be consumed.
It all begins with defining a purpose for the data to be collected. Once that is established, raw data is standardized, combined from all the different sources, explored, cleaned, broken into separate components, compared and examined so that outliers, errors, and useful and useless components are noted.
That’s a lot of steps, but all of it is done so that useful information can be discovered and used to answer questions. Data analytics is at the heart of data-driven decisions and actionable insights.
What is the main purpose of data analysis?
The main purpose of data analysis is to discover useful, usable information in rows of data: information that helps you understand business performance, supports business decisions, and shapes your business strategy.
When anyone sets about gathering and collecting data, there is an end goal in mind (a question to be answered, a hypothesis to be tested, a decision to be made, or the accuracy of a past decision to be weighed), and we plough through the over-abundant data in order to find these answers.
The purpose of the research could be to understand the data, to gain insight from it, to predict possible outcomes, or to use the data to prescribe decisions that are likely to lead to the best possible outcomes.
In light of this, let us look at the different types of data analysis.
What are the types of data analysis?
A small side note before we dive in: a debate exists over whether data analysis and data analytics must be distinguished and treated separately or can be used interchangeably. One view is that data analysis is about asking the how and why of a research question.
When we go beyond the how and why and start answering questions that predict future trends or make recommendations about future events, we have entered the realm of data analytics. Many people use the two terms interchangeably, though those who draw the distinction put this down to a lack of a clear understanding of the individual concepts.
To put it simply, analysis is past and analytics is future.
In this article, however, we discuss the different types of data analysis as one fluid concept. Read here if you are more interested in how data analytics helps businesses.
Exploratory Data Analysis
Exploratory data analysis (EDA) is usually the first step in data analysis.
It allows you to investigate, examine and go through the data to learn about it. In carrying out exploratory data analysis, you are trying to understand what the data is all about and discover what it contains.
During this part of the analytics process, you will likely encounter data entry errors, missing data, anomalies or outliers, which you can then treat, update or separate from your data set. Understanding what you are trying to solve for will guide how you handle each data quirk.
Patterns and connections also emerge in the course of exploratory data analysis, and you can start to gain some initial insight at this stage of the analysis process.
The process of exploratory data analysis often involves data mining and the use of data visualization tools.
At this stage, you break your data into categorical and numeric components and start looking at graphs. Bar charts and line charts are widely known, but a lot of value can be gained by bringing in additional graphs to visually explore the data. A sampling of additional graphs to try, depending on the data available, is as follows:
- box plot
- scatter plot
- pareto chart
- parallel coordinates
With a solid understanding of the data from this exploratory data analysis, you can move on to establishing a basis for informed decisions, transforming the data, building models and testing hypotheses.
All that said, I have seen this step skipped way more times than I care to share. The analytics software at your disposal may make you feel like you can just throw data at a model & go. Complex statistical algorithms are being built into these tools, and analysts often do not feel they have the technical skills to understand what is happening behind the scenes.
I once came into a project that had 20% of their records reflecting a customer’s age of over 120 years old. And alive.
Bad data, anyone?
They just didn’t know because they didn’t look.
This is a nifty place to get started. Have an Excel spreadsheet? Just download Python for free and get to analyzing!
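As a minimal sketch of that first look, here is what a basic sanity check could be in plain Python. The records, the `name` and `age` fields and the 120-year cut-off are all made up for illustration; real data would usually be loaded from a file rather than typed in.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"name": "Ana", "age": 34},
    {"name": "Ben", "age": None},
    {"name": "Cleo", "age": 131},
    {"name": "Dev", "age": 58},
    {"name": "Eli", "age": 127},
]

MAX_PLAUSIBLE_AGE = 120  # assumed cut-off for this illustration


def profile_ages(rows, max_age=MAX_PLAUSIBLE_AGE):
    """Split rows into missing, implausible and usable ages."""
    missing, implausible, usable = [], [], []
    for row in rows:
        age = row["age"]
        if age is None:
            missing.append(row)
        elif 0 <= age <= max_age:
            usable.append(row)
        else:
            implausible.append(row)
    return missing, implausible, usable


missing, implausible, usable = profile_ages(records)
share_implausible = len(implausible) / len(records)

print(f"missing ages: {len(missing)}")
print(f"implausible ages: {len(implausible)} ({share_implausible:.0%} of records)")
print(f"usable ages: {[row['age'] for row in usable]}")
```

Nothing here fixes the data. It only makes the quirks visible, so you can decide how to treat them before any modelling starts.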
Descriptive Data Analysis
As the name suggests, descriptive analysis simply describes data. It is essential to data analysis, because it is how you find the meaning of the data collected and draw conclusions from that data set. Descriptive data analysis is basically the ‘what is’ of the data.
In descriptive data analysis, you do not go beyond the data. What you are doing is interpreting and presenting the meaning in the data before you.
Descriptive analytics is a look into performance using historical information.
You may also hear this called business intelligence, but the way these terms get thrown around deserves its own article on the different types of analytics terms!
When carrying out descriptive data analysis, you answer your question through measures of frequency, central tendency, dispersion or variation, and position of your data set.
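To make those four kinds of measures concrete, here is a small sketch using Python's standard `statistics` and `collections` modules. The `order_values` list is invented purely for illustration.

```python
from collections import Counter
import statistics

# Hypothetical order values, made up for illustration.
order_values = [12, 15, 15, 18, 20, 22, 22, 22, 30, 44]

# Frequency: how often each value occurs.
frequency = Counter(order_values)

# Central tendency: mean, median and mode.
mean_value = statistics.mean(order_values)
median_value = statistics.median(order_values)
mode_value = statistics.mode(order_values)

# Dispersion: range and sample standard deviation.
value_range = max(order_values) - min(order_values)
std_dev = statistics.stdev(order_values)

# Position: the three cut points that split the data into quartiles.
quartiles = statistics.quantiles(order_values, n=4)

print("most common value and count:", frequency.most_common(1)[0])
print("mean:", mean_value, "median:", median_value, "mode:", mode_value)
print("range:", value_range, "standard deviation:", round(std_dev, 2))
print("quartile cut points:", quartiles)
```

For this sample the mean is 22, the median is 21 and the mode is 22. The single large order of 44 pulls the mean above the median, which is the kind of detail descriptive measures are meant to surface.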
Key performance indicators can help keep the overwhelming amount of data in this space focused on business goals. To keep things digestible, read this to understand what data visualization is and why it matters.
Predictive Data Analysis
Predictive data analysis is a big part of data-based decision-making. After data analysis has been done to understand the why and how, the patterns, variables, causes and effects observed in current and historical data are used to make data-educated guesses about possible future outcomes, trends, problems and the like.
You are also able to identify potential implications of actions, potential risks, and opportunities and reduce risk by understanding what might happen going forward.
Predictive analysis is used in weather forecasting, retail, online shopping, internet search suggestions, scientific research and many other fields. In business, it is very important for driving decisions about manufacturing, marketing campaigns, operational processes, expansion and more.
Predictive data analysis applies statistical models, machine learning techniques and AI to continuously gathered data in order to answer the question of what will happen.
Prescriptive Data Analysis
In most cases, data is not gathered just to explore it, understand what it is and predict what may happen in the face of different factors and variables. More often than not, there is a need to decide what to do on the basis of the information gathered from the data. This is where prescriptive data analysis comes in.
This type of data analysis is essential to having a culture of data-driven decision-making.
With prescriptive data analysis, you are able to pre-empt issues, create practical solutions and prescribe the best course of action as, or before, those issues emerge.
Understanding the Types of Analytics Through an Example
Let us create an imaginary university that offers a course on the study of the effects of climate change. This course was created after a rally held by a group of graduating students. Over the course of five years a total of three hundred students have attended the course.
In the first year there were a hundred students. In the second year, there were eighty students. In the third there were fifty students. In the fourth year, there were fifty students and in the fifth year, there were twenty students.
Exploratory data analysis would show us the overall decline in the number of students, a pattern suggesting a loss of interest in the course as the group that held the rally graduated.
Descriptive data analysis would show us the frequency of the decrease and the percentage of the decrease of students in the class.
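The descriptive step can be worked through directly with the enrollment figures from the example. This is only a sketch, and the variable names are my own.

```python
# Students per year, taken from the example above.
enrollment = {1: 100, 2: 80, 3: 50, 4: 50, 5: 20}

years = sorted(enrollment)
yearly_change = {}
for prev, curr in zip(years, years[1:]):
    diff = enrollment[curr] - enrollment[prev]
    percent = diff * 100 / enrollment[prev]
    yearly_change[curr] = (diff, percent)

total_students = sum(enrollment.values())
overall_percent = (enrollment[5] - enrollment[1]) * 100 / enrollment[1]

for year, (diff, percent) in yearly_change.items():
    print(f"year {year}: {diff:+d} students ({percent:+.1f}%)")
print(f"total over five years: {total_students}")
print(f"change from year 1 to year 5: {overall_percent:+.1f}%")
```

The class shrinks by 20%, then 37.5%, holds flat between years three and four, and then drops by 60%, for an overall decline of 80% across the three hundred students.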
Predictive data analysis would look at the historical and current data to predict that the class may be completely scrapped due to lack of students by the following year.
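One simple way to sketch that prediction is to fit a straight line through the five yearly figures and extend it one year. A straight-line trend is my own assumption here, not the only reasonable model, and five data points is far too few to trust a forecast in practice.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


years = [1, 2, 3, 4, 5]
students = [100, 80, 50, 50, 20]  # enrollment from the example

slope, intercept = fit_line(years, students)

# Enrollment cannot go below zero, so clamp the forecast.
forecast_year_6 = max(0.0, slope * 6 + intercept)

print(f"trend: {slope:.1f} students per year")
print(f"forecast for year 6: {forecast_year_6:.0f} students")
```

The fitted line loses 19 students a year and forecasts roughly 3 students for year six, which is in line with the worry that the class may not survive.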
Prescriptive data analysis would use the information from exploring the data to suggest, for example, that a climate change club be created or that a seminar on the effects of climate change on the environment be held, in order to build a buzz and increase interest in the subject.
This is only a very simple example, but I hope it gives some clarity.
Methods of Data Analysis Examples
Let us do a quick run-through of some of the approaches taken when performing any of the types of data analysis discussed above.
Cluster analysis is a method within the realm of exploratory data analysis that groups similar data points into sets (clusters), so that the members of each group are more similar to one another than to members of other groups.
Cohort analysis involves grouping data with shared characteristics before analyzing them in order to see emerging patterns.
Regression analysis is a statistical method that estimates the relationship between the mean value of a dependent variable and the corresponding values of one or more independent variables.
Neural networks are a method loosely modelled on the human brain and nervous system. They are a machine learning technique, within the broader field of artificial intelligence, used for data mining and predictive data analysis.
Factor analysis is done to understand the relationships or interdependencies among a set of variables. It may be used in prescriptive data analysis to make recommendations that reduce the number of variables.
Data mining is an exploratory analysis of already existing large data sets in order to extract and discover previously unknown knowledge and patterns.
Text analysis is used to extract, describe, analyze and structure large volumes of unstructured text into machine-readable, meaningful and understandable data.
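To show what one of these methods looks like in practice, here is a small sketch of cohort analysis. The customers, signup months and the three-month activity flag are all hypothetical. The shared characteristic that defines each cohort is the signup month.

```python
from collections import defaultdict

# Hypothetical records: (customer, signup_month, still_active_after_3_months)
customers = [
    ("a", "2023-01", True),
    ("b", "2023-01", False),
    ("c", "2023-01", True),
    ("d", "2023-02", False),
    ("e", "2023-02", False),
    ("f", "2023-02", True),
    ("g", "2023-02", False),
]

# Group customers into cohorts by the month they signed up.
cohorts = defaultdict(list)
for _, signup_month, active in customers:
    cohorts[signup_month].append(active)

# Share of each cohort that is still active (True counts as 1).
retention = {
    month: sum(flags) / len(flags)
    for month, flags in sorted(cohorts.items())
}

for month, rate in retention.items():
    print(f"{month}: {len(cohorts[month])} customers, {rate:.0%} still active")
```

Comparing the cohorts side by side is what lets a pattern emerge. In this made-up data, the February signups hold on to fewer customers than the January signups.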
Data Analytics Tools
Manually processing large amounts of data can be time-consuming and tedious, and it also increases the risk of errors.
There are tools and skills worth learning in order to perform these important data analysis tasks.
A solid standby for querying data is some form of SQL. SAS, R, and Python are great languages to learn that will help you take an analysis from the exploratory stage through descriptive and onward to advanced analytics and even data science.
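For a quick feel of how SQL and Python fit together, here is a minimal sketch using SQLite through Python's built-in `sqlite3` module. The `sales` table and its rows are invented for illustration, and your own database will likely use a different SQL dialect and driver.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [
        ("north", 120.0),
        ("south", 80.0),
        ("north", 60.0),
        ("south", 40.0),
        ("east", 200.0),
    ],
)

# Count orders and total the amounts per region, largest total first.
rows = conn.execute(
    "SELECT region, COUNT(*) AS orders, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()
conn.close()

for region, orders, total in rows:
    print(f"{region}: {orders} orders, total {total}")
```

The query does the grouping and summing, and Python takes over from there for anything the database is not well suited to.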
Reporting, analytics and data visualization tools like Tableau, Power BI and Looker are incredibly useful no matter where you are on the data analytics maturity curve.
This is a kind of beginner’s guide to data analysis. Each piece reviewed here is just that. A review.
It’s a broad spectrum of information that you would need to dive into and understand much more in depth in order to be well on your way to understanding data analysis beyond the surface.