301.1 Overview

Please watch the video and read the lesson below.

Before we dive into the specifics of Data Analytics and Insights Generation, let’s take a step back and review what was covered in the A-Z in Social Analytics.


301.1.1 Objective Setting

Before we can start on anything, the first step is understanding the entire objective of the project. Some of the common objectives include:

  • Problem Solving
  • Process Optimization
  • Goal Attainment

Once the main objectives of the project are set, the next step is to outline the key results or measures that will define success and guide your campaign. These key results should be measurable, and achieving them should bring you closer to the objective.


[Figure labels: Earned Media, Paid Media, Owned Media; Like, Share, Comment; Brand Positioning; Brand Loyalty]


301.1.2 Research Design

The objective and key results of the study greatly influence the research design, which calls for different preparation and platform configuration depending on the desired output. While some studies may require additional desk research as a precursor to understand the landscape (e.g. Industry and Market Trendspotting, Consumer and Audience Insights), others may only require a level of familiarity with your own business functions before setting the scope of research.

Remember, these are not mutually exclusive, and organizations often require several different types of research to answer their business questions.

The diagram below shows the typical research design across the different phases of exploration and measurement:



301.1.3 Boolean Configuration

Once the planning is done, it is time to get into action with the social listening tool by telling the system your scope of research using Booleans. This section introduces the basic operators that you can use in our digital intelligence platform, Radarr. (Please note that other tools may use different formats for these operators, but the logic remains applicable.)

1. AND

AND is used to search for posts in which two or more terms are all present in the same body of text.

Example: You want to see mentions of exercise related to health

Boolean query: exercise AND health



2. NOT

NOT is used if you want to exclude words that are unrelated to the topic you are searching for.

Example: You want to search for mentions of bears but not including grizzly bears

Boolean query: bears NOT grizzly



3. OR

OR will return mentions that have either or both of the search terms.

Example: You want to search for mentions about people’s salary/income

Boolean query: salary OR income


4. Double Quotes “ ”

This is called the “string operator” and is used to check for an exact match of the text. It will show posts only if the search terms enclosed in quotation marks appear adjacent to each other, in the same order, in the text.

Example: You want to search for mentions of the term black jacket

Boolean query: “black jacket”


5. Parentheses ()

This is used to group search terms and will be helpful when creating complex Boolean queries.

Example: You want to search for mentions of Android phones excluding Samsung phones

Boolean query: (android AND phone) NOT Samsung
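
The five operators above can be tried out on a handful of sample posts with a short script. The sketch below is only an illustration of the logic and is not how Radarr evaluates queries: the function names (tokenize, matches) are hypothetical, matching is assumed to be case-insensitive on whole words, operators are assumed to be evaluated left to right unless grouped by parentheses, and "A NOT B" is read as "A and not B".

```python
import re

# A token is a quoted phrase (straight or curly quotes), a parenthesis, or a bare word.
TOKEN_RE = re.compile(r'"([^"]+)"|“([^”]+)”|([()])|([^\s()"“”]+)')


def tokenize(query):
    """Split a Boolean query into (kind, value) tokens."""
    tokens = []
    for straight, curly, paren, word in TOKEN_RE.findall(query):
        if paren:
            tokens.append(("PAREN", paren))
        elif word in ("AND", "OR", "NOT"):
            tokens.append(("OP", word))
        else:
            tokens.append(("TERM", (straight or curly or word).lower()))
    return tokens


def matches(query, text):
    """Return True if the text satisfies the query (illustrative logic only)."""
    words = re.findall(r"[#@]?\w+(?:[-']\w+)*", text.lower())
    tokens = tokenize(query)
    pos = 0

    def has_term(term):
        # A phrase matches only when its words appear next to each other, in order.
        parts = term.split()
        n = len(parts)
        return any(words[i:i + n] == parts for i in range(len(words) - n + 1))

    def parse_operand():
        nonlocal pos
        kind, value = tokens[pos]
        pos += 1
        if kind == "PAREN" and value == "(":
            result = parse_expression()
            pos += 1  # skip the closing ")"
            return result
        return has_term(value)

    def parse_expression():
        nonlocal pos
        result = parse_operand()
        while pos < len(tokens) and tokens[pos][0] == "OP":
            op = tokens[pos][1]
            pos += 1
            right = parse_operand()
            if op == "AND":
                result = result and right
            elif op == "OR":
                result = result or right
            else:  # NOT: keep the left side only when the right side is absent
                result = result and not right
        return result

    return parse_expression()


print(matches("exercise AND health", "Daily exercise improves heart health"))  # True
print(matches("bears NOT grizzly", "Grizzly bears are huge"))                  # False
print(matches("salary OR income", "My income went up"))                        # True
print(matches('"black jacket"', "black leather jacket"))                       # False
print(matches("(android AND phone) NOT Samsung", "My new Android phone"))      # True
```

Testing a query against a few known-relevant and known-irrelevant posts in this way is a quick sanity check before configuring it in the actual tool.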


301.1.4 Data Optimization

Below is a general guide for cleaning up social data to ensure the relevance of the research results:


  1. Eyeballing the dataset to sieve out dirty data by narrowing down on the irrelevant themes and conversations. A quick way to assess prominent themes is by using the Word Cloud: if there are emerging keywords that are evidently unrelated, it may be a sign to dig deeper and investigate their source.


  2. Creating a Boolean string that is specific to the dirty data, ensuring that the keywords selected are distinct from the keywords that are mentioned within relevant data.


  3. Applying exclusions to eliminate dirty data from the dataset. This can be done by appending the “NOT” function to the existing Boolean configuration, followed by the Boolean string of keywords relevant to the dirty data as done previously.


For instance, if you wish to look at consumer conversations around the iPhone while excluding other Apple products:

((Apple OR “#Apple” OR “@AppleSG”) AND (iPhone OR “i-phone” OR “i phone”)) NOT (iPad OR Airpods OR Macbook)

  4. Reviewing the dataset to ensure that the dirty data has been removed, and repeating the process if necessary until you are satisfied with the integrity of the dataset.
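
The clean-up steps above can be approximated outside the platform on an exported set of posts. The sketch below is a rough illustration under stated assumptions: top_terms stands in for the Word Cloud by counting word frequencies, append_exclusions wraps an existing query and appends a NOT clause, and both function names as well as the sample posts are hypothetical. Queries are written with straight quotes here.

```python
import re
from collections import Counter


def top_terms(posts, n=5, stopwords=frozenset()):
    """Count the most frequent words across posts (a rough stand-in for a word cloud)."""
    counts = Counter()
    for post in posts:
        for word in re.findall(r"[a-z0-9']+", post.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts.most_common(n)


def quote_if_needed(term):
    """Wrap multi-word or punctuated terms in double quotes."""
    return term if re.fullmatch(r"\w+", term) else f'"{term}"'


def append_exclusions(base_query, noise_terms):
    """Return the base query with a NOT clause listing the noise terms."""
    if not noise_terms:
        return base_query
    excluded = " OR ".join(quote_if_needed(term) for term in noise_terms)
    return f"({base_query}) NOT ({excluded})"


# Hypothetical sample posts: the recipe posts are noise for an iPhone study.
posts = [
    "Apple pie recipe with cinnamon",
    "New iPhone camera is great",
    "Best apple pie recipe ever",
]
print(top_terms(posts, n=3))

base = '(Apple OR "#Apple" OR "@AppleSG") AND (iPhone OR "i-phone" OR "i phone")'
print(append_exclusions(base, ["iPad", "Airpods", "Macbook"]))
```

With the three exclusion terms shown, append_exclusions rebuilds the iPhone query from the example above. The frequency count only hints at where noise may be coming from; choosing which terms are safe to exclude still requires reading the posts themselves.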


301.1.5 Data Analysis vs Insights Generation

Now that we have our data collected and optimized, it’s finally time to dig deep and uncover what lies within the online conversations that we are interested in. But before we get into that, it is crucial to understand how the different types of analysis work, both in the context of conventional datasets and of social media datasets.


Qualitative analysis             Quantitative analysis
Discovering the unknown          Substantiating the known
Uses words                       Uses numbers
Concerned with meanings          Concerned with behaviour
Induces hypothesis from data     Begins with hypothesis
Case studies                     Generalisation

While the type of analysis is typically dictated by the format of the data source (e.g. ratings and rankings vs open-ended questions), the definition of analysis for social media data is less distinct and is highly dependent on the initial objective and research question. Here are some key features of social media data that set it apart from conventional data:


1. High level of noise. While conventional data follows a set format in its collection phase (e.g. multiple choice questions from surveys, discussion content from focus groups, or performance data from web analytics), there is no set format that social media data fits into. Everyone speaks freely, using the lingo and jargon of their generation, which makes it difficult to impose fixed structures for analysis.


2. Multiple layers from a single datapoint. A single post can be dissected through multiple lenses depending on the kind of information the analyst is interested in. Be it the relevant keyword count, sentiment, perception, or even local nuances, the same dataset can provide a great level of depth depending on how it is segmented.


3. Representativeness. Unlike conventional data, where respondents provide direct information about their preferences and behavior, online users rarely post content with research in mind, hence the tight regulation from authorities to protect user privacy. While this limits the exhaustiveness of the conversations that can be tracked around certain topics, it usually does not interfere with the credibility of insights, so long as the objective of the study stays within the scope of social analytics.
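
To make the second point above concrete, the sketch below reads one post through several simple lenses at once. It is a minimal illustration only: the function name describe_post, the chosen lenses, and the sample post are all hypothetical, and lenses such as sentiment or perception are left out because they would need a separate model or manual coding.

```python
import re


def describe_post(text, keywords):
    """Return several simple views of the same post."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {
        # How often each tracked keyword appears as a whole word
        "keyword_counts": {k: words.count(k.lower()) for k in keywords},
        # Topics the author chose to tag
        "hashtags": re.findall(r"#\w+", text),
        # Accounts the author addressed directly
        "mentions": re.findall(r"@\w+", text),
        # A crude measure of post length
        "word_count": len(words),
    }


# Hypothetical sample post
post = "Loving the camera on my new iPhone! #Apple @AppleSG"
for lens, value in describe_post(post, ["iPhone", "camera"]).items():
    print(lens, value)
```

Each key in the returned dictionary corresponds to a different way of segmenting the same dataset, which is why the research question should decide which lenses are worth computing.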