Learn Analytics using a business case study : Part I
Best way to learn analytics is through
experience and solving case studies. Here, I will present you a complete
business model and take you through a step by step process of how
analytics is set up in a new business, how is it used in daily processes
and some of the advanced analytics techniques which a business can use
to make meaningful segmentation and prediction to optimize its
marketing & sales campaign.
Background :
You have recently started a Video CD
rent shop. After 2 months you realize that there is tough competition in
the market and you need to make a more customer centric strategy to
stand out in the market. Hence, you want to collect the most granular
details of your customer behavior and build strategy accordingly.
Business case layout :
This business case have been broken down
into 3 articles. Following is a plot of the articles and each article
will be strongly dependent on findings in the previous articles:
1. How do you collect data so as to capture all the important information?
2. Deep dive into customer behavior and
using basic data analysis with business knowledge to optimize daily
operations
3. How do you use data with advanced analytics to make your marketing/sales startegies more targeted?
Case study part I :
Did you ever wonder why do you deal with
so many datamarts in your company. Let’s try to understand as the owner
of the busienss what all data sources do you need.
1. Transactions Table :
You rent out Video CDs and the most
important data for you will be transactional data. Transactional data is
by far the richest data throughout all industries. Each row in
transactional data corresponds to one transaction made. This
transaction mostly are monetary transaction. To identify each
transaction, you need a distinct transaction code associated with each
transaction. What other fields can you think of to be captured along
with each transaction. Following is a small list of such variables :
1. Transaction ID
2. Customer ID : Identifying the customer to whom you have rented out the CD
3. Rent due : How much does the customer need to pay as rent
4. Issue date : When was the movie rented out
5. Recieved date : When was the movie recieved. Blank if CD is still due
6. Movie ID : Identifying the movie
Following is a sample transactional data set :
2. Product Table :
If you have transaction table, you
basically have the linkage between the customers and the products. But
why does transaction table not have the discription of products? The
simplest reason for the same is that total number of products are
limited in any industry, and the same product is repeated throughout the
transactions table. If we add description in every single line, it adds
enormously to the overall size of transaction table, which anyway is
huge. Hence, we keep the products table seperate and merge it with
required transactions for specific analysis.
Product table is unique on product id,
which maps to transactions table. What other parameters can you think of
that make sense for you to include? Following is a list of possible
variables :
1. Movie ID : Unique ID of movie
2. Origin yearmonth : When was the CD bought?
3. Genre of movie
4. Language of movie
5. Star Cast of movie
6. Movie name
Note : Product
ID generally can be decoded to know product details. For example, here H
denotes “Hindi” and E denotes “English”. This coding makes the analysis
simpler.
3. Customer Table :
The other hand of transaction table is
the customer table. Using the above two tables, you almost have
everything except the details of the customer. While making any kind of
customer centric strategy, its very essential to consider the customer
profile.This table helps you find the customer profile. This table is
unique on customer id. What other parameters can you think of that make
sense for you to include? Following is a list of possible variables :
1. Customer ID : Unique ID of customer
2. Age
3. Gender
4. Area : Area where the customer lives
5. Package : Package customer has enrolled to
6. Enrol_date : When did the customer make his first transaction
7. Name
Note : Similar to Product ID, Customer ID also generally can be decoded to know customer details.
4. Engagement Tracker :
All the three tables together can be
used to create any kind of analysis to build marketing and sales
strategy. What they do not cover is the engagement you had with your
customers till date. Say, I called Kunal 1 week back to tell him about a
movie X. Now, it might not be the best idea to call Kunal again this
week to tell about the same movie. Hence, we need to keep a track on all
kinds of engagement we have with our customer on daily basis. This is
similar to transactions table but this include all the non-monetary
interactions we have with out customers till date. These interactions
can be inbound or outbound. This table is unique on engagement_id. What
other parameters can you think of that make sense for you to
include? Following is a list of possible variables :
1. Engagement id : Unique engagement identification
2. Inbound flag : 1 if inbound, 0 if outbound
3. Type of engagement : Code for the type of engagement
4. Product ID : ID of product in question
5. Derived tables :
Because the data sizes become huge with
time, it is always recommended to keep some monthly snapshots handy. One
of such table can be transactions data rolled up at customer level.
Following is a list of such possible variables :
1. Customer ID
2. # lifetime transactions : No. of transactions made by customer till date
3. Enrol date
4. # English Movies : No. of English movies taken by customer till date
5. Last engement date
6. Last transaction date
Such derived tables come very handy to
make quick analysis. Say, you have acquired 10 new english movies and
want to market them. You might want to market these movies to customers
who watch english movies, who responded to recent engagements and who
have done recent transactions. For such a targeting list, imagine the
process you might need to follow. Following is a possible way to achieve
the same :
Imagine how easy this analysis gets if you have the derived monthly snapshot handy.
Graph schemas:
The article till now focuses on use of
traditional relational databases. Graph based databases (e.g. Neo4j) are
a strong alternate to these traditional databases. They add a lot of
flexibility to your database, where you can change the schema very
easily.
This kind of flexibility is required in
case your data formats can change and you can not have much control on
it. Also, you can add new structures and relationships very quickly.
Before we go in these details, a typical graph schema in this case would
look something like:
Blue nodes represent customers, Red
represent movies and Green represents various package available. Every
edge is a relationship in between nodes. For example, if a customer
rents out a movie, we can draw an edge between the 2.
Now by calculating things like number of
edges from a node, you can look at things like most active customer,
most rented and least rented movies. You can also start looking at what
kind of customers are renting what kind of movies.
P.S. Like all data
model designs, there are various alternates to this design and you
should choose the best depending on your usage.
End Notes :
We discussed relational database and
graph database for representing a typical business problem. The data
tables we discussed in this article is almost parallel to datamarts in
any industry. We will look at some interesting strategies which can be
derived using these data sources for the CD rental business case. Some
of these strategies which are very basic in nature and needs more of
business sense than modelling will be discussed in the next article.
This will make you understand how effective strategies can be built if
you mix business knowledge with simple data analysis.Knowledge of data
is very essential regardless of the industry you work for.
Did you find the article useful? Share
with us any other problem statements you can think of. Do let us know
your thoughts about this article in the box below.
No comments:
Post a Comment