Socializing Insights with end users: Analytics for masses – Amazon vs. LinkedIn

This blog post compares amazon.com and linkedin.com in terms of their analytic maturity and their use of the data shared by their customers. As Thomas Davenport mentions in his book “Competing on Analytics”, amazon.com is one of the few companies built on a foundation of data, a so-called “analytically mature” company. LinkedIn has joined the list, with a lot of new features available to its users.

As customers interact with the site, they generate data about their liking for certain products or features. Companies like amazon.com and LinkedIn clearly understand how to leverage this information to make the interaction between the customer and the site even more valuable and relevant. The more data users are ready to share with the site about their likes and dislikes, the better the site’s recommendations for them become. The companies need to instil this confidence in their customers’ minds, so that users share data willingly.

Amazon.com and LinkedIn make available every little fact about consumers’ behaviour and their interactions with other users or products, to help shape the decisions they make: whether or not to buy a product, whether to look for a change of employer, and so on.

LinkedIn has amazing insights about companies and profiles, all freely available to its users. In an interview with Andreas Weigend, LinkedIn CEO Reid Hoffman talks about every individual as a small business: individuals think of their reputation in terms of the number of new connections, who viewed their profile, how many times their profile came up in search results, and similar stats. Andreas Weigend, a social data expert, talks about the behavior change that features like ‘who viewed your profile in the last 15 days’ bring about in end users and in the way companies like LinkedIn treat their users.

(i) Insights about companies (let's say we are researching the company mu-sigma):

  • Employee switching patterns between companies: employees who moved from ‘xyz’ to mu-sigma.
  • Employee switching patterns between companies: employees who moved from mu-sigma to “abc”.
  • Gender distribution: M to F ratio at mu-sigma.
  • Years of experience: how mu-sigma differs from other companies. Similar statistics are available by job function, educational qualification and university, along with a benchmark against similar companies.
  • People who looked at “mu-sigma” also viewed: a list of other companies.
  • Where employees of mu-sigma call home.
  • Most recommended at mu-sigma.
  • Time trend of employees who got a change in title.

(ii) Insights about profiles/users

  • Who viewed your profile in the last 15 days?
  • How many times did your profile show up in search results?
  • Recommendations of other profiles/users you might know.
  • Companies you might be interested in following.
  • Relevant jobs for every user, with the functionality to apply for them.
  • Work recommendations by colleagues and customers.

Here’s a look at what amazon.com offers. When purchasing a product at amazon.com, the user is presented with stats such as:

  • How many users who searched for the book “Outliers” by Malcolm Gladwell (say) ended up purchasing it, or ended up purchasing “The Tipping Point”, “Blink”, “What the Dog Saw”, etc., listed in that order. However, I feel that quantifying this would help: for example, calling out that 80% of people who searched for “A” ended up purchasing “B”, or that 80% of people who searched for “A” ended up purchasing “A”.
  • “Frequently bought together” items for a given product.
  • Review statistics: how many users rated the product 5-star, 4-star and so on, shown as a bar chart.
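To make the “80% of people who searched for A ended up purchasing B” idea concrete, here is a minimal sketch in Python. The session log, the book counts and the function name `purchase_shares` are all made up for illustration; this is not how amazon.com actually computes or stores these numbers.

```python
from collections import Counter, defaultdict

# Hypothetical session log: (book searched for, book eventually purchased).
sessions = [
    ("Outliers", "Outliers"),
    ("Outliers", "Outliers"),
    ("Outliers", "The Tipping Point"),
    ("Outliers", "Blink"),
    ("Blink", "Blink"),
]

def purchase_shares(sessions):
    """Return {searched: {purchased: share of that search's sessions}}."""
    counts = defaultdict(Counter)
    for searched, purchased in sessions:
        counts[searched][purchased] += 1
    return {
        searched: {item: n / sum(bought.values()) for item, n in bought.items()}
        for searched, bought in counts.items()
    }

shares = purchase_shares(sessions)
# List purchases for one search term, most frequent first.
for item, share in sorted(shares["Outliers"].items(), key=lambda kv: -kv[1]):
    print(f"{share:.0%} of users who searched for Outliers bought {item}")
```

Showing the share next to each title is what turns an ordered list into a quantified one.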

We are moving towards an era of socializing data with end users, so that every little decision they make is data driven. WordPress, Netflix, Glassdoor, etc. are some of the other companies geared towards this trend. The intention behind collecting data has truly gone beyond marketing purposes.

Market Basket Analysis/Association Rule Mining using R package – arules

In my previous post, I discussed Association rule mining in some detail. This post shows an implementation of the concept in the open source tool R, using the package arules. Market Basket Analysis is a specific application of Association rule mining, where retail transaction baskets are analysed to find the products which are likely to be purchased together. The output of the analysis forms the input for recommendation engines and marketing strategies.
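As a rough illustration of what such an analysis computes, here is a small plain-Python sketch that scores single-item rules (A => B) by support, confidence and lift. It does not use or mimic the arules API; the baskets, the thresholds and the function names `count` and `pair_rules` are assumptions made up for this example.

```python
from itertools import permutations

# Hypothetical retail baskets, one set of items per transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

def count(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets)

def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence, lift) for rules lhs => rhs."""
    n = len(baskets)
    items = sorted(set().union(*baskets))
    rules = []
    for lhs, rhs in permutations(items, 2):
        n_both = count({lhs, rhs}, baskets)
        support = n_both / n                     # P(lhs and rhs)
        if support < min_support:
            continue
        confidence = n_both / count({lhs}, baskets)   # P(rhs | lhs)
        if confidence < min_confidence:
            continue
        lift = confidence / (count({rhs}, baskets) / n)  # confidence / P(rhs)
        rules.append((lhs, rhs, support, confidence, lift))
    return rules

rules = pair_rules(baskets)
for lhs, rhs, support, confidence, lift in rules:
    print(f"{lhs} => {rhs}: support={support:.2f} "
          f"confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the two items appear together more often than they would if purchased independently; real packages also handle rules with more than one item on each side, which this sketch leaves out.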

Beyond BI & Analytics

For the last 6 months, I have been closely following trends in information management. Below are a few of my observations.

  • Data source explosion: Business problems are gaining complexity day by day, hence there is a huge demand for analyzing data from a multitude of sources to help companies frame strategies for growth. GPS data accumulated by telecom companies offers insights into customers’ current location and enables context-aware recommendations. In fact, some telecom companies have introduced location-based pricing. Sensor data helps identify security threats to secure networks. Social network data has opened up a channel for marketing services and products. Analysis of such closely knit data leads to behavioral and contextual targeting. Traditional data analysis tools and algorithms fail to perform efficiently because such data sets are huge and need newer data structures for efficient analysis.
  • Databases going beyond the relational model are gaining popularity: NoSQL databases and graph/tree/XML-based databases.
  • Open source tools continue to emerge (R, RapidMiner, Weka).
  • Growing need for massive dataset analysis.
  • Artificial Intelligence (AI) and NLP are gaining popularity among data analysts (in addition to ML techniques).
  • Multimedia analytics: the need to gather critical metrics like customer footfall, or to quantify customer satisfaction using facial expressions. All these applications demand high-end signal processing (both image and video). There is a lot of scope for innovation in this area.
  • Privacy-preserving techniques for data analysis. These in turn encourage companies to outsource some of their critical data analysis to third parties.
  • Agile methodologies for analytics projects, to cope with rapidly changing customer/business needs.
  • Bio-inspiration/bio-imitation: learning from nature and natural processes to develop analogous techniques that could potentially solve a real-world problem. Some classic examples are the development of neural networks inspired by the working of the human brain, solving path optimization problems by learning from ant colonies, the 280 degree view of the honey bee (vision), etc.
  • More and more data is made publicly available.
  • Real-time data integration, insight generation and business decisions.
  • Complex visualization techniques through newer technologies like Adobe Flex, MS Silverlight, etc., which are known for building RIAs (Rich Internet Applications).

I am sure these are just a few items on the list, which is far from exhaustive. Feel free to share your comments.

Datamining Video Lectures – Best way to learn

  Do you find analytics/data mining a difficult topic to understand and learn? To a certain extent that is true if you use books as your only source. Friends, I found these two very valuable, high-quality sources for learning topics related to data mining, and above all they are free.

  (i) From David Mease, who teaches DM at Google: You can access approximately 11 hours of video (11 parts) on the semester topic “Statistical Aspects of Data Mining” here http://video.google.com/videosearch?q=mease+stats+202&sitesearch=# and you can also get a pdf version of the lecture slides and assignments; try to solve them and master them. I believe the author also has a blog for discussing problems in this topic. The best thing about this video tutorial is that David demonstrates the implementation of each of these techniques using the open source data mining tool R.

Videos: http://video.google.com/videosearch?q=mease+stats+202&sitesearch=#
Lecture notes -pdf : http://www.stats202.com/original_index.html
Course Home: http://www.stats202.com/

  (ii) From Stanford University, where Andrew Ng teaches “Machine Learning”: This is another very useful video course. The semester course is covered in 20 parts, so approximately 20 hours of quality knowledge. The best thing about Andrew is that he teaches the mathematics so well that you start visualizing the equations, and that is one good way to learn maths. It's not just about maths: he also shows video demos of machine learning projects implemented by his students, like autonomous car driving, autonomous flying, converting a picture to a 3-d experience, etc., so you don't get bored at any time during the lecture. I loved it a lot. Hope you enjoy it too.

videos: http://www.youtube.com/view_play_list?p=A89DCFA6ADACE599&search_query=stanford+%2B+machine+learning
Lecture Notes:  http://www.stanford.edu/class/cs229/materials.html
Course Home: http://cs229.stanford.edu/

   I am sure you will find more content than what I have mentioned here. Feel free to explore the course pages. I personally believe anything is learnt best by first learning its applications, which in the process gets you motivated, and the rest is assured. I would like to thank Andrew Ng and David Mease for sharing their expertise. A good initiative by Stanford. Expecting more from top educational schools.

Association Rule Mining

Association Rule Mining [ Implementation using R here]

Association Rule mining is one of the classical DM techniques. It is a very powerful technique for finding patterns in a data set. It is generally regarded as an unsupervised learning technique: we feed the association algorithm a data set of transactions (called the Experience E in the machine learning context) with no target labels, and it formulates hypotheses (H) in the form of rules. The input data to an association rule mining algorithm requires a particular format, which will be detailed shortly.
Let me first introduce readers to some of the application areas of this DM technique and the motivation for studying association analysis. The classic application of association rule mining is to analyse the market basket data of a retail store. For example, retail stores like Wal-Mart, Reliance Fresh and Big Bazaar gather data about customer purchase behaviour, and they have complete details of the goods purchased as part of a single bill. This is called market basket data and its analysis is termed “market basket analysis”.
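The excerpt above does not show the input format itself, so the following is only one common way to picture market basket data, not necessarily the format the full post describes: each bill becomes a row of 0/1 flags, one per item. The bill ids, the goods and the function name `to_incidence_matrix` are hypothetical.

```python
# Hypothetical bills: bill id -> goods purchased on that single bill.
bills = {
    "bill-1": ["bread", "butter", "milk"],
    "bill-2": ["bread", "jam"],
    "bill-3": ["milk", "butter"],
}

def to_incidence_matrix(bills):
    """Return (sorted item names, {bill id: 0/1 row aligned with those items})."""
    items = sorted({item for goods in bills.values() for item in goods})
    rows = {
        bill_id: [int(item in goods) for item in items]
        for bill_id, goods in bills.items()
    }
    return items, rows

items, rows = to_incidence_matrix(bills)
# Print a small table: one row per bill, one column per item.
print("bill".ljust(8), *(item.ljust(7) for item in items))
for bill_id, row in rows.items():
    print(bill_id.ljust(8), *(str(flag).ljust(7) for flag in row))
```

Each column then answers “which bills contained this item?”, which is the kind of question an association algorithm asks repeatedly.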
