My Data Life

The first thing I did to develop this post recommender was to come up with some KPIs, so that we have a measure of success after we put it into production. The data analyst I work with, Kim, was really helpful in getting me started with Periscope and helping to set up some queries for our KPI dashboards. We came up with several metrics, the most important being average click-through rate for related posts. Hopefully we’ll see some significant lift in this metric after the engine is online.
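To make the main KPI concrete, here is a minimal sketch of how an average click-through rate could be computed. It assumes we have, for each post, a count of how many times its related-posts section was shown (impressions) and how many times a related post was clicked. The function name, the tuple layout, and the sample numbers are all hypothetical; our real dashboard queries live in Periscope and may define the metric differently.

```python
def average_ctr(rows):
    """Mean per-post click-through rate.

    rows: iterable of (post_id, impressions, clicks) tuples.
    This layout is an assumption for illustration only.
    """
    rates = []
    for post_id, impressions, clicks in rows:
        # Skip posts whose related-posts section was never shown,
        # so we never divide by zero.
        if impressions > 0:
            rates.append(clicks / impressions)
    # With no usable rows, report 0.0 rather than raising.
    return sum(rates) / len(rates) if rates else 0.0


# Made-up sample data, not real traffic numbers.
sample_rows = [
    ("post-a", 100, 10),  # CTR 0.10
    ("post-b", 200, 10),  # CTR 0.05
    ("post-c", 0, 0),     # never shown, ignored
]
print(average_ctr(sample_rows))
```

Averaging per-post rates weights every post equally. Dividing total clicks by total impressions instead would weight high-traffic posts more heavily, so it's worth deciding which one the dashboard should show.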

Dashboards, SQL, KPIs: Oh My!

I’ve mentioned this before, but I didn’t fully grasp how important and useful SQL is for data analysis, storage, and retrieval until I had to use it every day for nearly everything I do. PostgreSQL is so convenient. The Postico SQL client is pretty straightforward to use, and I’m learning to do so many more things with SQL than I ever knew were possible. Storing complex data in JSON format is probably my new favorite trick, since it’s almost like combining the best parts of SQL and NoSQL databases.

After figuring out which KPIs made the most sense, we (the data and analytics department) discussed what sorts of data we wanted to use to make recommendations. For the first iteration, we decided to use an NLP technique, entity extraction, to link posts by topic (similar to topic modeling, but slightly different). In the future, we’d like to have the option to give more weight to posts related to sponsored content, to take user preferences into account, to make time-sensitive recommendations (like for film festivals or upcoming movies), and much, much more. For now, we’re starting simple: we relate posts according to the entities we extract with Dandelion’s Entity Extraction API, which give us our similarity vectors. Let the recommending begin!
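As a rough illustration of the idea, here is a minimal sketch of relating posts by their extracted entities. It assumes each post has already been reduced to a dictionary mapping entity names to a weight (for example, a confidence score), and it compares posts with cosine similarity. The post IDs, entities, weights, and function names are all made up for the example, and it does not call the Dandelion API; the similarity measure we actually end up using may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse entity-weight dicts."""
    # Only entities present in both posts contribute to the dot product.
    shared = set(a) & set(b)
    dot = sum(a[e] * b[e] for e in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_posts(post_id, post_entities, top_n=3):
    """Return up to top_n (other_post_id, score) pairs, best match first."""
    target = post_entities[post_id]
    scored = [
        (other_id, cosine_similarity(target, entities))
        for other_id, entities in post_entities.items()
        if other_id != post_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical entity vectors; real ones would come from entity extraction.
post_entities = {
    "sundance-preview": {"Sundance Film Festival": 0.9, "Independent film": 0.7},
    "indie-roundup": {"Independent film": 0.8, "Sundance Film Festival": 0.4},
    "camera-review": {"Digital camera": 0.9},
}

for other_id, score in related_posts("sundance-preview", post_entities):
    print(other_id, round(score, 3))
```

Keeping the vectors as sparse dictionaries means posts with no entities in common score exactly zero, and later ideas like boosting sponsored content could be layered on by adjusting the weights before comparing.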