In this talk, I discuss how Hadoop helps us narrow more than a billion potential matches down to several highly compatible matches for each of our users every day.
eHarmony was founded to give people a better chance at finding happy, passionate, and fulfilling relationships. Did you know that we are already responsible for 5% of all new US marriages, and that more than 600,000 people met their spouses on eHarmony?
During this talk I describe how we go about creating highly compatible matches, and how we leverage Big Data technologies to accomplish that goal.
Specifically, I discuss how we take the billion-plus potential matches that we find through MongoDB, store them in a Voldemort NoSQL datastore, and then run multiple Hadoop jobs to produce a filtered list based on machine-learned models.
Our Hadoop clusters are in-house, high-density, low-power SeaMicro installations, and we use Spring Batch and Spring Data Hadoop to orchestrate the Hadoop jobs.
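To make the filtering step more concrete, here is a minimal, in-memory sketch of the map/reduce pattern that such a job could follow: score every potential match, then keep only the best few per user. It is not eHarmony's actual code. The feature names, the `toy_score` function (a stand-in for a machine-learned model), and the `top_n` cutoff are all assumptions made for illustration, and a real job would run on Hadoop rather than in a single Python process.

```python
import heapq
from collections import defaultdict


def toy_score(features):
    # Hypothetical placeholder for a machine-learned compatibility model.
    return features["shared_interests"] / (1.0 + features["distance_km"])


def map_phase(candidates, score_fn):
    """Map step: emit (user_id, (score, candidate_id)) for each potential match."""
    for user_id, candidate_id, features in candidates:
        yield user_id, (score_fn(features), candidate_id)


def reduce_phase(scored, top_n):
    """Reduce step: group by user and keep the top_n highest-scoring candidates."""
    grouped = defaultdict(list)
    for user_id, scored_candidate in scored:
        grouped[user_id].append(scored_candidate)
    return {
        user_id: [cid for _, cid in heapq.nlargest(top_n, items)]
        for user_id, items in grouped.items()
    }


# Made-up sample data: (user_id, candidate_id, features).
potential_matches = [
    ("alice", "bob", {"shared_interests": 5, "distance_km": 10}),
    ("alice", "carl", {"shared_interests": 3, "distance_km": 0}),
    ("alice", "dave", {"shared_interests": 1, "distance_km": 100}),
    ("bob", "alice", {"shared_interests": 5, "distance_km": 10}),
]

matches = reduce_phase(map_phase(potential_matches, toy_score), top_n=2)
print(matches)
```

In this toy run, each user ends up with at most two candidates, ordered from most to least compatible under the placeholder score.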
View the slides here…