ARCHITECTURE FOR REAL ESTATE PLATFORM
Client: EU-based Real Estate Platform
Project Duration: 3 months
Goal: Build a real estate platform that provides information on all real estate in the country and includes a personalized property recommendation engine. Provide targeted recommendations to users in real time based on their interests and user profile.
Tech: Apache Kafka, Kafka Streams, Apache Spark (Streaming and MLlib), Apache Cassandra, Elasticsearch
The existing product based its recommendations on clustering of the properties, so user profile and search history had no impact on the results. This was sufficient for a while, but moving forward was inevitable. Once the competition started doing something similar, our client needed to move to a better-suited architecture to provide more precise recommendations to its users.
We needed to create an infrastructure that would store millions of properties from across the country and also collect user data and clickstream events in order to build user profiles. Property information was collected from various sources (a government API, scraping of competitors' websites, public APIs and proprietary data sources) and had to be structured so that it could be processed.
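As a rough illustration of what structuring data from several sources can look like, the sketch below maps raw records onto one common shape. The schema, the source names and every field name are assumptions made for this example, not the client's actual data model.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class PropertyRecord:
    """Common shape for a property, whatever its source (hypothetical schema)."""
    source: str
    external_id: str
    city: str
    price_eur: Optional[float]
    area_m2: Optional[float]


def _to_float(value: Any) -> Optional[float]:
    """Parse numbers that may arrive as '120 000', '85,5', 120000 or None."""
    if value is None:
        return None
    if isinstance(value, (int, float)):
        return float(value)
    cleaned = str(value).replace(" ", "").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None  # Unparseable values are kept as missing, not guessed.


# Per-source mapping: common field -> field name used by that source.
# Both the sources and the field names are invented for illustration.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "government_api": {
        "external_id": "parcel_id",
        "city": "municipality",
        "price_eur": "valuation",
        "area_m2": "surface",
    },
    "scraper": {
        "external_id": "listing_url",
        "city": "location",
        "price_eur": "price",
        "area_m2": "size",
    },
}


def normalize(source: str, raw: Dict[str, Any]) -> PropertyRecord:
    """Translate one raw record into the common PropertyRecord shape."""
    mapping = FIELD_MAPS[source]
    return PropertyRecord(
        source=source,
        external_id=str(raw[mapping["external_id"]]),
        city=str(raw.get(mapping["city"], "")).strip().title(),
        price_eur=_to_float(raw.get(mapping["price_eur"])),
        area_m2=_to_float(raw.get(mapping["area_m2"])),
    )
```

Keeping the per-source differences in a mapping table means that adding a new source is mostly a matter of adding one more entry, while everything downstream only ever sees PropertyRecord.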
With all requirements in mind, we decided to base our data pipeline on Apache Kafka. Kafka Streams is used for light data processing, while Apache Spark processes both historical and real-time data to build user profiles and run the recommendation algorithms. With Apache Cassandra as our main storage, we were able to leverage its time series capabilities as well as its integration with Apache Spark. All raw and processed data required by the UI is pushed into Elasticsearch so that it can be indexed and freely queried.
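To illustrate the idea behind profile-driven recommendations, here is a minimal sketch in plain Python. It is not the Spark or Kafka Streams code used in the project. The event types, the weights, the features and the scoring rule are all assumptions chosen to keep the example small.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Hypothetical weights: saving a listing signals more interest than viewing it.
EVENT_WEIGHTS = {"view": 1.0, "search": 2.0, "save": 5.0}


def build_profile(events: Iterable[Dict[str, str]]) -> Dict[str, float]:
    """Aggregate clickstream events into interest weights that sum to 1."""
    totals: Counter = Counter()
    for event in events:
        weight = EVENT_WEIGHTS.get(event["type"], 0.0)
        # Each event contributes to one city feature and one property-kind feature.
        for feature in (f"city:{event['city']}", f"kind:{event['kind']}"):
            totals[feature] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # No usable signal yet, so the profile is empty.
    return {feature: value / grand_total for feature, value in totals.items()}


def score(profile: Dict[str, float], prop: Dict[str, str]) -> float:
    """Sum the profile weights of the features this property matches."""
    return (
        profile.get(f"city:{prop['city']}", 0.0)
        + profile.get(f"kind:{prop['kind']}", 0.0)
    )


def recommend(
    profile: Dict[str, float], properties: List[Dict[str, str]], top_n: int = 3
) -> List[str]:
    """Return the ids of the top_n properties ranked by profile score."""
    ranked = sorted(properties, key=lambda p: score(profile, p), reverse=True)
    return [p["id"] for p in ranked[:top_n]]


events = [
    {"type": "view", "city": "Berlin", "kind": "flat"},
    {"type": "save", "city": "Berlin", "kind": "flat"},
    {"type": "view", "city": "Munich", "kind": "house"},
]
properties = [
    {"id": "a", "city": "Munich", "kind": "house"},
    {"id": "b", "city": "Berlin", "kind": "flat"},
    {"id": "c", "city": "Berlin", "kind": "house"},
]
profile = build_profile(events)
top_two = recommend(profile, properties, top_n=2)  # ["b", "c"]
```

In a pipeline like the one described, the aggregation step would be a natural fit for the stream and batch processing layers, with the resulting profiles stored alongside the property data. The sketch only shows the shape of the computation.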
Building a better recommendation engine was one of the goals of this engagement, and it created a gap in the market between our client and the competition. With user profiles giving more context to the searches executed on the system, the client was able to enrich its data with that context. This proved to be a good business decision, since the client extended its business by also selling its data.