Case Studies



Client: EDI challenge proposed by Virtual Power Solutions SA.
Domain: Energy & Environment
Project Duration: 3+ months (ongoing)
Goal: HVAC usage management and control optimization
Tech: Kubernetes, Docker, Kafka, Spark, Python


Virtual Power Solutions (VPS) currently monitors more than 30,000 measurement points. A typical building installation monitors the main incoming circuit along with other circuits, such as HVAC (heating, ventilation and air conditioning) and lighting.

Approximately 15% of VPS measurement points are HVAC circuits with remote on/off control. These circuits account for a major part of total energy consumption, so there is an opportunity to use data mining and advanced control techniques to reduce waste without compromising comfort, and thus realize direct savings. Additionally, the volume of electrical power under management offers load-balancing opportunities in a virtual power plant (VPP) context.

As a challenge, VPS proposes the development of a model that uses the available historical consumption data and other attributes, such as location, to identify and predict consumption patterns, with the following objectives: (1) forecasting HVAC consumption; (2) detecting anomalous consumption; (3) managing load-shedding events.
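To make objective (2) concrete, here is a minimal sketch of one simple baseline for anomalous consumption detection: compare each 15-minute reading with the median of the same time-of-day slot and flag large robust deviations. This is an illustrative assumption, not the model developed for the challenge; the names `slot_of` and `find_anomalies`, the threshold and the `min_scale` floor are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def slot_of(ts):
    """Index of the 15-minute slot within the day (0..95)."""
    return ts.hour * 4 + ts.minute // 15


def find_anomalies(readings, threshold=5.0, min_scale=0.05):
    """Flag readings that deviate strongly from their time-of-day baseline.

    readings: list of (datetime, kwh) tuples for one circuit.
    threshold and min_scale (kWh) are illustrative values, not tuned ones.
    Returns a list of (datetime, kwh, score) for the flagged readings.
    """
    # Group consumption values by their slot of the day.
    by_slot = defaultdict(list)
    for ts, kwh in readings:
        by_slot[slot_of(ts)].append(kwh)

    # Robust baseline per slot: median and median absolute deviation (MAD).
    baseline = {}
    for slot, values in by_slot.items():
        med = median(values)
        mad = median([abs(v - med) for v in values])
        # 1.4826 rescales MAD to a standard deviation for normal data;
        # the floor avoids dividing by ~0 on very flat circuits.
        baseline[slot] = (med, max(1.4826 * mad, min_scale))

    flagged = []
    for ts, kwh in readings:
        med, scale = baseline[slot_of(ts)]
        score = abs(kwh - med) / scale
        if score > threshold:
            flagged.append((ts, kwh, score))
    return flagged
```

A per-slot baseline of this kind ignores weather, holidays and location, which is why the challenge calls for a model that takes such attributes into account.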


SmartCat is developing Optimus Power, a solution for “HVAC usage management and optimal control”, using the dataset provided by VPS together with additional open datasets and state-of-the-art reinforcement learning approaches.


We propose the solution architecture shown in Figure 1. Measurements arrive as streaming time series from the data provider’s smart meters at 15-minute intervals, through the provider’s cloud services, while weather and holiday data are collected from open APIs. For data ingestion we use Apache Kafka, whose rich set of connectors (HTTP, MQTT and various database connectors) lets us collect data directly from sensors, through cloud service endpoints or from databases. Kafka therefore provides flexibility as well as scalability, since it can scale to millions of messages per second and support a large number of consumers. For processing we use Apache Spark, an engine that handles fast and reliable stream processing in the same fashion as the batch processing used to (re)train machine learning models. To store the processed time series data we will use Apache Cassandra, which completes this fault-tolerant data flow. The solution is supported by Apache Airflow for scheduling, and by Python APIs and web apps for model scoring and dashboards.
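The enrichment step in this data flow can be sketched in plain Python: each 15-minute meter reading is joined with the weather observation for its hour and with a holiday flag. This is only a sketch of the shape of the join under stated assumptions. In the architecture described above, the work would be done by Spark on data arriving through Kafka; the record layout, the hourly weather granularity and the names `FeatureRow` and `build_features` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class FeatureRow:
    """One enriched reading, ready for storage or model scoring (hypothetical layout)."""
    meter_id: str
    timestamp: datetime
    kwh: float
    outdoor_temp_c: Optional[float]  # None when no weather value is available
    is_holiday: bool
    slot_of_day: int  # 0..95, one slot per 15-minute interval


def build_features(readings, hourly_temp, holidays):
    """Join meter readings with weather and holiday data.

    readings:    iterable of (meter_id, datetime, kwh) tuples
    hourly_temp: dict mapping a datetime truncated to the hour to a temperature in Celsius
    holidays:    set of datetime.date objects
    """
    rows = []
    for meter_id, ts, kwh in readings:
        # Weather is assumed to be hourly, so truncate the timestamp to the hour.
        hour_key = ts.replace(minute=0, second=0, microsecond=0)
        rows.append(
            FeatureRow(
                meter_id=meter_id,
                timestamp=ts,
                kwh=kwh,
                outdoor_temp_c=hourly_temp.get(hour_key),
                is_holiday=ts.date() in holidays,
                slot_of_day=ts.hour * 4 + ts.minute // 15,
            )
        )
    return rows
```

Keeping the join a pure function of its inputs means the same logic can, in principle, be applied both to live streams and to historical batches used for retraining.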
