Case study

Trading & Generation Data Platform for a Tier 1 Energy Company

Company

Tier 1 Energy Company

Industry

Energy & Utilities

Challenge

Complex ETL across multiple data sources with multi-stage deployment requirements

Impact

40+ data objects deployed to production with real-time and batch ingestion

A Tier 1 Australian energy company needed to build a robust data platform for its trading and generation division. The project required ingesting and transforming data from multiple source systems, including Infoserver, EDS, and BDS, into a unified, governed data environment that could support trading decisions and generation operations.

SISU Solutions placed a senior data engineer to lead the technical delivery, navigating complex ETL processes across diverse data objects, managing schema changes across environments, and building the automation needed to deliver reliably at scale.

The challenge

Complexity at every layer of the stack

The project demanded implementing complex ETL processes for a diverse mix of data objects, including both tables and views, from multiple source systems. Each source had its own schema conventions, data types, and update frequencies, requiring careful mapping and transformation logic to unify them into a consistent target model.

Managing schema changes and data type conversions across multiple environments (UAT, preproduction, and production) added significant coordination complexity. Each deployment needed to be validated independently, and slowly changing dimension (SCD) type validation had to be implemented to ensure historical data integrity.
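
For concreteness, an SCD check of this kind might assert that each business key has exactly one current row and that validity windows never overlap. The sketch below uses PySpark against a hypothetical SCD Type 2 dimension; the table and column names are illustrative assumptions, not the engagement's actual schema.

```python
# Sketch of an SCD Type 2 integrity check. All table and column names are
# illustrative assumptions, not the project's actual schema.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()
dim = spark.table("uat.dim_trading_unit")  # hypothetical SCD Type 2 dimension

# Rule 1: each business key must have exactly one current row.
bad_current = (
    dim.filter(F.col("is_current"))
       .groupBy("business_key").count()
       .filter(F.col("count") != 1)
)

# Rule 2: validity windows for a key must not overlap.
w = Window.partitionBy("business_key").orderBy("valid_from")
overlapping = (
    dim.withColumn("next_valid_from", F.lead("valid_from").over(w))
       .filter(F.col("next_valid_from") < F.col("valid_to"))
)

assert bad_current.count() == 0, "keys with zero or multiple current rows"
assert overlapping.count() == 0, "overlapping validity windows detected"
```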

The integration of real-time market data from Thomson Reuters alongside batch data from internal systems required an event-driven architecture that could handle both patterns reliably. Performance optimisation for large-scale data processing was critical, as trading decisions depended on timely, accurate data availability.

The solution

Modular, metadata-driven engineering

The SISU team developed modular transformation logic using SQL and Databricks, with a metadata-driven approach powered by YAML configuration files. This pattern allowed the team to define transformation rules declaratively, making it straightforward to add new data objects without writing bespoke code for each one.
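
To illustrate the pattern (the configuration schema below is a hypothetical example, not the project's actual format), a YAML entry can declare an object's source, target, filter, and column mappings, and a small generic loader applies it:

```python
# Sketch of a metadata-driven transformation: a declarative YAML entry drives
# a generic loader, so new data objects need configuration rather than bespoke
# code. The schema, object names, and tables shown are illustrative assumptions.
import yaml  # requires PyYAML
from pyspark.sql import SparkSession, functions as F

CONFIG = """
object: trade_positions
source: infoserver_raw.positions
target: curated.trade_positions
filter: "QTY_MW IS NOT NULL"
columns:
  - {source: POS_ID, target: position_id, cast: bigint}
  - {source: TRD_DT, target: trade_date, cast: date}
  - {source: QTY_MW, target: quantity_mw, cast: "decimal(18,6)"}
"""

def apply_object_config(spark, cfg):
    """Read the source table, then apply the declared filter and column mappings."""
    df = spark.table(cfg["source"])
    if cfg.get("filter"):
        df = df.filter(cfg["filter"])
    return df.select(*[
        F.col(c["source"]).cast(c["cast"]).alias(c["target"])
        for c in cfg["columns"]
    ])

spark = SparkSession.builder.getOrCreate()
cfg = yaml.safe_load(CONFIG)
result = apply_object_config(spark, cfg)
result.write.format("delta").mode("overwrite").saveAsTable(cfg["target"])
```

Under this pattern, onboarding a new data object amounts to adding another YAML entry, which is what keeps the approach scalable across 40+ objects.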

Azure DevOps was used for version control and CI/CD pipeline management, with AWS S3 handling artefact and metadata storage. Databricks provided the data processing engine, with Delta Lake enabling versioned storage that supported both auditing and rollback. Custom DDL runners were built for environment-specific deployments, ensuring consistency across the promotion path.
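
A runner of this kind typically applies versioned DDL scripts in order, substituting environment-specific values before execution. The following is a simplified sketch; the directory layout, placeholder syntax, and catalog names are assumptions for illustration:

```python
# Simplified sketch of an environment-aware DDL runner: numbered scripts are
# applied in sequence with per-environment placeholders substituted first.
# Paths, placeholder names, and catalogs are illustrative assumptions.
from pathlib import Path
from pyspark.sql import SparkSession

ENV_SETTINGS = {
    "uat":  {"catalog": "uat_catalog",  "storage_root": "s3://data-platform/uat"},
    "prep": {"catalog": "prep_catalog", "storage_root": "s3://data-platform/prep"},
    "prod": {"catalog": "prod_catalog", "storage_root": "s3://data-platform/prod"},
}

def run_ddl(spark, env, ddl_dir="ddl"):
    settings = ENV_SETTINGS[env]
    # Scripts named 001_create_x.sql, 002_alter_y.sql, ... sort into run order.
    for script in sorted(Path(ddl_dir).glob("*.sql")):
        sql = script.read_text()
        for key, value in settings.items():
            sql = sql.replace("${" + key + "}", value)
        for statement in filter(None, (s.strip() for s in sql.split(";"))):
            spark.sql(statement)

spark = SparkSession.builder.getOrCreate()
run_ddl(spark, env="uat")  # same scripts promoted unchanged through to prod
```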

Data validation and unit testing processes were embedded into the pipeline, catching issues before they reached production. An event-driven architecture was implemented for Thomson Reuters data ingestion, ensuring real-time market data flowed through to the trading team with minimal latency. Monitoring and alerting capabilities were built in from the start, giving operations visibility into pipeline health and data freshness.
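
On Databricks, event-driven file ingestion of this shape is commonly built on Structured Streaming with Auto Loader, which picks up new files as they land and streams them into Delta tables. The sketch below shows the general pattern; the landing path, schema, and table names are illustrative, not the project's actual ones:

```python
# Sketch of event-driven market data ingestion using Spark Structured
# Streaming with Databricks Auto Loader ("cloudFiles"). Paths, schema, and
# table names are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("ric", StringType()),         # Reuters Instrument Code
    StructField("price", DoubleType()),
    StructField("event_time", TimestampType()),
])

stream = (
    spark.readStream.format("cloudFiles")
         .option("cloudFiles.format", "json")
         .schema(schema)
         .load("s3://data-platform/landing/reuters/")  # hypothetical path
)

(stream.writeStream
       .format("delta")
       .option("checkpointLocation", "s3://data-platform/checkpoints/reuters/")
       .toTable("curated.market_prices"))
```

Because the write lands in a Delta table, the same table can feed both the trading consumers and the freshness monitoring described above.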

Results

40+ data objects in production
3 source systems unified
Real-time market data ingestion via event-driven architecture

Explore another case study to see how we deliver results, or get in touch to discuss your project.