Senior Databricks Engineer - Data Streaming & Optimization (Remote Contract)
Salary not available.
Korn Ferry, City of Westminster
- Remote working
- Full time
- Contract
Posted 2 days ago, 18 Jul
Job ref: 2709c0e004784ed8bccf6a356514b75c
Full Job Description
Duration: 6 months (initial), with possible extension.
Location: Remote/Hybrid (depending on the candidate's requirements).
Rate: Flexible (inside IR35).
We are seeking an experienced Databricks Engineer with deep expertise in data streaming pipelines, performance tuning, and cost optimization. You will work on an enterprise-scale crude and product data platform that processes large volumes of real-time transactional data from on-prem mainframe systems into the Azure cloud. The current implementation leverages Databricks Delta Live Tables (DLT) for raw data ingestion and transformation.
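For context, raw ingestion in this pattern might look like the minimal DLT sketch below, runnable only as a Databricks DLT pipeline (where `spark` is provided by the runtime). The storage path, table name, and Parquet file format are illustrative assumptions, not details taken from the role:

```python
# Minimal bronze-layer DLT sketch: ingest CDC change records that a tool such as
# Qlik Replicate has landed in ADLS, using Auto Loader for incremental file discovery.
import dlt
from pyspark.sql.functions import current_timestamp

# Hypothetical landing path -- the real container/account are not given in the role.
RAW_PATH = "abfss://landing@exampleaccount.dfs.core.windows.net/mainframe/trades/"

@dlt.table(
    name="bronze_trades",  # hypothetical table name
    comment="Raw CDC events from the on-prem mainframe, ingested as-is.",
    table_properties={"quality": "bronze"},
)
def bronze_trades():
    return (
        spark.readStream.format("cloudFiles")            # Databricks Auto Loader
        .option("cloudFiles.format", "parquet")          # assumes the CDC tool writes Parquet
        .option("cloudFiles.schemaLocation", RAW_PATH + "_schema/")
        .load(RAW_PATH)
        .withColumn("_ingested_at", current_timestamp()) # ingestion audit column
    )
```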
Key Responsibilities
- Pipeline Diagnostics & Reliability
  - Investigate and resolve data latency and inconsistency issues in streaming pipelines.
  - Troubleshoot root causes of intermittent DLT failures and optimize for stability.
- Performance Optimization
  - Tune Spark and Databricks configurations to reduce memory overflows and job crashes.
  - Improve cluster utilization, optimize job execution plans, and introduce partitioning/caching strategies.
- Cost Optimization
  - Analyze current compute cost drivers and propose strategies for efficient scaling.
  - Implement autoscaling, job parameterization, and cluster right-sizing to meet budget targets.
- Architecture Enhancement
  - Ensure the raw data layer is reliable and performant to support future silver and gold layers.
  - Collaborate with architects to define best practices for Delta Live Tables and streaming design patterns.
- Operational Excellence
  - Introduce monitoring, alerting, and observability for data pipeline health.
  - Build trust in the platform by delivering predictable and consistent performance.
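The observability responsibilities above pair naturally with DLT expectations, whose pass/fail counts land in the pipeline event log and can feed alerting. A brief sketch, reusing the hypothetical names from the earlier example:

```python
# Illustrative DLT expectations: declarative quality gates whose violation counts
# are recorded in the pipeline event log, giving a basis for monitoring and alerting.
import dlt

@dlt.table(
    name="bronze_trades_validated",  # hypothetical, building on the sketch above
    comment="Bronze records with basic data-quality gates applied.",
)
@dlt.expect_or_drop("valid_trade_id", "trade_id IS NOT NULL")  # drop violating rows
@dlt.expect("recent_event", "event_ts >= '2020-01-01'")        # warn-only: keep rows, record metric
def bronze_trades_validated():
    return dlt.read_stream("bronze_trades")
```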
Required Skills & Experience
- Deep experience with Databricks (including Delta Live Tables, Delta Lake, and Streaming Jobs).
- Strong Spark tuning expertise for performance and memory optimization.
- Solid knowledge of Azure Data Lake Storage (ADLS) and related Azure services.
- Experience with data ingestion from on-prem systems and CDC tools (e.g., Qlik Replicate).
- Proven ability to diagnose and resolve latency, reliability, and scalability issues in large-scale pipelines.
- Proficient in cluster sizing, autoscaling, and cost optimization strategies.
- Familiarity with data architecture principles for bronze, silver, and gold layers.
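As a concrete illustration of the cluster sizing, autoscaling, and cost optimization levers listed above, a Databricks Jobs API cluster specification might look like the following sketch; every value is a placeholder assumption to be tuned against actual workload metrics:

```python
# Sketch of a Databricks Jobs API "new_cluster" payload showing autoscaling and
# right-sizing levers; all values below are illustrative placeholders.
new_cluster = {
    "spark_version": "13.3.x-scala2.12",             # example LTS runtime
    "node_type_id": "Standard_E8ds_v5",              # memory-optimised node for heavy shuffles
    "autoscale": {"min_workers": 2, "max_workers": 10},
    "spark_conf": {
        # Let Adaptive Query Execution choose shuffle partition counts at runtime.
        "spark.sql.shuffle.partitions": "auto",
    },
    "azure_attributes": {
        "availability": "SPOT_WITH_FALLBACK_AZURE",  # spot instances with on-demand fallback
    },
}
```

Capping max_workers bounds spend while autoscaling lets the cluster shrink when the stream is quiet; spot instances with on-demand fallback trade cost against availability.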