DevOps Lead

Seeing Machines

Full Time

80 Mildura St, Fyshwick ACT 2609, Australia

5 months ago

Main Purpose of Role

The DevOps Lead ensures the reliability, scalability, and performance of critical systems and services. This role bridges development and operations, fostering a culture of automation, resilience, and continuous improvement. The DevOps Lead manages a team of site reliability engineers (SREs) to apply best practices, manage incidents, and drive operational excellence.

Qualifications, Skills and Experience:

Bachelor’s degree in Computer Science, Engineering, or related field
Proven experience in SRE or DevOps leadership roles
Strong knowledge of:
- Cloud platforms (AWS, Azure, GCP)
- Containers and orchestration (Docker, Kubernetes)
- Infrastructure automation (Terraform, Ansible, Jenkins, Lava)
Expertise in programming languages (Python, Java)
Proficiency with source control systems (GitHub Enterprise)
Familiarity with monitoring tools: Prometheus, Grafana, PRTG
Excellent communication and stakeholder management skills
Experience with distributed systems and high-availability architectures
Knowledge of security and compliance frameworks (ISO 27001, SOC 2)
Certifications in cloud technologies or ITIL
Experience with Agile, Scrum, and Atlassian Jira
Familiarity with Google Cloud AI & ML services, including:
- Vertex AI (end-to-end ML platform)
- AutoML (custom model training)
- BigQuery ML (machine learning in SQL)
- Cloud AI APIs (Vision, Natural Language, Translation)
- TensorFlow on Google Cloud

Other Attributes

Strategic thinker with strong problem-solving skills
Ability to thrive in a fast-paced, evolving environment
Collaborative and empathetic leadership style

Key Elements & Activities of the Role

Leadership & Team Development

Be a hands-on leader who connects with direct reports, peers, and partners both operationally and strategically
Provide technical leadership and coaching, maintaining credibility in systems engineering, tools, and DevOps
Promote a culture of learning, collaboration, and continuous improvement through Agile and Scrum
Ensure the team has development pathways, meaningful objectives, and KPIs aligned to a clear technology roadmap

Reliability & Performance

Manage, optimise, and deliver Systems, DevOps, and MLOps as a service to internal stakeholders
Define, publish, and measure Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
Oversee incident response, service request fulfilment, change management, optimisation backlogs, and post‑implementation/incident reviews
Deliver efficiencies through problem management, release management, and continuous improvement
Leverage Google Cloud AI and other tools for predictive analytics and anomaly detection
Focus on consumption and cost-to-serve via demand shaping, capacity planning, and environment governance

Automation & Efficiency

Develop and propagate frameworks, pipelines, and system engineering templates across platforms
Evangelise engineering practices, microservices, CI/CD, infrastructure-as-code, and security-by-design
Partner with Technology, Delivery, and Support teams to ensure alignment between software development and platform engineering
Drive automation initiatives to promote self-help and self-enablement, reducing manual effort

Cross-Functional Collaboration

Build strong relationships with stakeholders across Technology, Engineering, Architecture, and Seeing Machines support services
Work closely with development teams to design scalable and resilient systems
Align priorities across engineering, product, and operations teams

Governance & Compliance

Influence architecture and governance standards to balance innovation, scalability, and compliance
Establish cloud governance policies, access controls, and compliance standards
Ensure systems are aligned to Seeing Machines' disaster recovery (DR) and business continuity planning (BCP) expectations

Monitoring & Observability