DevOps Lead

Full Time
80 Mildura St, Fyshwick ACT 2609, Australia
Main Purpose of Role

The DevOps Lead ensures the reliability, scalability, and performance of critical systems and services. This role bridges development and operations, fostering a culture of automation, resilience, and continuous improvement. The Lead guides a team of site reliability engineers (SREs) to apply best practices, manage incidents, and drive operational excellence.

Qualifications, Skills and Experience

  • Bachelor’s degree in Computer Science, Engineering, or related field

  • Proven experience in SRE or DevOps leadership roles

  • Strong knowledge of:

    • Cloud platforms (AWS, Azure, GCP)

    • Containers and orchestration (Docker, Kubernetes)

    • Infrastructure automation (Terraform, Ansible, Jenkins, Lava)

  • Expertise in programming languages (Python, Java)

  • Proficiency with source control systems (GitHub Enterprise)

  • Familiarity with monitoring tools: Prometheus, Grafana, PRTG

  • Excellent communication and stakeholder management skills

  • Experience with distributed systems and high-availability architectures

  • Knowledge of security and compliance frameworks (ISO27001, SOC 2)

  • Certifications in cloud technologies or ITIL

  • Experience with Agile, Scrum, and Atlassian Jira

  • Familiarity with Google Cloud AI & ML services, including:

    • Vertex AI (end-to-end ML platform)

    • AutoML (custom model training)

    • BigQuery ML (machine learning in SQL)

    • Cloud AI APIs (Vision, Natural Language, Translation)

    • TensorFlow on Google Cloud

Other Attributes
  • Strategic thinker with strong problem-solving skills

  • Ability to thrive in a fast-paced, evolving environment

  • Collaborative and empathetic leadership style

Key Elements & Activities of the Role

Leadership & Team Development
  • Be a hands-on leader who connects with direct reports, peers, and partners both operationally and strategically

  • Provide technical leadership and coaching, maintaining credibility in systems engineering, tools, and DevOps

  • Promote a culture of learning, collaboration, and continuous improvement through Agile and Scrum

  • Ensure the team has development pathways, meaningful objectives, and KPIs aligned to a clear technology roadmap

Reliability & Performance
  • Manage, optimise, and deliver Systems, DevOps, and MLOps as a service to internal stakeholders

  • Define, publish, and measure Service Level Objectives (SLOs) and Service Level Indicators (SLIs)

  • Oversee incident response, service request fulfilment, change management, optimisation backlogs, and post-implementation and post-incident reviews

  • Deliver efficiencies through problem management, release management, and continuous improvement

  • Leverage Google Cloud AI and other tools for predictive analytics and anomaly detection

  • Focus on consumption and cost-to-serve via demand shaping, capacity planning, and environment governance

Automation & Efficiency

  • Develop and propagate frameworks, pipelines, and system engineering templates across platforms

  • Evangelise engineering practices, microservices, CI/CD, infrastructure-as-code, and security-by-design

  • Partner with Technology, Delivery, and Support teams to ensure alignment between software development and platform engineering

  • Drive automation initiatives to promote self-help and self-enablement, reducing manual effort

Cross-Functional Collaboration
  • Build strong relationships with stakeholders across Technology, Engineering, Architecture, and Seeing Machines support services

  • Work closely with development teams to design scalable and resilient systems

  • Align priorities across engineering, product, and operations teams

Governance & Compliance
  • Influence architecture and governance standards to balance innovation, scalability, and compliance

  • Establish cloud governance policies, access controls, and compliance standards

  • Ensure systems are aligned to Seeing Machines DR and BCP expectations

Monitoring & Observability
  • Enable monitoring systems, standards, and services that support predictive and reactive responses

  • Develop and publish dashboards showing system health across Seeing Machines

  • Deliver information and reporting according to an agreed cadence

Key Liaisons

Internal
  • Technology Division

  • Enterprise Systems & Services Department

  • Project Leads

  • All Seeing Machines senior stakeholders

External
  • Product Vendors

  • Service Providers