About the Team

The technology team is responsible for building and maintaining a scalable, reliable digital banking platform. The Site Reliability Engineering (SRE) function plays a critical role in ensuring platform stability, availability, and security while supporting rapid product development and growth.

The team works closely with engineering, security, and platform stakeholders to enable high-quality software delivery in a cloud-native environment.

About the Role

This role focuses on driving Site Reliability Engineering and DevSecOps practices across the platform. You will support the design, implementation, and operation of infrastructure and tooling that enable reliable, secure, and efficient software delivery.

You will work closely with other SREs to manage infrastructure as code, CI/CD pipelines, cloud-native services, automation frameworks, and release management processes in alignment with business and technical objectives.

Key Responsibilities

  • Design and implement highly available, secure, and scalable infrastructure using infrastructure as code (IaC)
  • Define and maintain service level objectives (SLOs) and service level indicators (SLIs)
  • Lead incident response during critical service disruptions and coordinate cross-functional teams
  • Build and maintain observability and monitoring solutions covering uptime, latency, saturation, and error rates
  • Develop automation for incident response, disaster recovery, and business continuity
  • Enable DevSecOps practices through CI/CD, GitOps, and self-service deployment capabilities
  • Drive cost optimisation initiatives related to cloud infrastructure (FinOps)
  • Mentor junior engineers and contribute to operational best practices

Requirements

  • Strong experience with cloud platforms (AWS, Azure, GCP, or Alibaba Cloud)
  • Hands-on experience with CI/CD pipelines and GitOps workflows
  • Strong expertise in infrastructure as code tools (Terraform, OpenTofu, CloudFormation, or similar)
  • Experience designing and implementing observability and monitoring solutions
  • Experience leading incident management and post-mortem processes
  • Solid understanding of Java or another object-oriented programming language
  • Strong knowledge of containerisation and orchestration technologies
  • Experience with messaging systems (e.g. Kafka) is an advantage
  • Familiarity with relational and non-relational databases is a plus
  • Ability to balance hands-on technical work with strategic decision-making
  • Strong problem-solving skills and the ability to perform under pressure
  • High ownership, accountability, and commitment to delivery
  • Financial services experience is a plus but not mandatory

Reference number: MY874878

SRE
Kuala Lumpur, Federal Territories, Malaysia | Permanent