Job no: 6JDJ7
We are seeking a highly skilled and hands-on Senior Platform Engineer to join our growing team. In this role, you’ll work at the intersection of cloud infrastructure, platform reliability, data systems, DevOps, and automation, contributing to high-impact projects that power critical banking operations and data-driven decision-making.
You will design and build robust, scalable platform services, support production and development environments, and automate end-to-end infrastructure and CI/CD processes. You’ll also contribute to monitoring, incident response, and optimization of large-scale distributed systems.
Key Responsibilities
- Environment & Platform Support
  - Manage and support SIT, Performance, UAT, Dev, and Production environments across hybrid cloud setups.
  - Architect scalable cloud-native solutions on Azure, AWS, and GCP, optimizing reliability and resource utilization.
- Monitoring & Observability
  - Build and maintain monitoring dashboards using Grafana, OpenSearch, Kibana, and tools like Nobl9 for SLO management.
  - Define and implement alerting strategies, backups, and observability pipelines for mission-critical systems.
- Automation & CI/CD
  - Automate infrastructure and deployment pipelines using Terraform, GitHub Actions, Jenkins, and Ansible.
  - Implement robust CI/CD for applications, data pipelines, and containerized services using Docker and Kubernetes.
- Troubleshooting & Incident Management
  - Lead root cause analysis and performance tuning using system logs, cloud-native tools, and incident response frameworks.
  - Build incident response tools powered by AI/ML (e.g., RAG-based LLMs, TensorFlow-based anomaly classification).
- Security & Change Management
  - Drive vulnerability remediation through patching and automated security compliance workflows.
  - Champion change management using ServiceNow, participating in structured change control and release planning.
- Data & AI Platform Enablement
  - Design and deploy data visualizations and AI-driven automation solutions for internal analytics, using Vertex AI, LangChain, BigQuery, Spark, and Airflow.
  - Support large-scale data lake platforms, with emphasis on data quality, performance, and compliance.
Must-Have Skills and Experience
- 6+ years of experience in platform engineering, DevOps, SRE, or cloud infrastructure roles.
- Proven ability to work across multi-cloud environments (Azure, AWS, GCP) and on-premises/hybrid setups.
- Strong programming and scripting skills in Python, Go, Java, or Node.js.
- Deep understanding of observability, infrastructure as code, and CI/CD practices.
- Experience supporting data platforms, AI/ML pipelines, or cloud-native services at scale.
- Hands-on experience with tools such as:
  - Terraform, Jenkins, GitHub Actions, Docker, Kubernetes
  - OpenSearch, Grafana, AppDynamics, Kibana, Airflow
  - ServiceNow, Ansible, Spark, MySQL/Oracle/MongoDB
- Excellent communication, documentation, and stakeholder engagement skills.
- A mindset of continuous learning, mentorship, and technical excellence.
- Experience with LLM integration in incident response tooling (e.g., LangChain, RAG, vector search).
Nice to Have
- Background in machine learning platforms, including model deployment and monitoring.
- Contributions to automation of compliance or security patching processes.
- Participation in agile, DevSecOps, or Site Reliability Engineering (SRE) practices.
Why Join Us?
- Work with cutting-edge cloud, AI/ML, and DevOps technologies.
- Be a key contributor to resilient banking infrastructure used by millions.
- Join a collaborative team that values innovation, mentorship, and impact.
- Competitive compensation, flexible working, and career development opportunities.
Published on 26 Jun 2025, 5:35 AM