Posted By
273
JOB VIEWS
95
APPLICATIONS
16
RECRUITER ACTIONS
Posted in
IT & Systems
Job Code
1518038
- The Head of Platform Resilience for Global Retail Banking Technology is responsible for designing, leading, and executing the resilience strategy for our global retail banking platforms.
- This role will ensure that all customer-facing platforms and critical applications are reliable, scalable, secure, and capable of withstanding and recovering from disruptions.
- This leader will collaborate across engineering, infrastructure, risk, and business teams to drive operational resilience, uphold high availability standards, and implement innovative approaches for continuous service delivery.
Qualifications:
What you will need to succeed in the role:
- 10+ years of experience in resilience engineering, site reliability engineering, infrastructure management, or a related field within large-scale technology environments.
- Proven track record in managing resilience for complex, customer-facing applications, ideally within the banking or financial services industry.
- Strong understanding of platform engineering, high availability architectures, disaster recovery planning, and risk management.
- Deep experience in cloud (AWS, Azure, GCP) and hybrid environments, with knowledge of resilience engineering practices for cloud infrastructure.
- Excellent incident management skills and experience in root cause analysis, postmortem processes, and operational risk management.
- Ability to lead and influence global teams.
- Leading and directing executive and non-executive work groups and effecting change through people.
- Managing operational functions and directing process re-engineering and efficiency exercises.
- Strong competencies in people leadership, teamwork, information gathering and analysis, judgment, decision-making, and communication, with the ability to influence global teams.
- Respectful of different cultures, working with colleagues across all five regions (North America, LATAM, Middle East, Asia Pacific, and Europe).
- Consultancy approach and skillset with the ability to identify and articulate complex problems and solutions.
- Strong understanding of operational effectiveness and strong delivery drive.
What you'll do:
- Develop and execute the platform resilience strategy, aligning with the bank's broader business continuity and risk management frameworks.
- Define and champion best practices for resilience, recovery, and fault tolerance within the global retail banking technology ecosystem.
- Influence and drive cultural change around resilience by fostering proactive thinking around risk, disaster recovery, and high availability.
- Drive adoption of cloud-native, microservices, and containerization practices that support modular and resilient system design.
- Exceptional leadership and team management skills, with experience leading cross-functional teams and influencing at executive levels.
- Excellent communication skills, capable of translating complex technical details into business language for stakeholders at all levels.
- Oversee the design, implementation, and management of systems for resilience, failover, and disaster recovery for mission-critical banking applications.
- Deliver on Group Resiliency metrics and KPIs such as Mean Time to Recover, number of incidents, and customer outages.
- Monitor system reliability, availability, and performance, and drive initiatives for continuous improvement.
- Lead incident response and postmortem processes, and drive learnings to improve systems and processes.
- Drive proactive resilience measures, including chaos engineering, load testing, automated recovery and tabletop exercises, to simulate real-world disruptions.
- Partner with engineering, infrastructure, cybersecurity, and business teams to embed resilience into the development lifecycle, from initial design through to deployment and operation.
- Communicate effectively with senior leadership and key stakeholders on platform resilience status, risks, and improvements. Influence technology and business stakeholders to prioritize resilience as a critical component of product and feature development.
- Stay updated on industry trends, technologies, and methodologies related to resilience engineering, disaster recovery, and risk management.
- Lead initiatives to incorporate advanced technologies, such as AI/ML for anomaly detection and automated recovery, to enhance resilience capabilities.
- Ensure technology goals are identified, communicated, documented, agreed and delivered in the most cost-effective manner possible through to the completion of the successful pilot deployment.
- Proactively manage risks (including cybersecurity), implement controls, and test control effectiveness.
- Ensure control monitoring is conducted and reported to risk owners.
- Support audit and independent programme assessments as required.