Capability Leader - SRE || Retail Sports Giant || Gurgaon
Industry - Retail
Category - IT & Systems
Job Type - Permanent
Job Description :
- Lead the SRE capability that ensures the stability and reliability of products built and run on large-scale distributed systems, which in turn provide an exceptional, uninterrupted user experience on our web and mobile platforms.
Client Details :
Our Client is a global giant in the Retail Sports space.
Description :
- You will allocate work across a team of 12 people, taking into account experience, complexity, workload and organizational efficiency
- You will continuously monitor and evaluate team workload and organizational efficiency, using IT systems, data analysis and team feedback, and make appropriate changes to meet business needs.
- You will provide team members/direct reports with clear direction and targets that are aligned with business needs and GIT objectives
- You maintain and enhance the monitoring framework (data collection, alert aggregation, dashboarding) and implement and enhance the alerting logic (framework)
- You enable proactive incident alerting and resolution by leveraging knowledge scripts
- You identify and detect repetitive incidents (stability, reliability) and develop solutions to fix problems.
- You work on the technical resolution of incidents and identify their technical root causes
- You uphold tool standards and exploit tool capabilities to fine-tune product reliability
- You integrate incident, release, monitoring and alerting tools into the overall ecosystem
- You ensure production release guidelines (entry/exit) and implementation are adhered to for changes to Production.
- You support CI/CD pipeline implementation and its integration with quality and security.
Profile :
- At least 14 years of experience in IT
- 7 years of experience in a relevant area (DevOps / SRE).
- Strong awareness and experience of working with Site Reliability Engineering principles.
- Good understanding of public cloud offerings such as AWS, including components like EC2, IAM, RDS and CloudWatch.
- Knowledge of messaging and streaming frameworks such as RabbitMQ / Kafka
- Knowledge of server-side technologies such as Kubernetes, NodeJS, Docker, Java
- Hands-on experience with enterprise toolsets such as Grafana, Instana, Prometheus and the ELK Stack.
- Experience in any scripting language (Bash / Python / Perl).
- Good experience with CI/CD pipelines, including Bitbucket and Jenkins
- Experience operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring into your code, tweaking dashboards, defining alerts.
- Knowledge of Agile software development principles including using JIRA.
- Experience with building REST APIs, API integration and web services is preferred
Job Offer :
- Leadership Opportunity
- Opportunity to be part of a global organisation that is a leader in its space
To apply online please click the 'Apply' button below.
For a confidential discussion about this role please contact Shikha Nidhi on +91 124 452 5435.
For your candidature to be considered for this job, you must also apply on the company's redirected page for this job.