What You'll Be Doing
- Drive the Site Reliability Engineering agenda forward at an enterprise level to improve the availability, reliability, and performance of services.
- Drive cross-team efforts in resiliency assessment exercises and reporting
- Draft and/or contribute to internal SRE training materials
- Support services before they go live through activities such as chaos testing (failure injection), system design input, development of software platforms and frameworks, capacity planning, and launch reviews.
- Engage with product engineering teams to test their services against the relevant chaos engineering toolkit.
- Sound understanding of CI/CD pipelines and the SDLC (application delivery)
- Assist application teams in setting up SLIs, SLOs, and error budgets for their systems
- Participate in blameless incident retrospectives and follow up on action items
- Work with application teams on observability, automated monitoring, and auto-remediation of known issues.
- Write programs and scripts to automate failure scenarios, integrate with pipelines, and develop self-service portals.
- Work with teams located across Asia Pacific

What You'll Need
- An MSc or BSc in Computer Science, Statistics/Mathematics, or a related field is required.
- A minimum of 3 years of relevant work experience in the machine learning, AI, and BI domains. Candidates with more experience will be considered for the senior role.
- Solid hands-on experience in AI programming, ML modelling, statistical methods used in data science, and emerging technologies.
- Strong working knowledge of Python/R, SQL, Big Data and Cloud computing technologies.
- Development experience in microservices/containers, API development, and middleware for data extraction.
- Proficiency in machine learning frameworks such as scikit-learn, TensorFlow, and Keras.
- Technical skills in various BI platforms, Big Data platforms such as Hadoop/Spark, and NoSQL databases.
- Familiarity with developing dashboards/reports using BI visualization tools such as Power BI.
- Good at understanding business requirements, with an interest in driving practical outcomes.
- Familiarity with Talend as an ETL tool for data integration.
- Familiarity with AutoML will be an added advantage.
- Proficiency in DevOps processes and tools (e.g. Jira or Azure DevOps) will be an added advantage.
- Strong communication and collaboration skills to work with people from a variety of business and technical backgrounds.
- Good time management skills and a team player.
- Experience in the FMCG industry will be an added advantage.