DBS - SRE Observability Engineer

Location: Singapore
Business sector: IT Support
Job reference: 545550
Published: almost 2 years ago
Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.

What You'll Be Doing 
  • Responsible for delivery of activities on the monitoring roadmap.
  • Implement, maintain, and consult on the observability and monitoring framework that supports the needs of multiple internal stakeholders.
  • Collaborate with operations & engineering teams, application developers, management and infrastructure teams to assess near- and long-term monitoring needs.
  • Assist with driving monitoring and observability standards to improve the consumer experience of mission-critical applications, services, and business processes with a strong focus on the end-to-end journey.
  • Keep an eye on emerging observability tools, trends and methodologies, and continuously enhance our existing systems and processes.
  • Manage AppDynamics/Elastic/Prometheus/Grafana to support custom metric delivery dashboards.
  • Grow and evangelize the capabilities of the AppDynamics platform.
  • Build a practice of performance and tracing using AppDynamics APM.
  • Engineer solutions and establish standards for AppDynamics functional components, specifically agent deployments, including optimization, application tuning and instrumentation per requirements.
  • Adopt AppDynamics for on-premises use, with familiarity with Cloud IaaS and PaaS models.
  • Architect a highly available and scalable Controller infrastructure with appropriate monitoring and alerting mechanisms.
  • Seek opportunities to reduce operational tasks by scripting automated deployments.
  • Seek opportunities for integration of AppDynamics with other monitoring tools.
  • Test and implement AppDynamics Agents.
  • Test and implement AppDynamics End User Experience Monitoring.
  • Effectively communicate tool capabilities and processes to varying stakeholders.
  • Assist in scheduling and hosting regular tool training sessions to better enable tool adoption and best practices.
  • Provide input on improving the global operating model for monitoring and observability services.
  • Continue evolving monitoring tooling toward a standards-based self-service automated platform.
  • Participate in blameless incident retrospectives and follow up on action items that observability can influence.
  • Work with application teams on observability, automating monitoring and the auto-remediation of known issues.
  • Work with teams located across Asia Pacific.

Skills You'll Need
  • Previous experience defining, creating, and supporting monitoring dashboards.
  • Experience working across departments evangelizing and communicating observability expertise and standards.
  • Possess practical knowledge and appreciation of various aspects of distributed service design, including messaging protocols, caching strategies and autonomous software design practices.
  • Experience with monitoring and observability solutions and methodologies, including server and network performance, hardware, web synthetics, and application performance monitoring, is a plus.
  • Experience with monitoring and observability tools and the methodologies of products such as AppDynamics, Elasticsearch, Grafana and Prometheus.
  • Solid understanding of performance metrics, KPIs, statistical calculations, machine learning, and correlation.
  • Have extensive experience with metrics and logging libraries and aggregators, data analysis and visualization tools.
  • 2+ years working with APM (ideally AppDynamics) in deployment for mission-critical applications.
  • Strong AppDynamics product knowledge, internals, product REST API and ability to develop AppDynamics Monitoring Extensions.
  • 2+ years of application development experience (Java) at an enterprise level is a strong plus.
  • 2+ years of experience with a range of architecture tech stacks including Java app servers, web servers, Golang, Kubernetes, OpenShift, PCF, AWS, Google Cloud.
  • Ability to converse with application owners, architects, performance testers to pinpoint application performance bottlenecks via the monitoring & observability tools.
  • Bachelor’s or Master’s degree in Computer Science, a related technical field that involves programming, or equivalent practical experience.
  • Minimum of 10 years of technology experience (preferably in the financial industry).
  • Project management experience.
  • Ability to work independently, multi-task, and take ownership of various parts of a project or initiative.
  • Highly motivated, proactive, and capable of working under pressure without compromising development processes and productivity.
  • Experience working with diverse stakeholders, including operations, application developers and performance testing.
  • Strong, committed, and reliable team player, able to take direction but also willing to contribute to discussions on design and strategy.
  • Possess strong interpersonal and communication skills to build good relationships with the business and other technology groups through day-to-day support and project work.
  • Interest in financial technologies, new technology tools and the ability to learn.