Singapore Tourism Board - Data Engineer to Associate Data Engineer

Location: Singapore
Business sector: Data Engineering with Machine Learning Fundamentals
Job reference: 606794
Published: about 2 months ago
The Singapore Tourism Board (STB) is a statutory board under the Ministry of Trade and Industry of Singapore. It champions the development of Singapore's tourism sector, one of the country's key service sectors and economic pillars, and undertakes the marketing and promotion of Singapore as a tourism destination.

What You'll Be Doing


Support the Data Science team in the following areas:
  • Project manage, coordinate and implement DS&A's data ingestion and data processing pipelines across different platforms
  • Ensure that all data systems meet business requirements and enable scalability of business processes
  • Project manage and deliver data-related implementations, ensuring that deliverables are met within the agreed scope and timelines
  • Work closely with vendors and internal stakeholders to coordinate DS&A's data ingestion and data processing pipelines across platforms, which can include mobile apps, SaaS platforms, on-premise databases and partner systems
  • Help architect DS&A’s data integrations and data processing flows between external / 3rd party data sources, AWS cloud data warehouses (e.g. Redshift) and internal on-premise database instances for workloads at scale
  • Help to gather and translate business requirements into relevant database schemas, data integrations and data processing flows to meet business objectives
  • Develop data integrations (through API, SFTP etc) between AWS S3, Redshift instances and on-premise database instances (e.g. HANA)
  • Assemble large, complex datasets that meet functional and non-functional business requirements
  • Analyse and assess the effectiveness and accuracy of new data sources (e.g. datasets received from stakeholders) and the annotation/labelling of new training inputs
  • Identify, design and implement internal process improvements: automating manual processes, data validation tools, optimising data delivery, re-designing infrastructure for greater scalability, etc.
  • Recommend ways to continually improve data reliability and quality, including reviewing and enhancing existing data collection procedures to capture data for building analytics models relevant to industry transformation
  • Develop monitoring toolkits to ensure that integrations execute successfully and to raise alerts when integrations fail
  • Provide guidance to internal teams on best practices for cloud to on-premise data integrations
  • Develop standard processes for data mining, data modelling and data production
  • Support the integration and deployment of developed algorithms, machine learning and analytical models into current analytics system/production
  • Help set up, configure, deploy and validate machine learning models and analytics scripts on Amazon SageMaker
  • Help implement CI/CD and the deployment of ML models in production
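The monitoring-and-alerting responsibility above can be sketched minimally in Python. This is an illustrative sketch only, not part of the posting: the `run_step` helper and the step names are assumptions, standing in for whatever pipeline steps (S3 ingest, Redshift load, etc.) the role actually covers.

```python
from datetime import datetime, timezone

def run_step(name, fn, alert=print):
    """Run one pipeline step, record its status, and alert on failure.

    `alert` is a pluggable callback (e.g. email, Slack, PagerDuty) so the
    same wrapper works for any integration step.
    """
    started = datetime.now(timezone.utc).isoformat()
    try:
        return {"step": name, "ok": True, "started": started, "result": fn()}
    except Exception as exc:
        alert(f"ALERT: step '{name}' failed: {exc}")
        return {"step": name, "ok": False, "started": started, "error": str(exc)}

def failing_load():
    # Hypothetical failure, e.g. a Redshift COPY that errors out.
    raise RuntimeError("COPY to Redshift failed")

ok = run_step("ingest_s3", lambda: 42)
bad = run_step("load_redshift", failing_load, alert=lambda msg: None)
```

In practice the status dictionaries would be written to a monitoring store or dashboard; the point is that every step reports success or failure uniformly, so alerting is one code path rather than per-integration.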
Skills You'll Need

  • At least 2-3 years of Software Project Management experience successfully managing both internal stakeholders and external vendors.
  • Successfully delivered at least 2 medium to large scale software systems in either a Project Management role, Data Architect role or Data Integration role
  • Ability to understand the different business domains and to make connections between the data and the business needs.
  • Strong communication skills, with the ability to explain issues and design tradeoffs between performance, maintenance and business requirements.
  • Able to clearly articulate and justify the design decisions taken
  • Good attention to detail with regard to data workflow, data quality, data integrity, and how data will be stored and accessed.
  • Strong analytics skills related to working with structured and unstructured datasets.
  • Experience in performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Experience in designing database schemas to support OLTP and OLAP systems
  • Experience with data pipeline tools (e.g. Talend, SSIS, BODS, Airflow, Kafka)
  • Experience in software development and in developing enterprise applications with integrations to SQL/NoSQL databases
  • Experience with object-oriented / functional scripting languages: Python, R, Java, etc.
  • Intellectual curiosity to find new and unusual ways to solve data management issues.
  • A successful history of manipulating, processing and extracting value from large disconnected datasets.
  • Experience in using QlikSense will be advantageous.
  • Experience with big data tools will be a plus - Hadoop, Spark, Hive, Sqoop, etc.
  • Experience with stream-processing systems: Storm, Spark Streaming, Kafka, etc.
  • Strong working knowledge of SQL
  • Strong project management, stakeholder management and organisational skills.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • At least 3 years of working experience in a related field, with real-world skills and references from former employers.
  • Working experience with structured and unstructured datasets is essential
  • Certified Scrum Master / Agile Developer is an added advantage
  • Certified AWS Cloud Architect is an added advantage
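The schema-design skill above (supporting OLTP and OLAP systems) can be illustrated with a minimal star schema. This sketch is not part of the posting; the table and column names are invented for illustration, using SQLite purely so the example is self-contained.

```python
import sqlite3

# Illustrative OLAP-style star schema: a fact table keyed to a date
# dimension, so aggregates roll up cleanly by dimension attributes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,   -- surrogate key, e.g. YYYYMMDD
    full_date TEXT NOT NULL,
    month     TEXT NOT NULL
);
CREATE TABLE fact_visits (
    visit_id INTEGER PRIMARY KEY,
    date_key INTEGER NOT NULL REFERENCES dim_date(date_key),
    visitors INTEGER NOT NULL
);
INSERT INTO dim_date VALUES (20240101, '2024-01-01', '2024-01'),
                            (20240102, '2024-01-02', '2024-01');
INSERT INTO fact_visits VALUES (1, 20240101, 120),
                               (2, 20240102, 80);
""")

# A typical OLAP query: monthly visitor totals via the date dimension.
monthly = conn.execute("""
    SELECT d.month, SUM(f.visitors)
    FROM fact_visits f
    JOIN dim_date d ON f.date_key = d.date_key
    GROUP BY d.month
""").fetchall()
# monthly -> [('2024-01', 200)]
```

An OLTP schema for the same domain would instead normalise around transactional entities (bookings, visitors) to keep writes small and consistent; the tradeoff between the two shapes is exactly what the schema-design bullet refers to.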