We are looking for a skilled Data Engineer to join our Tech team. The ideal candidate will have experience in SQL and Python for developing data pipelines.
Responsibilities
The incumbent is expected to deliver, among others, the following responsibilities:
- Collaboration and Support: Collaborate with cross-functional teams to understand data requirements and provide technical support for data-related initiatives and projects.
- Data Pipeline Implementation: Design, develop, and maintain scalable data pipelines to ingest, transform, and load data from various sources into cloud-based storage and analytics platforms using Python, PySpark, and SQL.
- Data Infrastructure Development: Design, build, and maintain scalable data infrastructure on Azure / AWS / GCP using PySpark for data processing to support various data initiatives and analytics needs within the organization.
- Performance Optimization: Optimize data processing workflows and cloud resources for efficiency and cost-effectiveness, leveraging PySpark's capabilities.
- Data Quality: Implement data quality checks and monitoring to ensure the reliability and integrity of data pipelines.
- Data Warehousing: Build and optimize data warehouse solutions for efficient storage and retrieval of large volumes of structured and unstructured data.
- Data Governance and Security: Implement data governance policies and security controls to ensure compliance and protect sensitive information across Azure, AWS and GCP environments.
The ideal candidate will have:
- Bachelor's degree in Computer Science, Engineering, Statistics, Mathematics, or a related field. Master's degree preferred.
- 2+ years of experience as a Data Engineer.
- Experience with cloud data storage is mandatory.
- Strong understanding of data modeling, ETL processes, and data warehousing concepts
- Experience with SQL, relational data modelling, and sound knowledge of database administration is mandatory.
- Proficiency in Python for Data Engineering: developing data pipelines, ETL (Extract, Transform, Load) processes, and automation scripts.
- Proficiency in Microsoft Excel
- Experience integrating data management into business and data analytics is mandatory.
- Experience working with cloud platforms for deploying and managing scalable data infrastructure.
- Experience with technologies such as dbt, Airflow, Snowflake, and Databricks, among others, is a plus.
- Excellent stakeholder communication skills.
- Familiarity with working with numerous large data sets.
- Comfort in a fast-paced environment
- Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy.
- Excellent problem-solving skills
- Advanced English is mandatory
- Strong interpersonal and communication skills for cross-functional teams
- Proactive approach to continuous learning and skill development
Additional Information
Hybrid in CDMX (3 days on-site, 2 days home office); 2 DP days and 5 extra vacation days.
Job type: Full-time, permanent contract.
Salary: $30,000.00 - $45,000.00 per month.
Application question(s):
- Do you currently live in CDMX? Is it feasible for you to work 2-3 days per week at our offices in Santa Fe, CDMX?
Experience:
- Python: 3 years (Required)
- PySpark: 2 years (Required)
- ETL pipelines: 2 years (Required)
- Cloud services: 1 year (Required)
Language:
- Advanced English (Required)
Work location: hybrid remote in 01376, Santa Fe, CDMX