Egen is a data engineering and cloud modernization firm helping industry-leading companies achieve digital breakthroughs and deliver for the future, today. We are catalysts for change who create digital breakthroughs at warp speed. Our team of cloud and data engineering experts is trusted by top clients in pursuit of the extraordinary. An Inc. 5000 Fastest-Growing Company seven times, and recently recognized on the Crain’s Chicago Business Fast 50 list, Egen has also been named a great place to work three times.
As an Egen Data Engineer, you will build and implement distributed ETL/ELT data pipelines that process very large data sets, solve ingestion and data modeling challenges at scale, and continuously improve data processing turnaround and usability.
The ideal candidate is resourceful and has strong production experience with Python, complex SQL procedures, relational and NoSQL databases, distributed data warehouses, data mapping, data transformation, and data integration.
Responsibilities:
- Design, build, and improve scalable, resilient ETL pipelines that integrate with cloud-native data warehouses (Snowflake/Redshift/BigQuery/ADW) and with relational or NoSQL databases.
- Follow and help maintain best practices and standards for data quality, scalability, reliability, and reusability.
- Debug production issues across data platform services.
- Partner with business, product, and data science teams to automate processes that improve data sets for analytical and reporting needs.
- Write test cases, perform QA, and partner with stakeholders on UAT.
- Automate...automate...automate.
Required Experience:
- Bachelor’s degree in a relevant field.
- Strong professional experience building event-driven and/or batch ETL data pipelines.
- Experience with enterprise data migration and multi-source ingestion.
- Proficiency in Python and complex SQL procedures, and familiarity with distributed data warehouse architecture and processing.
Nice to haves (but not required):
- Familiarity with event-driven architectures and tools such as Kafka, Airflow, and/or Spark.
- Experience with Docker and Kubernetes.
- Experience developing and deploying CI/CD pipelines for data engineering.
- Familiarity with healthcare data.