Breakthrough is seeking a Data Engineering intern for Summer 2022. This position is part of the Technology Solutions team, focused on designing, implementing, and documenting data architecture and data modeling solutions. We are looking for an intern who is passionate about building data solutions using relational, dimensional, and NoSQL databases. These solutions support enterprise information management, business intelligence, machine learning, and data science.
This internship may be remote or based at our office in Green Bay, WI.
What you will do:
- Develop conceptual, logical, and physical data models, implement operational data stores and data marts, and contribute to the ongoing expansion of the data lake platform.
- Design and build data processing components and systems utilizing GCP compute technologies.
- Design requirement-driven data models.
- Build data pipelines using distributed computing technologies.
- Work closely with data scientists to develop and build data science and analytical products that integrate real-time data sources.
- Acquire, analyze, combine, synthesize and store data from a wide range of internal and external sources.
- Build and test CI/CD deployment pipelines for data system components.
What qualities we are looking for:
- An individual who is passionate about data engineering and building data driven products.
- An individual who strives for excellence with a laser focus on team communication and facilitation of ideas.
- An individual with a proactive attitude who works well in a fast-paced team environment.
- An individual who communicates and collaborates well with IT and business teams.
- A critical thinker with good problem-solving skills and an ability to multi-task.
- Strong communication skills, with the ability to express oneself clearly both verbally and in writing, along with persistent, active listening skills.
- Demonstrated leadership skills with a willingness to readily and voluntarily take ownership of project issues.
- An individual who can work both at the strategic level and at the tactical level, holding others accountable while building team rapport and engagement.
Desired qualifications and skills we are looking for:
- Working toward a Bachelor’s degree or an advanced degree in data engineering or a related field
- Understanding of distributed data management systems and related applications
- Understanding of data lake design and implementation considerations such as columnar storage formats and partitioning
- Comfortable with concepts around building and automating data system components for data acquisition, cleansing, and persistence; monitoring the performance of data analysis and system components; and versioning data snapshots, data lineage, schemas, and overall database systems
- Exposure to deployment processes through a CI/CD pipeline
- Python and cloud compute skills
Additional qualifications and skills considered a plus:
- Experience with GCP technologies or related Cloud technology experience
- Experience with analytics tools such as Apache Beam, Spark, JupyterLab, etc.
- Experience with modern infrastructure-as-code and orchestration technologies such as Docker, Kubernetes, Terraform, and Airflow
- Exposure to the full data engineering life-cycle, from business understanding to building operational systems.
- Comfortable navigating a wide range of database models and DBMSs.
- Excellent communication skills; able to execute independently while being a strong team player.
- A big-picture approach: able to incorporate business understanding into design and approach to achieve current value and prepare for future benefit.