Data Engineer - Data Engineering
Plaid
Responsibilities
- Understanding different aspects of the Plaid product and strategy to inform golden dataset choices, design, and data usage principles.
- Keeping data quality and performance top of mind while designing datasets.
- Advocating for adopting industry tools and practices at the right time.
- Owning core SQL and Python data pipelines that power our data lake and data warehouse.
- Delivering well-documented data with defined dataset quality, uptime, and usefulness.
Qualifications
- 2+ years of dedicated data engineering experience, solving complex data pipeline issues at scale.
- You have experience building data models and data pipelines on top of large datasets (on the order of 500TB to petabytes).
- You value SQL as a flexible and extensible tool and are comfortable with modern SQL data orchestration tools like dbt, Mode, and Airflow.
- [Nice to have] You have experience working with different performant warehouses and data lakes, such as Redshift, Snowflake, and Databricks.
- [Nice to have] You have experience building and maintaining batch and real-time pipelines using technologies like Spark and Kafka.