Lead Data Engineer Job at WorkHQ, Los Angeles, CA

  • WorkHQ
  • Los Angeles, CA

Job Description

Company Context

Series A, well-funded US startup in HRTech developing WorkHQ.com and an AI Recruiter product.

This is a US-only, Remote role (Mainland).

Role Overview

Lead data infrastructure architect managing billions of data points across 250M+ professional profiles.

Hire data engineers to aid you in that journey.

Core Responsibilities

  • Design scalable data pipelines processing massive record volumes

  • Architect ETL processes using PySpark on Amazon EMR (open to shifting to other solutions such as Databricks or Snowflake)

  • Distribute enriched data through medallion architecture across Postgres, Athena, OpenSearch

  • Integrate new data sources into the main pipeline

  • Implement advanced data matching using Splink

Technical Requirements

  • 5-8 years of professional data engineering experience

  • Good proficiency in:

    • PySpark and distributed computing

    • AWS data services (EMR, Glue, Athena)

    • Docker

    • Pandas and DataFrame manipulation

    • Complex data format handling (JSONL, Parquet)

  • Strong background in:

    • Big data processing architectures

    • Data warehouse design

    • Performance optimization

  • Advanced Python and SQL skills

Nice to Have

  • Probabilistic record linking expertise

  • OpenSearch/Elasticsearch technologies

  • Machine learning data pipeline design

  • Recruitment tech ecosystem knowledge

Technical Stack

  • Big Data: PySpark, EMR

  • Databases: Postgres, OpenSearch

  • Cloud: AWS

  • Containerization: Docker

  • Data Formats: JSONL, Parquet

  • Analytics: Metabase, Athena, Glue

  • Data Processing: Pandas, Splink

Other Considerations

While this role has specific requirements, if you lack a few technical skills but are motivated to learn and lead the platform, please apply for consideration.

If you are coming from a Director, Head of, or VP level role that is relevant to this job, you can apply as well.

You will need to apply directly on our platform.

Thank you for your time.

Job Tags

Permanent employment, Shift work
