AI NLP Data Engineer Leader (hp)
Job posting number: #159764 (Ref:hp-3143676)
Job Description
Job Summary
As a senior Data Engineer specializing in Natural Language Processing, you will play a critical role in designing, building, and maintaining the data sets that power our NLP models and applications. You will work closely with architects, machine learning engineers, and software developers to ensure our data pipelines are robust, scalable, and efficient.
Key Responsibilities
Data Pipeline Development: Design, develop, and maintain scalable data pipelines to process and analyze large volumes of text data for NLP applications.
Data Integration: Integrate diverse data sources, including structured and unstructured data, to support NLP model training and evaluation.
Data Quality: Implement data quality checks and validation processes to ensure the integrity and accuracy of our datasets.
Collaboration: Work closely with architects and machine learning engineers to understand data requirements.
Documentation: Document data engineering processes, workflows, and best practices to ensure knowledge sharing and maintainability.
Qualifications
Education: Bachelor’s or Master’s degree in Computer Engineering, Electronic Engineering, Mathematics, or a related field.
5+ years of work experience in data-related projects.
Proficiency in programming languages such as Python, Java, or Scala.
Knowledge of SQL and NoSQL databases (e.g., PostgreSQL, MongoDB, Elasticsearch).
Understanding of NLP concepts and techniques, including text preprocessing, tokenization, and language models.
Excellent problem-solving and analytical skills.
Strong communication and collaboration abilities.
Preferred Qualifications
Experience with NLP libraries and frameworks such as NLTK, spaCy, or Hugging Face Transformers.
Experience with other data types, such as image, audio, and video.
Familiarity with version control systems (e.g., Git).