Machine Learning and Data Engineer

CTC Resourcing Solutions

  • Basel
  • Contract
  • Full-time
  • Posted 19 days ago
The Life Science Career Network – CTC are specialised industry experts who help companies source the best talent and provide reliable HR and consulting services, support varied candidates in finding promising career opportunities, and offer the latest in skill development training programmes.

For our client, a leading international pharma company located in Basel, Switzerland, we are looking for a Machine Learning and Data Engineer Pharma (5850) on a contract basis, initially planned until the end of 2024.

This position is located in Data Products & Platforms, a chapter within the Data & Analytics function, which pushes the boundaries of drug discovery and development, enabling Early Research to achieve its goals.

The Machine Learning and Data Engineer will be responsible for the end-to-end development and deployment of a semantic search vector database for research purposes and Early Research scientific needs. This role requires a combination of skills in machine learning, data engineering, and software development.

Tasks:
  • Integrate off-the-shelf open-source embedding models with the system to generate text embeddings from research publications and other text-based sources.
  • Design and implement the data processing pipeline to handle the conversion of PDF, XML or other files into a suitable format for text embedding.
  • Set up and maintain the vector database infrastructure, ensuring efficient storage and retrieval of embeddings.
  • Develop and maintain the API for semantic search, allowing for robust querying capabilities.
  • Collaborate with stakeholders to gather requirements and ensure the system meets the needs of the organization.
  • Conduct testing and quality assurance to ensure the reliability and accuracy of the search results.
  • Document the system architecture, API usage, and operational procedures for future reference and maintenance.
Requirements:
  • Strong programming skills, particularly in Python, and experience with machine learning libraries such as TensorFlow and PyTorch.
  • Minimum of 7 years' experience with data engineering tasks, including data extraction, transformation, and loading (ETL).
  • Familiarity with vector database technologies (e.g., FAISS, Milvus, Elasticsearch) and database indexing.
  • Knowledge of API development and best practices for scalability and security.
  • Ability to work independently, manage multiple priorities, and communicate effectively with both technical and non-technical stakeholders.
  • Fluent English.
Interested? Contact us!
