Machine Learning & Data Engineer

Experis

  • Zurich
  • Permanent contract
  • Full-time
  • Posted 18 days ago
Background:
In Roche's Pharmaceutical Research and Early Development organization (pRED), we make transformative medicines for patients in order to tackle some of the world's toughest unmet healthcare needs. At pRED, we are united by our mission to transform science into medicines. Together, we create a culture defined by curiosity, responsibility and humility, where our talented people are empowered and inspired to bring forward extraordinary life-changing innovation at speed. This position is located in Data Products & Platforms, a chapter within the Data & Analytics function, which pushes boundaries of drug discovery and development, enabling pRED to achieve its goals.
The perfect candidate:
The Machine Learning and Data Engineer will be responsible for the end-to-end development and deployment of a semantic search vector database for research purposes and pRED scientific needs. This role requires a combination of skills in machine learning, data engineering, and software development.
General Information:
  • Start date: 1.5.2024
  • Latest start date: 1.7.2024
  • Planned duration: until 31.12.2024
  • Extension (in case of limitation): possible
  • Max. Rate: CHF 135.00
  • Workplace: Basel, Zürich
  • Workload: 100%
  • Remote/Home Office: partially remote, partially in Basel
  • Travel: no
  • Team: 5
  • Used Template: CH_Senior Database Manager_IT_CHF_RCM
  • Hiring Manager: Agnes Meyder
  • Department: pRED Data & Analytics
  • Working hours: Standard
  • To what extent does this position have access to Roche products or is in a GMP-relevant environment: no
  • Is a criminal record extract required: no
Tasks & Responsibilities:
  • Integrate off-the-shelf open-source embedding models with the system to generate text embeddings from research publications and other text-based sources.
  • Design and implement the data processing pipeline to handle the conversion of PDF, XML or other files into a suitable format for text embedding.
  • Set up and maintain the vector database infrastructure, ensuring efficient storage and retrieval of embeddings.
  • Develop and maintain the API for semantic search, allowing for robust querying capabilities.
  • Collaborate with stakeholders to gather requirements and ensure the system meets the needs of the organization.
  • Conduct testing and quality assurance to ensure the reliability and accuracy of the search results.
  • Document the system architecture, API usage, and operational procedures for future reference and maintenance.
Must Haves:
  • Strong programming skills, particularly in Python, and experience with machine learning libraries (e.g., TensorFlow, PyTorch).
  • Minimum 7 years of experience with data engineering tasks, including data extraction, transformation, and loading (ETL).
  • Familiarity with vector database technologies (e.g., FAISS, Milvus, Elasticsearch) and database indexing.
  • Knowledge of API development and best practices for scalability and security.
  • Ability to work independently, manage multiple priorities, and communicate effectively with both technical and non-technical stakeholders.
  • Fluent English
Contact: Alba Jansa, +41 61 282 22 13
