Machine Learning & Data Engineer

Experis

Zurich
Permanent contract
Full-time

Posted 18 days ago

Background:

In Roche's Pharmaceutical Research and Early Development organization (pRED), we make transformative medicines for patients in order to tackle some of the world's toughest unmet healthcare needs. At pRED, we are united by our mission to transform science into medicines. Together, we create a culture defined by curiosity, responsibility and humility, where our talented people are empowered and inspired to bring forward extraordinary life-changing innovation at speed. This position is located in Data Products & Platforms, a chapter within the Data & Analytics function, which pushes boundaries of drug discovery and development, enabling pRED to achieve its goals.

The perfect candidate:

The Machine Learning and Data Engineer will be responsible for the end-to-end development and deployment of a semantic search vector database for research purposes and pRED scientific needs. This role requires a combination of skills in machine learning, data engineering, and software development.

General Information:

Start date: 1.5.2024
Latest start date: 1.7.2024
Planned duration: until 31.12.2024
Extension (in case of limitation): possible
Max. Rate: CHF 135.00
Workplace: Basel, Zürich
Workload: 100%
Remote/Home Office: partially remote, partially in Basel
Travel: no
Team: 5
Used Template: CH_Senior Database Manager_IT_CHF_RCM
Hiring Manager: Agnes Meyder
Department: pRED Data & Analytics
Working hours: Standard
Access to Roche products or work in a GMP-relevant environment: no
Is a criminal record extract required: no

Tasks & Responsibilities:

Integrate off-the-shelf open-source embedding models with the system to generate text embeddings from research publications and other text-based sources.
Design and implement the data processing pipeline to handle the conversion of PDF, XML or other files into a suitable format for text embedding.
Set up and maintain the vector database infrastructure, ensuring efficient storage and retrieval of embeddings.
Develop and maintain the API for semantic search, allowing for robust querying capabilities.
Collaborate with stakeholders to gather requirements and ensure the system meets the needs of the organization.
Conduct testing and quality assurance to ensure the reliability and accuracy of the search results.
Document the system architecture, API usage, and operational procedures for future reference and maintenance.

Must Haves:

Strong programming skills, particularly in Python, and experience with machine learning libraries (e.g., TensorFlow, PyTorch).
Minimum 7 years of experience with data engineering tasks, including data extraction, transformation, and loading (ETL).
Familiarity with vector database technologies (e.g., FAISS, Milvus, Elasticsearch) and database indexing.
Knowledge of API development and best practices for scalability and security.
Ability to work independently, manage multiple priorities, and communicate effectively with both technical and non-technical stakeholders.
Fluent English.

Contact: Alba Jansa, +41 61 282 22 13

