The Challenge
Centogene, a leading biotech company specializing in rare diseases, had accumulated vast genomic datasets over years of research. Its scientists were spending up to 70% of their time on manual data processing and classification, leaving little time for scientific discovery.
The existing workflow relied heavily on spreadsheets, manual annotation, and rule-based scripts that couldn't keep pace with the growing data volume.
Our Approach
We embedded a team of two ML engineers alongside Centogene's bioinformatics group for 12 weeks. The engagement followed our standard three-phase process:
- Discovery (2 weeks): mapped the existing workflow, identified bottlenecks, and defined success metrics with the science team.
- Prototyping (4 weeks): built a custom NLP model for variant classification and an automated pipeline for data ingestion and quality checks.
- Production (6 weeks): hardened the pipeline, integrated it with the existing LIMS, and trained the team on model monitoring.
The Solution
We delivered an end-to-end automated pipeline that ingests raw genomic data, performs quality validation, classifies variants using a fine-tuned transformer model, and presents results in a dashboard tailored for the science team. The system processes overnight what previously took a week of manual work.
"The No One team understood our domain deeply. They didn't just deliver a model — they delivered a system that fundamentally changed how our scientists work."
— Head of Bioinformatics, Centogene