Meet Rebecca: Building the Data Foundations Behind AI-Powered Clinical Trials
As part of our “Meet the Builders” series, we spoke to Rebecca Keegan, one of our Senior Software Engineers, about her unusual path into data engineering, the complexity of medical imaging pipelines, and the challenges of scaling infrastructure for AI.
What originally drew you to software engineering?
What attracted me was the breadth of the discipline. Software engineering touches almost every part of a business, so you’re never really siloed.
At Qureight I collaborate with the machine learning and data science teams on complex science problems, but my clinical background as a radiographer means I can also sit with clinical operations and understand their world. This helps when designing systems that make the day-to-day work of radiographers a little easier.
Then there’s the work with engineering and DevOps to make sure everything is running smoothly under the hood. That combination of technical depth and cross-functional impact is what keeps me hooked.
What’s the most technically demanding pipeline you’ve built?
In terms of time and hidden complexity, it would probably be our medical imaging de-identification pipeline. Its job is to ingest CT scans from different sources, whether that’s site uploads or SFTP transfers, remove all personally identifiable information and unnecessary DICOM tags, and then import the data cleanly into the Qureight platform.
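The core of that de-identification step can be sketched in Python (the language in the stack Rebecca mentions later). This is an illustrative sketch only, not Qureight’s actual implementation: plain dicts stand in for DICOM datasets, and the tag list is a small subset of the identifiers a real pipeline would handle via a library like pydicom and the DICOM PS3.15 de-identification profiles.

```python
# Illustrative DICOM de-identification sketch. Datasets are modelled as
# dicts keyed by (group, element) tag pairs; a production pipeline would
# use a DICOM library such as pydicom instead.

# Direct identifiers to remove (a small subset of the DICOM
# PS3.15 "basic profile" identifier list).
PII_TAGS = {
    (0x0010, 0x0010),  # PatientName
    (0x0010, 0x0020),  # PatientID
    (0x0010, 0x0030),  # PatientBirthDate
    (0x0010, 0x1040),  # PatientAddress
}

def deidentify(dataset: dict) -> dict:
    """Return a copy with direct identifiers and private tags dropped.

    Private DICOM tags have an odd group number, and vendors often
    tuck identifying data inside them, so the conservative default
    is to strip them all.
    """
    return {
        tag: value
        for tag, value in dataset.items()
        if tag not in PII_TAGS and tag[0] % 2 == 0
    }

scan = {
    (0x0010, 0x0010): "Doe^Jane",     # PatientName (PII)
    (0x0008, 0x0060): "CT",           # Modality (kept)
    (0x0029, 0x1010): "vendor blob",  # private tag (dropped)
}
clean = deidentify(scan)
# clean retains only the Modality tag
```

The same filter-by-tag approach works whether the list comes from a standard profile or a study-specific allowlist; the hard part in practice, as the interview notes, is running it safely alongside legacy ingestion paths.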
The shiny new pipeline itself was actually the straightforward part. The real challenge came from refactoring the legacy ingestion methods so we could run the old and new pipelines side by side in controlled stages. That allowed us to migrate data safely without disrupting existing studies.
It’s the sort of work that isn’t particularly visible from the outside, but it’s essential if you want to evolve a platform without breaking things along the way.
What motivated you to join Qureight?
When I interviewed, the company was much smaller. I had been working at another startup that was going through a challenging period. They handled it well, but I wasn’t entirely sure I wanted to risk being part of another early-stage company.
Then I spoke with Qureight’s CEO, Muhunthan Thillai. What convinced me was the openness and honesty about where the company was and where it wanted to go. That sense of transparency felt refreshing and fair, and it’s something that still comes through in the culture today.
How do you think about scale when dealing with imaging pipelines?
Medical imaging data can be quite demanding on infrastructure if you haven’t planned for it properly. A single CT scan can easily exceed 500MB. Multiply that across a global clinical trial with multiple timepoints and you’re moving a very large volume of data before a single GPU has even started processing it.
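The arithmetic behind that claim is easy to sketch. The trial size and timepoint count below are illustrative assumptions, not figures from any specific study:

```python
# Back-of-envelope data volume for a hypothetical global trial.
scan_mb = 500        # one CT scan, per the figure above
patients = 1_000     # assumed global trial enrolment
timepoints = 4       # assumed: baseline plus three follow-ups

total_tb = scan_mb * patients * timepoints / 1_000_000
print(f"{total_tb} TB")  # 2.0 TB moved before any GPU work begins
```

Even with these modest assumptions you are into multi-terabyte territory, which is why ingestion, storage, and scheduling have to be designed as one chain.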
So, I tend to think about the entire journey: ingestion, storage, job scheduling, GPU utilisation, and output. If there’s a bottleneck anywhere in that chain, it quickly becomes a bottleneck everywhere, and that challenge is only increasing.
We’re moving towards importing millions of CT scans for model training, which means the architectural decisions we make now are really laying the groundwork for something much larger.
What’s the most impactful project you’ve worked on?
Being part of the analytics work probably stands out the most. We take volumetric outputs from our lung disease models and convert them into structured reports that pharmaceutical companies can use to understand whether a drug is working, or how a trial is progressing.
Seeing raw imaging data transformed into something that can support decisions in a clinical trial feels like a genuine contribution to the CRO space.
How do you balance research flexibility with production reliability?
The simplest answer is to keep them separate.
Research environments need space to experiment, break things and iterate quickly. Production systems need to be stable, auditable and compliant with GxP requirements. Trying to make one environment serve both purposes usually means you end up doing neither particularly well. Clear separation between experimentation and production allows both to operate effectively.
What excites you about the platform’s evolution over the next 2–3 years?
There's so much more to come. We're not aiming to be just another imaging CRO. The ambition is to be the most advanced and innovative company out there, providing groundbreaking AI science, and that means there are genuinely new problems to solve across multiple areas. For someone who likes variety and hates standing still, it's a pretty exciting place to be.
What advice would you give someone considering joining our software engineering team?
Be comfortable with ambiguity and genuinely curious – about the technology, the data, and the clinical problems you’re helping to solve. This isn’t a role where you can focus only on the technology. Understanding how a pipeline supports a clinical trial, or what a sponsor needs from a model output, is part of the job.
That said, the tech itself is anything but boring. We’re constantly experimenting with new tools and frameworks, working across a modern stack (Python, Django, Dagster, AWS), and staying current with where the industry is heading. If you care about AI and where engineering is going, Python is exactly the right place to be investing your time.
Curiosity and adaptability matter a lot as well. We work across disciplines with people from very different backgrounds, so being able to communicate clearly and learn from others is just as important as technical expertise. It also helps to remember that real-world data will always find ways to surprise you!