The LA Freeway model of clinical monitoring

A freeway paradigm helps explain why onsite visits by study monitors don’t work and helps us plan and implement an effective system for protocol compliance monitoring of all sites, all data, all the time that saves time and money.

But first – let’s consider some  special aspects of clinical trial data:

Clinical trial data is highly dimensional data.

Clinical trial data is not “big data” but it is highly-dimensional in terms of variables or features of a particular subject.

Highly dimensional data is often found in biology;  a common example of highly dimensional data in biology is gene sequencer output. There are often tens of thousands of genes (features), but only tens of hundreds of samples.

In medical device clinical trial, there may be thousands of features but only tens of subjects.

Traditional protocol compliance monitoring uses on-site visits  and SDV  (source document verification) that requires visual processing of the information at the “scene”.   Since the amount of visual information available at the scene is enormous,  a person processes only a subset of the scene.

Humans focus on the interesting facets of a scene ignoring the rest. This is explained by selective attention theory.

Selective attention.

Selective attention is a cognitive process in which a person attends to one or a few sensory inputs while ignoring the other ones.

Selective attention can be likened to the manner by which a bottleneck restricts the flow rate of a fluid.

The bottleneck doesn’t allow the fluid to enter into the body of the bottle all at once; rather, it lets the fluid to enter in certain amounts depending on the flow rate, until all of it has entered the bottle’s body.

Selective attention is necessary for us to attend consciously to sensory stimuli in such a way that we will not experience sensory overload. See the article in Wikipedia on Attenuation theory.


How to swim in the cold water of hybrid trials

(The water in the pool in Cascais in October was < 12 degrees)

There are over 135 COVID-19 vaccine candidates in the pipeline at the time of writing this post

Although there are always personal, political and emotional preferences, we should probably not select a clinical data management platform like we choose villas in exotic vacation spots.

The reality is that simplification of the protocol and the study conduct will have a much bigger impact on time to completion than choice of a particular eClinical platform.

COVID-19 has much wider ramifications for the clinical trial industry beyond the next 12-18 months.

COVID-19 signals a driving concern to use technology to acquire valid data from clinical trials in the fastest way possible.

Virtual clinical trials

One approach is to recruit and collect data from patients online in what is called a virtual trials model.

There is a gigantic amount of buzz on virtual clinical trials because of COVID-19. The idea is to go direct to patient with digital tools and engagement. This is theoretically supposed to cut out all the friction for recruitment and and overhead of research sites.

With all the buzz on virtual trials, no one seems to know how many virtual trials are actually being conducted. (There is a well-known axiom that technology adoption is inversely proportional to PR).

It may be that in the future, fewer than 5% of trials would remain all paper. Maybe 5%, will go fully virtual.

Hybrid clinical trials

The action will be in the middle in “hybrid”. Trials are moving away from paper, “virtualizing” a process here, a step there. I am looking to see how much of this is taking place, and to what extent COVID-19 accelerates it, and which processes and steps are virtualizing the fastest.

Based on observing 12 hybrid trials running right now on the platform, today, I can assert that hybrid trials are complex distributed systems with a whole new set of challenges that make the old site/investigator-centric model look like a stroll in the park.

Connected medical device vendors understand the value of merging patient, clinical and device data into real-time data streams.  Once you have a real-time data stream, you can use real-time automated monitoring.

However, bringing merged real-time streams of patient, device and clinical investigator data into the domain of mainstream drug trials is hugely challenging because the data sources are highly heterogenous.

Combining patient outcome reporting with mobile apps, passive data collection from wearables and phones and site monitor data entry creates a complex distributed system of data sources.   Such a complex distributed system cannot possibly be monitored by assuming that there is a single paper source document. That assumption is no longer valid.

Observability of events

We need to correlate and group events across different systems, applications and users. We need to achieve low level observability of a patient while attributing it to top level cohorts and sites in the study.

This is especially difficult since the different EDC systems and digital appliances were not designed for monitoring.

You can see a presentation here on using pivot tracing for dynamic causal monitoring of distributed systems. This is work done by Jonathan Mace while he was a PhD student at Brown.   The work was done on HDFS but the concepts are applicable to virtual and hybrid clinical trials.  is a cloud platform that automates detection of deviations in clinical trials using these general concepts.

Flask provides an immediate picture of what’s going on. The picture can then be grouped by patient, physician, principal investigator, project managers all the way up to the  VP Clinical and CEO. In political terms, you might say that  we  democratize the process of observing clinical trials using metrics and automation.

Automation can be used to speed delivery of valid data to decision makers in clinical trials. The basic idea is to monitor with alerts.  Some of the ideas from the talk:

Alerts are metrics over/under a threshold

Alerts are urgent, important, actionable and real
In the world of alerts, symptoms are better than causes
Validate: Are we calculating the right metric?
Verify: Are we calculating the metric right?

Do it Fast. fast. fast.

You can see my talk here: Automated Detection and response for clinical trials.

Originally published on

Home alone and in a clinical trial

1 in 7 American adults live alone

What is atherothrombosis?

If you are age 40 to 60 and live alone, your CV system is at high risk

Can social networking mitigate the risk of living alone?

Social networks detach people from meaningful interactions with one another

We expect more from technology and less from each other

Digital technology enables real interactions with real primary care teams.

This was first published on Medium at Can digital mitigate the risk of living alone

Hack back the user interface for clinical trials

As part of my campaign for site-coordinator and study-monitor centric clinical trials; we first need to understand how to exploit a vulnerability in human psychology.

As a security analyst, this is the way I look at things – exploits of vulnerabilities.

In 2007, B.J. Fogg, founder and director of the Stanford Behavior Design Lab taught a class on “mass interpersonal persuasion. A number of students in the class went on to apply these methods at Facebook, Uber and Instagram.

The Fogg behavior model says that 3 things need to happen simultaneously to initiate a behavior: Motivation (M), ability (A) and a trigger (T).

When we apply this model to patient-centric trials, we immediately understand why patient-centricity is so important.

Motivation – the patient wants therapy (and may also be compensated for her participation).

Ability is facility of action. Make it easy for a patient to participate and they will not need a high energy level to perform the requisite study tasks (take a pill, operate a medical device, provide feedback on a mobile app).

Without an external trigger, the desired behavior (participating in the study in a compliant way) will not happen.  Typically, text messages are used to remind the patient to do something (take treatment or log an ePRO diary).  A reminder to log a patient diary is a distraction; when motivation and ability exceed the trigger energy level, then the patient will comply. If the trigger energy level is too high (for example – poor UX in the ePRO app) then the patient will not comply.    Levels of protocol adherence will be low.

The secret is designing the study protocol and the study UX so that the reminder trigger serves the patient and not the patient serving the system.

People-centric clinical trials

Recall – that any behavior ( logging data, following up) requires 3 things: motivation, ability and a trigger.

A site coordinator can be highly motivated. She may be well trained and able to use the EDC system even the UX is vintage 90s.

But if the system doesn’t give anything back to her; reminders to close queries or to follow-up are just distractions.

The secret is designing the study protocol and the study UX so that the reminder trigger serves the CRC and CRA and not the CRC, CRA are serving the system.

When we state the requirement as a trigger serving the person – we then understand that it is not about patient-centricity.

It is about people-centricity.



A better tomorrow for clinical trials

A better tomorrow – Times of crisis usher in new mindsets

By David Laxer. Spoken from the heart.

In these trying days, as we adjust to new routines and discover new things about ourselves daily, we are also reminded that the human spirit is stronger than any pandemic and we have survived worse.

And because we know we’re going to beat this thing, whether in 2 weeks or 2 months, we also know that we will eventually return to normal, or rather, a new normal.

In the meantime, the world is showing a resolve and a resilience that gives us much room to hope for a better tomorrow for developing new therapeutics.

However, these days have got us wondering how things might have looked if clinical trials were conducted differently. It’s a well-known fact that clinical trials play an integral role in the development of new, life-saving drugs, but by the time they get approved by the FDA it takes an average of 7.5 years and anywhere between $150m-2bn per drug.

Reasons for failure

Many clinical studies still use outdated methods for data collection and verification: they still use a fax for crying out. They continue to manually count leftover pills in bottles, and still rely on patients’ diary entries to ensure adherence.

Today, the industry faces new challenges to recruit enough participants as COVID-19 forces people to stay at home and out of research hospital sites. 

Patient drop-outs, adverse events and delayed recording of adverse events  are still issues for pharma and medical device companies conducting clinical research.  The old challenge of creating interpretable data to examine safety and efficacy of new therapeutics remain.

The Digital Revolution:

As hard as it is to believe, the clinical trial industry just might be the last major industry to undergo digital transformation..

As every other aspect of modern life has already been digitized, from banking to accounting to education now, more than ever, is the time to accelerate the transition of this crucial process, especially as we are painfully reminded of the need for finding a vaccine.  Time is not a resource we can waste any longer.

Re-imagining the future

When we created FlaskData we were primarily driven by our desire to disrupt the clinical trial monitoring paradigm  and bring it into the 21st century — meaning real-time data collection and automated detection and response. From the beginning we found fault in the fact that clinical trials were, and still are overly reliant on manual processes  and this causes unacceptable delays in bringing new and essential drugs and devices to market. These delays, as we are reminded during these days, not only cost money and time, but ultimately they cost us lives.

To fully achieve this digitization it’s important to create a secure cloud service that can accelerate the entire  process, and provide sponsors with an immediate picture and interpretable data without having to spend 6-12 months cleaning data.  This is achieved with real-time data collection, automated detection and response and an open API that enables any healthcare application to collect clinical-trial-grade data and assure patient adherence to the clinical protocol.

Our Promise:

It didn’t take a virus to make us want to deliver new medical breakthroughs into the hands that need them most, but it has definitely made us double down on our resolve to see it through. The patient needs to be placed at the center of the clinical research process and we are tasked to reduce the practical, geographical and financial barriers to participation. The end result is a more engaged patient, higher recruitment and retention rates, better data and reduced study timelines and costs.

The Need For Speed

As the world is scrambling to find a vaccine for Corona, we fully grasp 2 key things: 1) Focus on patients and 2) Provide clinical operations teams with the ability to eliminate inefficiencies and move at lightning speed. In these difficult times, there is room for optimism as it is crystal clear, just how important it is to speed up the process.


Social Distancing

In this period of social distancing, we can only wonder about the benefits of conducting clinical trials remotely. We can only imagine how many trials have been rendered useless as patients, reluctant to leave their houses have skipped the required monitoring, have forgotten to take their pills and their diary entries have gotten lost amidst the chaos.

With a fully digitized process for electronic data collection, social distancing would have no effect on the clinical trial results.

About David Laxer

David is a strategist and story-teller. He says it best – “Ultimately, when you break it down, I am a storyteller and a problem solver. The kind that companies and organizations rely on for their brand DNA, culture and long-lasting reputation”.


Reach out to David on LinkedIn

7 tips for an agile healthtech startup

It’s a time when we are all remote-workers.   Startups looking for new ways to add value to customers.  Large pharmas looking for ways to innovate without breaking the system.

To quote Bill Gates from 25 years ago. Gates was asked how Microsoft can compete in enterprise software when they only had business-unit capabilities.  Gates was quoted as saying that large enterprises are a collection of many business units, so he was not worried.

The same is true today – whether you are a business unit in Pfizer or a 5-person healthtech startup

Here are 7 tips for innovation in healthcare

1. One person in the team will be a technical guru, let’s call him/her the CTO. Don’t give the CTO admin access to AWS.  He / she should not be fooling around with your instances. Same for sudo access to the Linux machines.
2. Make a no rule – No changes 1 hour before end of day. No changes Thursday/Friday
3. Security – think about security before writing code.  Develop a threat model first. I’ve seen too many startups get this wrong.   Also big HMOs get it wrong.
4. Standards – standardize on one dev stack – listen to the CTO but do not try new things. If a new requirement comes up, talk about it, be critical, sleep on it.    Tip – your CTO’s first inclination will be to write code – this is not always the best strategy – the best is not writing any code at all.  You may be tempted to use some third-party tools like Tableaux – be very very careful.   The licensing or the lack of multi-tenancy may be a very bad  fit for you – so always keep your eye on your budget and business model.
5. Experiment – budget for experimentation by the dev team. Better to plan an experiment and block out time/money for it and fail than get derailed in an unplanned way.  This will also keep things interesting for the team and help you know that they are not doing their own midnight projects.
6. Minimize – always be removing features.  Less is more.
7. CAPA – (corrective and preventive action) – Debrief everything.  Especially failures. Document in a Slack channel and create follow-up actions (easy in slack – just star them).

Streaming clinical trials in a post-Corona future

Last week, I wrote about using automated detection and response technology to mitigate the next Corona pandemic.

Today – we’ll take a closer look at how streaming data fits into virtual clinical trials.

Streaming – not just for Netflix

Streaming real-time data and automated digital monitoring is not a foreign idea to people quarantined at home during the current COVID-19 pandemic.   Streaming: We are at home and watching Netflix.   Automated monitoring: We are now using digital surveillance tools based on mobile phone location data to locate and track people who came in contact with other CORONA-19 infected people.

Slow clinical trial data management. Sponsors flying blind.

Clinical trials use batch processing of data. Clinical trials currently do not stream patient / investigator signals in order to manage risk and ensure patient safety.

The latency of batch processing in clinical trials is something like 6-12 months if we measure the time from first patient in to the time a bio-statistician starts working on an interim analysis.

Risk-based monitoring for clinical trials uses batch processing to produce risk profiles of sites in order to prioritize another batch process – namely site visits and SDV (source data verification).

The latency of central CRO monitoring using RBM ranges wildly from 1 to 12 weeks. This is reasonable considering that the design objective of RBM is to prioritize a batch process of site monitoring that runs every 5-12 weeks.

In the meantime – the study is accumulating adverse events and dropping patients to non-compliance and the sponsor is flying blind.

Do you think 2003 vintage data formats will work in 2020 for Corona virus?

An interesting side-effect of batch processing for RBM is use of SDTM for processing data and preparing reports and analytics.

SDTM provides a standard for organizing and formatting data to streamline processes in collection, management, analysis and reporting. Implementing SDTM supports data aggregation and warehousing; fosters mining and reuse; facilitates sharing; helps perform due diligence and other important data review activities; and improves the regulatory review and approval process. SDTM is also used in non-clinical data (SEND), medical devices and pharmacogenomics/genetics studies.

SDTM is one of the required standards for data submission to FDA (U.S.) and PMDA (Japan).

It was never designed nor intended to be a real-time streaming data protocol for clinical data. It was first published in June 2003. Variable names are limited to 8 characters (a SAS 5 transport file format limitation).

For more information on SDTM, see the 2011 paper by Fred Woods describing the challenges to create SDTM datasets.   One of the surprising challenges is data/time formats – which continue to stymie biostats people to this day.  See Jenya’s excellent post on the importance of collecting accurate date-time data in clinical trials. We have open, vendor-neutral standards and JavaScript libraries to manipulate dates. It is a lot easier today than it was in June 2003.

COVID-19 – we need speed

In a post COVID-19 era, site monitoring visits are impossible and patients are at home. Now, demands for clinical trials are outgrowing the batch-processing paradigm.   Investigators, nurses, coordinators and patients cannot wait for the data to be converted to SDTM, processed in a batch job and sent to a data manager.  Life science sponsors need that data now and front-line teams with patients need an immediate response.

Because ePRO, EDC and wearable data collection are siloed (or waiting for batch file uploads using USB connection like Phillips Actiwatch or Motionwatch), the batch ETL tools cannot process the data.  To place this in context; the patient has to come into the site, find parking, give the watch to a site coordinator, who needs to plug the device into USB connection, upload the data and then import the data to the EDC who then waits for an ETL job converting to SDTM and processing to an RBM system.

Streaming data for clinical research in a COVID-19 era

In order to understand the notion of streaming data for clinical research in a COVID-19 era, I drew inspiration and shamelessly borrowed the graphics from Bill Scotts excellent article on Apache Kafka – Why are you still doing batch processing? “ETL is dead”.

Crusty Biotech

The Crusty biotech company have developed an innovative oral treatment called Crusdesvir for Corona virus.   They contract with a site, Crusty Kitchen to test safety and efficacy of Crusdesvir. Crusty Kitchen has one talented PI and an efficient site team that can process 50 patients/day.

The CEO of Crusty Biotech decides to add 1 more site, but his clinical operations process is built for 1 PI at a time who can perform the treatment procedure in a controlled way and comply with the Crusdesvir protocol.  It’s hard to find a skilled PI and site team but the finally finds one and signs a contract with them.

Now they need to add 2 more PI’s and sites and then 4.   With the demand to deliver a working COVID-19 treatment, Crusty Biotech needs to recruit more sites who are qualified to run the treatment.    Each site needs to recruit (and retain more treatments).

The Crusty Biotech approach is an old-world batch workflow of tasks wrapped in a rigid environment. It is easy to create, it works for small batches but it is impossible to grow (or shrink) on demand. Scaling requires more sites, introduces more time into the process, more moving parts, more adverse events, less ability to monitor with site visits and the most crucial piece of all – lowers the reliability of the data, since each site is running its own slow-moving, manually-monitored process.

Castle Biotech

Castle Biotech is a competitor to Crusty Biotech – they also have an anti-viral treatment with great potential.    They decided to plan for rapid ramp-up of their studies by using a manufacturing process approach with an automated belt delivering raw materials and work-in-process along a stream of work-stations.   (This is how chips are manufactured btw).

Belt 1:Ingredients, delivers individual measurements of ingredients

Belt 1 is handled by Mixing-Baker, when the ingredients arrive, she knows how to mix the ingredients, then put mixture onto Belt 2.

Belt 2:Mixture, delivers the perfectly whisked mixture.

Belt 2 is handled by Pan-Pour-Baker, when the mixture arives, she can delicately measure and pour mixture into the pan, then put pan onto Belt 3.

Belt 3:Pan, delivers the pan with exact measurement of mixture.

Belt 3 is handled by Oven-Baker, when the pan arrives, she puts the pan in the oven and waits the specific amount of time until it’s done. When it is done, she puts the cooked item on the next belt.

Belt 4:Cooked Item, delivers the cooked item.

Belt 4 is handled by Decorator, when the cooked item arrives, she applies the frosting in an interesting and beautiful way. She then puts it on the next belt.

Belt 5:Decorated Cupcake, delivers a completely decorated cupcake.

We see that once the infrastructure is setup, we can easily add more bakers (PI’s in our clinical trial example) to handle more patients.  It’s easy to add new cohorts, new designs by adding different types of ‘bakers’ to each belt.

How does cupcake-baking relate to clinical data management?

The Crusty Biotech approach is old-world batch/ETL – a workflow of tasks set in stone. 

It’s easy to create. You can start with a paper CRF or start with a low-cost EDC. It works for small numbers of sites and patients and cohorts but it does not scale.

However, the process breaks down when you have to visit sites to monitor the data and do SDV because you have a paper CRF.  Scaling the site process requires additional sites, more data managers, more study monitors/CRAs, more batch processing of data, and more round trips to the central monitoring team and data managers. More costs, more time and 12-18 months delay to deliver a working Corona virus treatment.

The Castle Biotech approach is like data streaming. 

Using a tool like Apache Kafka, the belts are topics or a stream of similar data items, small applications (consumers) can listen on a topic (for example adverse events) and notify the site coordinator or study nurse in real-time.   As the flow of patients in a study grows, we can add more adverse event consumers to do the automated work.

Castle Biotech is approaching the process of clinical research with a patient-centric streaming and digital management model, which allows them to expand the study and respond quickly to change (the next pandemic in Winter 2020?).

The moral of the story – Don’t Be Krusty.



So what’s wrong with 1990s EDC systems?

Make no doubt about it, the EDC systems of 2020 are using a 1990’s design. (OK – granted, there are some innovators out there like ClinPal with their patient-centric trial approach but the vast majority of today’s EDC systems, from Omnicomm to Oracle to Medidata to Medrio are using a 1990’s design. Even the West Coast startup Medable is going the route of if you can’t beat them join them and they are fielding the usual alphabet soup of buzz-word compliant modules – ePRO, eSource, eConsent etc. Shame on you.

Instead of using in-memory databases for real-time clinical data acquisition, we’re fooling around with SDTM and targeted SDV.

When in reality – SDTM is a standard for submitting tabulated results to regulatory authorities (not a transactional database nor an appropriate data model for time series).  And even more reality – we should not be doing SDV to begin with – so why do targeted SDV if not to perpetuate the CRO billing cycle.

Freedom from the past comes from ridding ourselves of the clichés of today.


Personally – I don’t get it. Maybe COVID-19 will make the change in the paper-batch-SDTM-load-up-the-customer-with-services system.

So what is wrong with 1990s EDC?

The really short answer is that computers do not have two kinds of storage any more.

It used to be that you had the primary store, and it was anything from acoustic delay-lines filled with mercury via small magnetic dougnuts via transistor flip-flops to dynamic RAM.

And then there were the secondary store, paper tape, magnetic tape, disk drives the size of houses, then the size of washing machines and these days so small that girls get disappointed if think they got hold of something else than the MP3 player you had in your pocket.

And people still program their EDC systems this way.

They have variables in paper forms that site coordinators fill in on paper and then 3-5 days later enter into suspiciously-paperish-looking HTML forms.

For some reason – instead of making a great UI for the EDC, a whole group of vendors gave up and created a new genre called eSource creating immense confusion as to why you need another system anyhow.

What the guys at Gartner euphemistically call a highly fragmented and non-integrated technology stack.
What the site coordinators who have to deal with 5 different highly fragmented and non-integrated technology stacks call a nightmare.



Now we have variables in “memory” and move data to and from “disk” into a “database”.

I like the database thing – where clinical people ask us – “so you have a database”. This is kinda like Dilbert – oh yeah – I guess so. Mine is a paradigm-shifter also.

Anyhow, today computers really only have one kind of storage, and it is usually some sort of disk, the operating system and the virtual memory management hardware has converted the RAM to a cache for the disk storage.

The database process (say Postgres) allocate some virtual memory, it tells the operating system to back this memory with space from a disk file. When it needs to send the object to a client, it simply refers to that piece of virtual memory and leaves the rest to the kernel.

If/when the kernel decides it needs to use RAM for something else, the page will get written to the backing file and the RAM page reused elsewhere.
When Postgres next time refers to the virtual memory, the operating system will find a RAM page, possibly freeing one, and read the contents in from the backing file.

And that’s it.

Virtual memory was meant to make it easier to program when data was larger than the physical memory, but people have still not caught on.
And maybe with COVID-19 and sites getting shut-down; people will catch on that a really nifty user interface for GASP – THE SITE COORDINATORS and even more AMAZING – a single database in memory for ALL the data from patients, investigators and devices.

Because at the end of the day – grandma knows that there ain’t no reason not to have a single data model for everything and just shove it into virtual memory for instantaneous, automated DATA QUALITY, PATIENT SAFETY AND RISK ASSESSMENT in real-time.

Not 5-12 weeks later for research site visit or a month later after the data management trolls in the basement send back some reports with queries and certainly not spending 6-12 months cleaning up unreliable data due to the incredibly stupid process of paper to forms to disk to queries to site visits to data managers to data cleaning.

I love being a CRA, but the role as it exists today is obsolete.

I think that COVID-19 will be the death knell for on-site monitoring visits and SDV.    Predictions for 2020 and the next generation of clinical research – mobile EDC for sites, patients and device integration that just works.

I’m neither a clinical quality nor a management consultant. I cannot tell a CRO not to bill out hours for SDV and CRA travel and impact study budget by 25-30% and delay results by 12-18 months.

Nope.   I’m not gonna tell CROs what to do.    Darwin will do that for me.

I develop and support technology to help life science companies go faster to market.  I want to save lives by shortening time to complete clinical trials for COVID-19 vaccine and treatments by 3-6 months.

I want to provide open access to research results – for tomorrow’s pandemic.

I want to  enable real-time data sharing.

I want to enable participants in the battle with COVID-19 to share real-world / placebo arm data, making the fight with COVID-19 more efficient and collaborative and lay the infrastructure for the next wave of pandemics.

I want to provide real-time data collection for hospitals, patients and devices.  Use AI-driven detection of protocol violations and automated response to enable researchers to dramatically improve data reliability, allowing better decision making and improving patient safety.

The FDA (a US government regulatory bureaucracy) told the clinical trial industry to use e-Source 10 years ago and to use modern IT .  If FDA couldn’t then maybe survival of the fittest and COVID-19 well do the job.

FDA’s Guidance for Industry: Electronic Source Data in Clinical Investigations, says, in part:
“Many data elements (e.g., blood pressure, weight, temperature, pill count, resolution of a symptom or sign) in a clinical investigation can be obtained at a study visit and can be entered directly into the eCRF by an authorized data originator. This direct entry of data can eliminate errors by not using a paper transcription step before entry into the eCRF. For these data elements, the eCRF is the source. If a paper transcription step is used, then the paper documentation should be retained and made available for FDA inspection.”

I loved this post by Takoda Roland on the elephant in the room.

Source data validation can easily account for more than 80% of a monitor’s time. You go on site (or get a file via Dropbox). Then you need  to page through hundreds of pages of source documents to ensure nothing is missing or incomplete. Make sure you check the bare minimum amount of data before you need to rush off to catch my flight, only to do it all again tomorrow in another city, I am struck with this thought: I love being a CRA, but the role as it exists today is obsolete.

Opinion: A Futurist View on the Use of Technology in Clinical Trials


Using automated detection and response technology mitigate the next Corona pandemic

What happens the day after?   What happens next winter?

Sure – we must find effective treatment and vaccines.  Sure – we need  to reduce or eliminate the need for on-site monitoring visits to hospitals in clinical trials.  And sure – we need to enable patient monitoring at home.

But let’s not be distracted from 3 more significant challenges:

1 – Improve patient care

2 – Enable real-time data sharing. Enable participants in the battle with COVID-19 to share real-world / placebo arm data, making the fight with COVID-19 more efficient and collaborative.

3- Enable researchers to dramatically improve data reliability, allowing better decision making and improving patient safety.

Clinical research should ultimately improve patient care.

The digital health space is highly fragmented (I challenge you to precisely define the difference between patient engagement apps and patient adherence apps and patient management apps).  There are over 300 digital therapeutic startups. We are lacking a  common ‘operating system and  there is a dearth of vendor-neutral standards that would enable interoperability between different digital health systems mobile apps and services.

By comparison – clinical trials have a well-defined methodology, standards (GCP) and generally accepted data structures in case report forms.  So why do many clinical trials fail to translate into patient benefit?

A 2017 article by Carl Heneghan, Ben Goldacre & Kamal R. Mahtani “Why clinical trial outcomes fail to translate into benefits for patients”  (you can read the Open Access article here) states the obvious: that the objective of clinical trials is to improve patients’ health.

The article points at  a number of serious  issues ranging from badly chosen outcomes, composite outcomes, subjective outcomes and lack of relevance to patients and decision makers to issues with data collection and study monitoring.

Clinical research should ultimately improve patient care. For this to be possible, trials must evaluate outcomes that genuinely reflect real-world settings and concerns. However, many trials continue to measure and report outcomes that fall short of this clear requirement…

Trial outcomes can be developed with patients in mind, however, and can be reported completely, transparently and competently. Clinicians, patients, researchers and those who pay for health services are entitled to demand reliable evidence demonstrating whether interventions improve patient-relevant clinical outcomes.

There can be fundamental issues with study design and how outcomes are reported.

This is an area where modeling and ethical conduct intersect;  both are 2 critical areas.

Technology can support modeling using model verification techniques (used in software engineering, chip design, aircraft and automotive design).

However, ethical conduct is still a human attribute that can neither be automated nor replaced with an AI.

Let’s leave modeling to the AI researchers and ethics to the bioethics professionals

For now at least.

In this article, I will take a closer look at 3 activities that have a crucial impact on data quality and patient safety. These 3 activities are orthogonal to the study model and ethical conduct of the researchers:

1 – The time it takes to detect and log protocol deviations.

2 – Signal detection of adverse events (related to 1)

3 – Patients lost to follow-up (also related to 1)

Time to detect and log deviations

The standard for study monitors is to visit investigational sites once ever 5-12 weeks.   A Phase IIB study with 150 patients that lasts 12 months would typically have 6-8 site visits (which incidentally, cost the sponsor $6-8M including the rewrites, reviews and data management loops to close queries).

Adverse events

As reported by Heneghan et al:

A further review of 11 studies comparing adverse events in published and unpublished documents reported that 43% to 100% (median 64%) of adverse events (including outcomes such as death or suicide) were missed when journal publications were solely relied on [45]. Researchers in multiple studies have found that journal publications under-report side effects and therefore exaggerate treatment benefits when compared with more complete information presented in clinical study reports [46]

Loss of statistical significance due to patients lost to follow-up

As reported by Akl et al in  “Potential impact on estimated treatment effects of information lost to follow-up in randomized controlled trials (LOST-IT): systematic review” (you can see the article here):

When we varied assumptions about loss to follow-up, results of 19% of trials were no longer significant if we assumed no participants lost to follow-up had the event of interest, 17% if we assumed that all participants lost to follow-up had the event, and 58% if we assumed a worst case scenario (all participants lost to follow-up in the treatment group and none of those in the control group had the event).

Real-time data

Real-time data (not data collected from paper forms 5 days after the patient left the clinic) is key to providing an immediate picture and assuring interpretable data for decision-making.

Any combination of data sources should work – patients, sites, devices, electronic medical record systems, laboratory information systems or some of your own code. Like this:

Mobile eSource mobile ePRO medical device API

Signal detection

The second missing piece is signal detection for safety, data quality and risk assessment of patient, site and study,

Signal detection should be based upon the clinical protocol and be able to classify the patient into 1 of 3 states: complies, exception (took too much or too little or too late for example) and miss (missed treatment or missing data for example).

You can visualize signal classification as putting the patient state into 1 of 3 boxes like this:Automated response

One of the biggest challenges for sponsors running clinical trials is delayed detection and response.   Protocol deviations are logged 5-12 weeks (and in a best case 2-3 days) after the fact.   Response then trickles back to the site and to the sponsor – resulting in patients lost to follow-up and adverse events that were recorded long after the fact..

If we can automate signal detection then we can also automate response and then begin to understand the causes of the deviations.    Understanding context and cause is much easier when done in real-time.        A good way to illustrate is to think about what you were doing today 2 weeks ago and try and connect that with a dry cough, light fever and aching back.   The symptoms may be indicative of COVID-19 but y0u probably don’t remember what you were doing and  with whom you came into close contact.     The solution to COVID-19 back-tracking is use of digital surveillance and automation. Similarly, the solution for responding to exceptions and misses is to digitize and automate the process.

Like this:

Causal flows of patient adherence


In summary we see 3 key issues with creating meaningful outcomes for patients:

1 – The time it takes to detect and log protocol deviations.

2 – Signal detection of adverse events and risk (related to 1)

3 – Patients lost to follow-up (also related to 1)

These 3 issues for creating meaningful outcomes for patients can be resolved with 3 tightly integrated technologies:

1 – Real-time data acquisition for patients, devices and sites (study nurses, site coordinators, physicians)

2 – Automated detection

3 – Automated response