How would you like your data cleaned, sir? EDC or CDM?

Electronic Data Capture and Clinical Data Management: two classes of technology for optimizing clinical trials, each with its fans and foes.

If we were to describe the two types of solution to a layman, the response would be: ‘I don’t really see much of a difference.’

Differences do exist, and I believe they can be fundamental to the successful execution of clinical trials.

I must admit to having spent most of my career with EDC rather than CDM solutions. That may be considered a taint on my ability to speak objectively. However, in my defense, I did spend 12 months with an Oracle-based CDM, DLB Recorder, or rather eResearch Technologies’ eDM tool, as it was then called. (It has since been re-badged by OmniComm and enhanced by my former colleague Keith Howells and his team.)

During that year, I learned something of the value of structured data, batch validation, double data entry and the needs of a typical Data Manager. I am going to use this as a point of reference.

Clinical Data Management Systems – Challenges

eDM was (is) an Oracle DB/Forms-based tool that implements a table generation mechanism to create a trial data repository. In many ways it was a tool for ‘building a database’, a term that seems odd to EDC firms, where the database is not built but ‘used’, yet one that persists among users of CDM tools and CROs. The tool had a method of creating metadata structures at Global, Program and Study level, and of allowing these structures to be re-used (Cascaded and Promoted): a very strong mechanism for re-use. The tool also relied primarily on batch validation. You wrote the data validation routines in PL/SQL (or rather RPL/SQL, a syntax that interpreted down to PL/SQL) and then ran these routines in batch mode horizontally across the data. For those familiar with old-style database programming, this is nice and clean. It made efficient use of database keys, and it was relatively fast given that the routines could check all of the data with all of the checks in a single run.
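To make the batch idea concrete, here is a minimal sketch of ‘all of the checks against all of the data in a single run’. Python stands in for the PL/SQL-style routines, and every name in it (Record, Check, run_batch, the blood pressure fields) is a hypothetical illustration, not anything taken from eDM.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Record:
    """One subject-visit row; the field names are illustrative assumptions."""
    subject: str
    visit: str
    systolic: Optional[int]
    diastolic: Optional[int]


@dataclass
class Check:
    """A named validation rule; `fails` returns True when the row is suspect."""
    name: str
    fails: Callable[[Record], bool]
    message: str


CHECKS = [
    Check(
        "BP_MISSING",
        lambda r: r.systolic is None or r.diastolic is None,
        "Blood pressure incomplete",
    ),
    Check(
        "BP_ORDER",
        lambda r: r.systolic is not None
        and r.diastolic is not None
        and r.systolic <= r.diastolic,
        "Systolic should exceed diastolic",
    ),
]


def run_batch(records, checks):
    """Apply every check to every record in one pass and collect discrepancies."""
    found = []
    for record in records:
        for check in checks:
            if check.fails(record):
                found.append((record.subject, record.visit, check.name, check.message))
    return found


RECORDS = [
    Record("001", "V1", 120, 80),
    Record("001", "V2", 70, 110),
    Record("002", "V1", None, 85),
]

discrepancies = run_batch(RECORDS, CHECKS)
for row in discrepancies:
    print(row)
```

Nothing happens as the data arrives; the state of the whole database is assessed only when the batch is run.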

In the early 2000s, it became clear that collecting data directly from the sites was a good thing, and a number of bolt-on EDC attempts were made with increasing levels of success. Typically, the front-end edit checking was done in one language (VBScript) and the back end was left to batch validation. Overall, however, the result to my mind was a disappointing kludge. ‘Would you design a system that way?’ ‘No.’ ‘Would you fall into this through circumstances?’ ‘Yes.’

The devil was in the detail. Having mixed edit checking made it harder to see what was really going on, and it required a mixed set of skills to configure it. I supported work on attempting to get the front-end edit checks to fire through the back-end checks, but they were not really designed to work that way, so that solution, back in 2004, was at best a compromise.

Oracle Clinical has been down a similar road. It used the same underlying technologies (Oracle DB / Forms / JInitiator), and Oracle made two or three attempts at an EDC front end, the last being RDC, before buying InForm (to the chagrin of organizations that had invested heavily in an RDC future).

I have heard it said that it is beneficial to clean data later, rather than at the point of entry. Phooey. Performance? Nope, systems should cope. Bias? Nope, systems should be capable of directing responses to DM as well as to sites where appropriate. I will not go on…

CDM Systems – Capabilities

Some would say the ability to run batch validations across data is a must-have for any tool, CDM or EDC. However, I have only seen it be necessary with an EDC system when the tool needs it to handle study amendments. If the EDC system is well designed from the outset, this is largely a non-critical issue (sorry guys… it didn’t need to be that hard).

The primary difference with CDM systems is their ability to present data effectively in a non-eCRF mode. EDC vendors (and unfortunately, to an extent, CDISC) would have you think that the eCRF form is the best representation of data. The CDM guys know this is not necessarily the case. The data is best presented in a structure that allows it to be worked regardless of the medium it was captured from. (Yes, yes… I know… you can use ETL tools, or custom views, to convert the data… but why should I need to?)

The next reason why CDM tools are more DM-friendly is that, following a batch run, they can more easily say whether the database is ready or not. EDC systems have this idea that everything belongs to them: ‘When I see a lock symbol at the study level, everything is locked’… That would be nice, but it is still rarely the case. Recognizing the existence of life (data) outside of the EDC system is something EDC vendors struggle with… they might say ‘Data should not exist outside of EDC… so why should we offer tools to help support it?’
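As a rough illustration of the two representations, the sketch below takes datapoints held the way a form-centric system might store them (one value per subject, visit, form and field) and reshapes them into the one-row-per-subject-visit structure a Data Manager would rather work with. The tuple layout, the field names and the to_rows helper are all assumptions made for the example, not the storage model of any particular product.

```python
from collections import defaultdict

# Form-centric storage: one tuple per captured value (assumed layout).
DATAPOINTS = [
    ("001", "V1", "VITALS", "SYSBP", 120),
    ("001", "V1", "VITALS", "DIABP", 80),
    ("001", "V2", "VITALS", "SYSBP", 118),
    ("002", "V1", "VITALS", "SYSBP", 135),
]


def to_rows(datapoints, form):
    """Pivot one form's datapoints into one dict per (subject, visit)."""
    rows = defaultdict(dict)
    for subject, visit, dp_form, field, value in datapoints:
        if dp_form == form:
            rows[(subject, visit)][field] = value
    return [
        {"SUBJECT": subject, "VISIT": visit, **fields}
        for (subject, visit), fields in sorted(rows.items())
    ]


vitals = to_rows(DATAPOINTS, "VITALS")
for row in vitals:
    print(row)
```

This is exactly the kind of conversion step that a CDM-style repository makes unnecessary, because the tabular structure is what it holds in the first place.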

Electronic Data Capture – almost great

EDC systems are great! They often do about 90% of what they need to do. However, finding that remaining 10% can be tricky.

Most EDC systems work along similar lines. Edit checks are associated with one or more fields across forms and visits. When the referenced datapoint(s) are changed, the edit check fires, and if the logic is true (or sometimes false) a query, or some other action, is triggered.
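The firing model described above can be sketched in a few lines. This is an illustration under stated assumptions, not the design of any real EDC product: EditCheckEngine, register, set_value and the consent/visit date rule are all hypothetical names chosen for the example.

```python
from collections import defaultdict
from datetime import date


class EditCheckEngine:
    """Toy engine: checks fire only when a field they reference changes."""

    def __init__(self):
        self.data = {}                            # field name -> current value
        self.checks_by_field = defaultdict(list)  # field name -> checks referencing it
        self.queries = set()                      # names of checks with an open query

    def register(self, name, fields, condition):
        """Associate one check with every field it references."""
        for field in fields:
            self.checks_by_field[field].append((name, condition))

    def set_value(self, field, value):
        """Store the datapoint, then fire the checks that reference it."""
        self.data[field] = value
        for name, condition in self.checks_by_field[field]:
            if condition(self.data):
                self.queries.add(name)       # logic true: open a query
            else:
                self.queries.discard(name)   # logic false: close it if open


engine = EditCheckEngine()
engine.register(
    "VISIT_BEFORE_CONSENT",
    fields=["consent_date", "visit_date"],
    condition=lambda d: "consent_date" in d
    and "visit_date" in d
    and d["visit_date"] < d["consent_date"],
)

engine.set_value("consent_date", date(2010, 3, 1))
engine.set_value("visit_date", date(2010, 2, 20))  # visit precedes consent
opened = set(engine.queries)

engine.set_value("visit_date", date(2010, 3, 5))   # corrected value re-fires the check
print(opened, engine.queries)
```

The contrast with the batch approach is that only the checks touching the changed datapoint are evaluated, and they are evaluated at the moment of entry.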

EDC Integration awareness Challenge

I believe strongly that all data should be centralized in a good EDC system (and when I say good EDC, I mean an EDC system that has all of the necessary Data Management features). I also appreciate that this is often not possible, or not viable. Assuming this to be the typical case, it comes down to how gracefully an EDC system deals with data that is not inside it. I think if you were to ask an EDC systems developer what percentage of DM data activities are carried out inside EDC versus outside, they would say 90%. I think in reality the figure is often below 50%. That larger share of work, done outside the system, is not well catered for by EDC.

We also have a challenge when it comes to EDC data cleaning. Again, I believe that a good EDC system should be capable of cleaning the majority of data through edit checking (intelligently automated, or configured). However, some data needs to be looked at longitudinally, with the support of graphs and charts. This is where EDC is generally miserable. We have bolt-ons such as JReview, Spotfire or even Business Objects, but they aren’t very good or, to be fair, they aren’t typically implemented and supported very well. Most critically, they are uni-directional: they are not used as part of the EDC workflow. You cannot go in as a Study Lead and ‘see’ which Medical Monitor is still looking across the data, or which data they have looked at.

Next-generation EDC/CDM tools will undoubtedly address this failure to integrate data review with the system. Companies that continue to offload this as non-core functionality will be trounced by nimbler, more perceptive providers.

Going back to my point about partial integration: the Medidata Rave system has an excellent set of Web Services for sending and receiving data and metadata. Unfortunately, despite the fact that they have existed for almost 5 years, take-up has been slower than they merit. To the marketplace, it is one piece of a two-piece puzzle. Companies such as CRF Health have stepped up; they have a nice, fully developed eDiary feed tool, TrialMax Synapse. Other companies, though, tend to want to get away with one-off custom programming jobs that satisfy individual studies but fail to make things easy or scalable. Unfortunately, the typical audience for these studies cannot tell the good from the bad. With the one-off approach, they are only aware that a) it was not that easy, and b) it was expensive. A properly integrated approach may be shot down before it really has a chance.

Poor integration can leave EDC providing only a partial solution, and make it difficult to satisfy the other parts.

EDC, CDM or something else in the future

I look forward to working with the next-generation platforms that are neither EDC nor CDM, and that can leverage the potential to have accurate and timely data directly from patients. I think the answer is not to satisfy one user demographic, but to serve all users in their roles in support of a successful clinical trial.
