CRO-to-sponsor data transitions are a recurring source of delays in NDA preparation. A sponsor that receives data from a contract research organization (CRO) at study completion, expecting to proceed directly to submission preparation, frequently encounters problems that delay the submission by 4–10 weeks. Three failure modes appear consistently across these problematic handoffs, and understanding them in advance is the basis for negotiating a handoff package that avoids them.

Failure Mode 1: Undocumented Derivation Logic

CROs build derivation programs — SAS, R, or Python code that transforms raw EDC data into analysis-ready datasets — over the course of a study. This code works when the CRO's statistical programmers are active on the study. It may not work when transferred to a sponsor whose team was not involved in writing it.

The specific problems that arise:

Hardcoded paths and library references. SAS programs frequently reference library locations using hardcoded paths ("/cro/studies/study123/raw/") or CRO-specific SAS autoexec configurations. These programs cannot run in the sponsor's environment without modification. The number of hardcoded references in a typical Phase III SAS program library ranges from dozens to hundreds, and identifying all of them requires reading every program — not just the top-level driver programs.
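Reading every program for hardcoded references can be partially automated. A minimal sketch (the regex is a heuristic for common cases, not a complete SAS parser, and the example paths are illustrative):

```python
import re

# Heuristic: flag libname and %include statements that point at an absolute
# path instead of a parameterized macro variable such as &studyroot.
HARDCODED = re.compile(r'(?i)^\s*(?:libname\s+\w+|%include)\s+["\']/')

def find_hardcoded_refs(sas_source: str) -> list[str]:
    """Return lines of a SAS program that reference absolute library paths."""
    return [line.strip() for line in sas_source.splitlines()
            if HARDCODED.search(line)]

example = '''
libname raw "/cro/studies/study123/raw/";
libname adam "&studyroot./adam";        /* parameterized: portable */
%include "/cro/macros/setup.sas";
'''
flags = find_hardcoded_refs(example)    # the two absolute-path references
```

A scan like this is a triage tool: it shortlists the programs that need manual modification before the library can run in the sponsor's environment.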

Undocumented macro dependencies. CROs build standard macro libraries for common derivation tasks (date calculations, baseline flagging, CT term lookups) that are called from multiple programs. If these macros are not delivered with the study programs, or are delivered without documentation of their function, the study programs cannot run in the sponsor's environment. This is not an edge case — it happens in the majority of CRO handoffs where the CRO's standard library is treated as proprietary infrastructure rather than as a deliverable.
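Missing macro dependencies can be surfaced before attempting a full run by diffing the macros invoked against the macros defined across the delivered programs. A heuristic sketch with hypothetical program contents (the built-in exclusion list is deliberately incomplete):

```python
import re

def undefined_macros(programs: dict[str, str]) -> set[str]:
    """Macro names invoked as %name(...) with no %macro definition delivered.
    A small set of SAS built-in statements is excluded to reduce noise."""
    builtins = {"put", "let", "if", "then", "else", "do", "end", "include",
                "sysfunc", "scan", "eval", "str", "upcase", "macro", "mend"}
    defined, called = set(), set()
    for src in programs.values():
        defined |= {m.lower() for m in re.findall(r'(?i)%macro\s+(\w+)', src)}
        called |= {m.lower() for m in re.findall(r'%(\w+)\s*\(', src)}
    return called - defined - builtins

# Hypothetical delivered programs calling CRO standard-library macros
programs = {
    "adsl.sas": '%baseflag(in=sdtm.vs, out=work.bl);\n%macro local_fmt; %mend;',
    "adae.sas": '%ctlookup(var=AEDECOD);\n%local_fmt();',
}
missing = undefined_macros(programs)   # standard-library macros not delivered
```

Every name in the result is either a CRO standard-library macro that must be added to the deliverable or a call that will fail in the sponsor's environment.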

Dataset programs that rely on intermediate datasets not included in the handoff. A SAS ADaM generation program typically processes a chain of intermediate datasets: raw EDC data → SDTM domains → ADaM input datasets → final ADaM datasets. If the handoff package contains the final ADaM datasets but not the intermediate datasets or the programs that generate them, the sponsor can use the final datasets but cannot regenerate them after a protocol amendment or a data correction. This creates a problem for any post-handoff analysis that requires re-running the derivation pipeline.
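This gap can also be checked mechanically: given each delivered program's declared inputs, any dataset that is neither delivered nor produced by another delivered program marks a break in the regeneration chain. A sketch with hypothetical dataset names:

```python
def missing_inputs(pipeline: dict[str, list[str]], delivered: set[str]) -> set[str]:
    """Datasets a program reads that are neither delivered as files
    nor produced by another program in the delivered pipeline."""
    produced = set(pipeline)   # outputs of the delivered programs
    needed = {d for inputs in pipeline.values() for d in inputs}
    return needed - delivered - produced

# output dataset -> input datasets, per the program headers
pipeline = {"ADLB": ["LB", "ADSL"], "ADSL": ["DM", "EX"]}
delivered = {"ADLB", "ADSL"}   # final ADaM only, no SDTM or raw inputs
gaps = missing_inputs(pipeline, delivered)   # LB, DM, EX: chain cannot be re-run
```

A non-empty result means the sponsor can use the final datasets but cannot regenerate them after a protocol amendment or data correction.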

Prevention: The data management agreement with the CRO should specify that the handoff package includes: all programs with full library dependencies, all intermediate datasets, all macro library code with function documentation, and a tested execution guide that verifies the programs run from scratch on a clean environment. The last requirement — a tested execution guide — is the most valuable and the most commonly omitted. A CRO that can demonstrate the programs run from scratch in an independent environment is delivering a functional transfer, not just a file dump.

Failure Mode 2: Missing or Incomplete define.xml

Define.xml is the machine-readable study data specification that accompanies SDTM and ADaM datasets. It describes every variable in every dataset, including type, length, label, controlled terminology reference, and derivation description. For an NDA submission, define.xml is a required submission component.

The failure mode is not usually a completely missing define.xml — most CROs deliver one. The problem is an incomplete define.xml that does not accurately reflect the actual datasets:

Variable metadata mismatch. Variables in the datasets have different types or lengths than those documented in define.xml, because the datasets were updated after define.xml was generated and the define.xml was not regenerated to match. As described in our article on Pinnacle 21 errors that trigger FDA technical rejections, define.xml variable metadata mismatches are the most common source of FDA TRC notices.
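This mismatch class is detectable programmatically by comparing the documented metadata against the delivered datasets. A simplified sketch using a namespace-free stand-in for real define.xml (actual define.xml uses the CDISC ODM namespace and a richer element structure):

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment standing in for real define.xml metadata
define_xml = """
<ItemGroupDef Name="ADSL">
  <ItemDef Name="USUBJID" DataType="text" Length="20"/>
  <ItemDef Name="AGE" DataType="integer" Length="8"/>
</ItemGroupDef>
"""

def metadata_mismatches(define_src: str,
                        actual: dict[str, tuple[str, int]]) -> list[str]:
    """Compare documented (type, length) against the dataset's actual metadata."""
    issues = []
    for item in ET.fromstring(define_src).iter("ItemDef"):
        name = item.get("Name")
        documented = (item.get("DataType"), int(item.get("Length")))
        if name in actual and actual[name] != documented:
            issues.append(f"{name}: define.xml {documented}, dataset {actual[name]}")
    return issues

# Metadata extracted from the delivered dataset (hypothetical values)
actual = {"USUBJID": ("text", 30), "AGE": ("integer", 8)}
issues = metadata_mismatches(define_xml, actual)   # flags USUBJID length drift
```

Pinnacle 21 performs this comparison comprehensively; a sponsor-side spot check like this is useful for confirming the deliverable before the formal conformance run.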

Missing SUPPQUAL documentation. Supplemental qualifier datasets that were added during the study (to capture protocol-specific non-standard variables) are not always reflected in define.xml, particularly if the supplemental variables were added after the initial define.xml was created.

Codelist references without codelist definitions. Define.xml may reference codelists that are not defined within the document itself, or reference the wrong CDISC CT version for the submission context. Sponsor statisticians reviewing the define.xml for submission readiness discover these issues after the handoff, when the CRO team has moved to other studies and has limited bandwidth for corrections.
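Dangling codelist references are a pure set comparison: every OID referenced by a CodeListRef must appear as a defined CodeList. A sketch over the same simplified, namespace-free stand-in for define.xml:

```python
import xml.etree.ElementTree as ET

# Namespace-free fragment: two variables reference codelists, one is defined
define_xml = """
<MetaDataVersion>
  <ItemDef Name="SEX"><CodeListRef CodeListOID="CL.SEX"/></ItemDef>
  <ItemDef Name="RACE"><CodeListRef CodeListOID="CL.RACE"/></ItemDef>
  <CodeList OID="CL.SEX"/>
</MetaDataVersion>
"""

def dangling_codelists(src: str) -> set[str]:
    """Codelist OIDs referenced in the document but never defined in it."""
    root = ET.fromstring(src)
    referenced = {r.get("CodeListOID") for r in root.iter("CodeListRef")}
    defined = {c.get("OID") for c in root.iter("CodeList")}
    return referenced - defined

missing = dangling_codelists(define_xml)   # CL.RACE is referenced, never defined
```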

Prevention: Require the CRO to deliver a define.xml generated programmatically from the same metadata source as the final datasets, accompanied by a Pinnacle 21 conformance report showing zero define.xml-related errors. A clean metadata-conformance check is the acceptance criterion: do not accept a define.xml without one, because the manual effort to correct a deficient define.xml after handoff is substantially higher than the CRO's effort to produce a correct one before handoff.

Failure Mode 3: Dataset Snapshots Taken at Different Points in Time

A complete Phase III study data package contains dozens of SDTM domains and ADaM datasets that need to represent a consistent view of the study data at the time of database lock. When these datasets are generated at different times — or when some datasets are updated after lock for a specific purpose while others are not — the package becomes internally inconsistent.

The manifestations of this problem:

Subject count inconsistencies across datasets. If ADSL was regenerated after a late protocol deviation adjudication but ADAE and ADLB were not regenerated simultaneously, the safety analysis population in ADSL may not match the safety datasets. This produces cross-dataset inconsistency errors in Pinnacle 21 and information requests during FDA review.
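A first-pass version of this check needs nothing more than set arithmetic on subject identifiers (the identifiers and flag values below are illustrative):

```python
# ADSL: subject -> safety population flag, as delivered
adsl = {"S001": "Y", "S002": "Y", "S003": "N"}

# Subjects carrying records in the delivered ADAE
adae_subjects = {"S001", "S003"}

safety_pop = {sid for sid, flag in adsl.items() if flag == "Y"}
not_in_safety = adae_subjects - safety_pop   # S003 has AEs but SAFFL='N'
```

Any subject in the result indicates ADSL and the safety datasets were generated from different views of the data, exactly the pattern a late ADSL regeneration produces.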

Derived variables with different values in different datasets. TRTP (planned treatment), TRTPN (planned treatment numeric code), and similar variables appear in multiple ADaM datasets. If these were derived by different programmers at different times using slightly different logic or using different data cuts, the values for the same subject in the same period may differ between ADEFF and ADAE.
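Cross-dataset consistency for a shared variable reduces to collecting each subject's values across datasets and flagging any subject with more than one distinct value. A sketch for TRTP (subject keys and treatment labels are illustrative; a real check would also key on period):

```python
from collections import defaultdict

def shared_variable_conflicts(datasets: dict[str, dict[str, str]]) -> dict[str, set[str]]:
    """Subjects whose TRTP value differs between any two delivered datasets."""
    values_by_subject = defaultdict(set)
    for trtp_by_subject in datasets.values():
        for subject, trtp in trtp_by_subject.items():
            values_by_subject[subject].add(trtp)
    return {s: v for s, v in values_by_subject.items() if len(v) > 1}

datasets = {
    "ADEFF": {"S001": "Drug A", "S002": "Placebo"},
    "ADAE":  {"S001": "Drug A", "S002": "PLACEBO"},  # case drift from a later run
}
conflicts = shared_variable_conflicts(datasets)      # S002 disagrees across datasets
```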

SDTM datasets from a different data cut than ADaM datasets. When SDTM datasets were generated from data cut A and ADaM datasets were generated from data cut B (following a data correction that affected some records), the ADaM derivations reference SDTM records that do not exist in the delivered SDTM package. Regulatory reviewers working from the SDTM package cannot reproduce the ADaM derivations.

Prevention: The handoff agreement should specify a single "final data package generation date" — a date by which all SDTM and ADaM datasets are regenerated from the same locked source data, as a single coordinated run rather than accumulated deliveries. The CRO should deliver a package generation manifest: a dated record of when each dataset was generated, from which source data version, and using which program version. Any post-lock dataset updates require documented approval and revalidation of the affected datasets.
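The manifest itself can be as simple as one record per dataset, and the acceptance check is then trivial to automate. A sketch with assumed field names and illustrative values (no standard prescribes this exact shape):

```python
# One manifest entry per dataset; field names are assumptions, not a standard
manifest = [
    {"dataset": "ADSL", "generated": "2024-03-15T14:02Z",
     "source_snapshot": "LOCK-2024-03-14", "program": "adsl.sas v1.4"},
    {"dataset": "ADAE", "generated": "2024-03-15T14:07Z",
     "source_snapshot": "LOCK-2024-03-14", "program": "adae.sas v2.1"},
]

# Acceptance check: every dataset must trace to the same locked snapshot
snapshots = {entry["source_snapshot"] for entry in manifest}
consistent = len(snapshots) == 1
```

A manifest in this spirit also answers the revalidation question directly: any post-lock update shows up as a second snapshot identifier.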

The Handoff Acceptance Checklist

Based on these three failure modes, a practical handoff acceptance checklist for sponsors receiving data from a CRO:

  1. All programs delivered with dependencies; execution guide tested on a clean environment
  2. Define.xml passes Pinnacle 21 conformance check with zero define.xml-related errors
  3. Package generation manifest confirming all datasets generated from the same locked source data
  4. Cross-dataset subject count verification: ADSL population flags match subject counts in each safety and efficacy dataset
  5. Derived variable consistency check: TRTP, TRTPN, and other shared variables have identical values for the same subjects across datasets
  6. Macro library delivered with function documentation
  7. Audit trail for database lock and unlock events included
  8. SDRG and ADRG drafts included (or commitment to deliver within 30 days)

Running these checks before formally accepting the handoff package takes 2–3 days of CDM analyst effort. The same issues discovered after accepting the handoff typically take 3–6 weeks to resolve because the CRO team has transitioned off the study and corrections require scheduling time rather than being addressed immediately.

MLPipeKit's integration layer includes a handoff intake process that runs automated consistency checks against received datasets and flags the categories of issues described above. For sponsors who regularly receive data from CROs, systematizing the intake check eliminates the discovery phase of the post-handoff correction cycle.

Managing a CRO data handoff or preparing your team's data for submission? Talk to the MLPipeKit team about streamlining the transition process.
