SAS remains the de facto standard for NDA submission statistical programs, and that is unlikely to change in the short term. But Python-based ADaM derivations are common in early-phase programs, in smaller biotechs that have not standardized on SAS, and in exploratory analysis contexts where Python's data science ecosystem offers advantages. The challenge with Python-based ADaM derivations is not the language — it is that the patterns most commonly used do not handle protocol amendments gracefully, leading to significant rework when amendments require derivation changes.

The Protocol Amendment Problem for ADaM Derivations

A typical Phase II program with an adaptive design or multiple dose cohorts will have two to four protocol amendments over its life. Each amendment can affect ADaM derivations in several ways:

  • Population flag definitions change (a subject who was evaluable under the original SAP becomes non-evaluable under an amended efficacy evaluability criterion)
  • Primary endpoint derivation logic changes (time-to-event calculation reference dates shift, response criteria are redefined)
  • Visit window definitions change (affecting AVISIT derivation and baseline flagging)
  • Concomitant medication or prior therapy coding criteria change

In SAS programs, amendment-related derivation changes typically require updating individual program files. The patterns for managing this are well-established in most CRO and sponsor programming groups. In Python, the equivalent patterns are less standardized, and the result is often derivation code that handles the original protocol well but requires significant surgery to adapt to amendments.

Pattern 1: Protocol Version as an Explicit Input, Not an Assumption

The most common Python ADaM anti-pattern is derivation code that implicitly assumes a single protocol version. Population flag logic is written as: "A subject is in the safety-evaluable population if they received at least one dose of study drug." If an amendment changes this criterion — "at least one dose AND at least one post-baseline safety assessment" — the programmer finds the population flag function and modifies it.

The problem with this pattern is that for studies with pre-amendment and post-amendment subjects, or for integrated submissions that combine data from multiple protocol versions, the single-criterion function is incorrect. You need the old logic applied to subjects enrolled under Protocol Version 1 and the new logic applied to subjects enrolled under Protocol Version 2.

The correction is to make protocol version an explicit parameter of every derivation function that could be affected by an amendment. Population flag functions receive a protocol_version argument. The function body dispatches to the appropriate logic based on the version. Amendment impact is then isolated to adding a new branch in the dispatch logic rather than replacing existing logic.

This pattern requires slightly more upfront structure but makes amendment impact assessment straightforward: identify which functions dispatch on protocol version, add the new branch, run the comparison test suite. The impact is bounded and visible.
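A minimal sketch of the dispatch pattern, using a plain dictionary per subject; the keys n_doses and n_postbase_safety are illustrative pre-computed counts, not ADaM standard variables:

```python
def derive_saffl(subject: dict, protocol_version: str) -> str:
    """Safety-evaluable population flag (SAFFL), dispatched on protocol version."""
    if protocol_version == "protocol_v1":
        # Original criterion: at least one dose of study drug
        in_pop = subject["n_doses"] >= 1
    elif protocol_version == "protocol_v2":
        # Amended criterion: at least one dose AND at least one
        # post-baseline safety assessment
        in_pop = subject["n_doses"] >= 1 and subject["n_postbase_safety"] >= 1
    else:
        raise ValueError(f"unknown protocol version: {protocol_version}")
    return "Y" if in_pop else "N"
```

Supporting an amendment means adding one new branch; the protocol_v1 logic is never touched, so pre-amendment subjects continue to be flagged exactly as before.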

Pattern 2: Visit Window Definitions in Configuration, Not Code

ADaM analysis visit (AVISIT) derivation requires mapping relative study days to named analysis visits. Phase II protocols frequently have tight visit windows in the first few post-dose visits and wider windows for longer follow-up visits. These windows are often hardcoded as numeric constants in the derivation code.

When an amendment changes visit windows — a common change in adaptive designs responding to early-phase PK data — visit window constants must be located and updated throughout the derivation codebase. In a Phase II program with ADEFF, ADLB, ADPK, ADAE, and ADCM datasets, visit window logic may appear in three to seven separate places.

The solution is to define visit windows as a configuration structure external to the code:

VISIT_WINDOWS = {
    "v1": {
        "protocol_v1": {"window_start": -1, "window_end": 1, "label": "Day 1"},
        "protocol_v2": {"window_start": -2, "window_end": 1, "label": "Day 1"},
    },
    "v2": {
        "protocol_v1": {"window_start": 12, "window_end": 16, "label": "Week 2"},
        "protocol_v2": {"window_start": 11, "window_end": 17, "label": "Week 2"},
    },
}

Derivation code reads from this configuration. Amendment changes require updating one configuration structure, not hunting for hardcoded values across multiple dataset programs.
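A sketch of derivation code reading from that structure (the configuration is repeated here so the example is self-contained; the function name is illustrative):

```python
# Visit window configuration, keyed by analysis visit and protocol version;
# values are relative study-day windows and the AVISIT label.
VISIT_WINDOWS = {
    "v1": {
        "protocol_v1": {"window_start": -1, "window_end": 1, "label": "Day 1"},
        "protocol_v2": {"window_start": -2, "window_end": 1, "label": "Day 1"},
    },
    "v2": {
        "protocol_v1": {"window_start": 12, "window_end": 16, "label": "Week 2"},
        "protocol_v2": {"window_start": 11, "window_end": 17, "label": "Week 2"},
    },
}

def derive_avisit(study_day: int, protocol_version: str):
    """Map a relative study day to an AVISIT label under one protocol version."""
    for visit, versions in VISIT_WINDOWS.items():
        window = versions[protocol_version]
        if window["window_start"] <= study_day <= window["window_end"]:
            return window["label"]
    return None  # outside all defined analysis visit windows
```

Every dataset program that needs AVISIT calls this one function, so an amendment that widens a window touches only the configuration entry.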

Pattern 3: Baseline Flag as a Derived, Testable Object

Baseline flagging — determining which record for each subject and parameter defines the baseline value (ABLFL="Y") — is one of the more complex ADaM derivations and one of the most commonly amended. Baseline definitions change when the protocol amends washout periods, screening visit definitions, or eligible baseline assessment windows.

A pattern that survives amendment well: implement baseline flagging as a separate, fully testable class or function that takes a subject's sorted record list as input and returns the flagged record according to a baseline rule object. The baseline rule object is configurable and version-controlled. Tests operate on the rule object independently of the main ADaM generation pipeline.

This means that when an amendment changes the baseline definition, you:

  1. Create a new baseline rule object representing the amended criterion
  2. Write tests for the new rule against representative subject data patterns
  3. Deploy the new rule for subjects enrolled after the amendment effective date

The old rule remains in the codebase for subjects enrolled before the amendment. Regression to the pre-amendment behavior is testable at any point.
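One way the rule-object approach might look, as a sketch: the BaselineRule class, the cutoff-day rule family, and the record field names (ady, aval) are illustrative assumptions, not a prescribed ADaM implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BaselineRule:
    """A versioned baseline definition: last non-missing value on or
    before a cutoff study day."""
    rule_id: str
    cutoff_day: int  # latest relative study day eligible as baseline

    def flag_baseline(self, records: list) -> list:
        """Return records with ABLFL set; input must be sorted by study day."""
        eligible = [r for r in records
                    if r["ady"] <= self.cutoff_day and r["aval"] is not None]
        baseline = eligible[-1] if eligible else None
        return [{**r, "ABLFL": "Y" if r is baseline else ""} for r in records]

# Pre- and post-amendment rules coexist; enrollment date selects which applies.
RULE_V1 = BaselineRule("BASE_V1", cutoff_day=1)   # original: on or before Day 1
RULE_V2 = BaselineRule("BASE_V2", cutoff_day=-1)  # amended: strictly pre-dose
```

Because each rule is a small frozen object, both versions can be tested side by side against the same subject record patterns, which is exactly what the regression check in step 2 requires.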

Pattern 4: Imputation Logic Registry

Partial date imputation — determining values for date fields where only partial date information was recorded — is a required step for many SDTM and ADaM derivations, and imputation rules are frequently amended as sponsors gain experience with data collection patterns at their sites.

Rather than implementing imputation logic inline within dataset-specific programs, maintain an imputation registry: a central registry of named imputation rules, versioned, with the derivation logic isolated in a testable module. Dataset programs call the registry by rule name and version. When an amendment changes an imputation rule, the registry adds a new version of the rule. Existing programs that called the old version continue to work correctly; the new rule applies where specified.
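A minimal registry sketch. The rule name and the specific imputation conventions shown (which month or day to substitute) are hypothetical placeholders, not rules from any particular SAP:

```python
from datetime import date
from typing import Callable

# Central registry: (rule_name, version) -> imputation function.
IMPUTATION_REGISTRY: dict = {}

def register(name: str, version: int) -> Callable:
    """Decorator that registers an imputation rule under a name and version."""
    def wrap(fn: Callable) -> Callable:
        IMPUTATION_REGISTRY[(name, version)] = fn
        return fn
    return wrap

@register("partial_start_date", 1)
def impute_start_v1(year, month, day) -> date:
    # Original rule (illustrative): missing month -> January, missing day -> 1st
    return date(year, month or 1, day or 1)

@register("partial_start_date", 2)
def impute_start_v2(year, month, day) -> date:
    # Amended rule (illustrative): missing month -> July, missing day -> 15th
    return date(year, month or 7, day or 15)

def impute(name: str, version: int, *args) -> date:
    """Dataset programs call rules by name and pinned version."""
    return IMPUTATION_REGISTRY[(name, version)](*args)
```

A dataset program pins the rule version it was validated against, e.g. impute("partial_start_date", 1, 2023, None, 12), so adding version 2 for post-amendment data cannot silently change existing outputs.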

This is the pattern that makes the largest difference in amendment-related rework time in our experience. Imputation logic is typically scattered across multiple dataset programs, underdocumented, and intermittently duplicated. A central registry with version control and test coverage reduces amendment impact time from 3–5 days of manual code archaeology to 4–8 hours of targeted updates.

Pattern 5: Cross-Dataset Consistency Checks in the Pipeline

Protocol amendments create consistency risks across ADaM datasets: if ADSL population flags are updated for the amendment but ADEFF is not regenerated with the updated flags, the analysis datasets are internally inconsistent. This category of error is described in more detail in our article on Pinnacle 21 conformance errors, specifically the ADSL population flag inconsistency category.

A Python ADaM pipeline should include cross-dataset consistency checks as a mandatory post-generation step. These checks verify that subject counts in ADSL population flags match subject counts in downstream analysis datasets, that USUBJID populations are consistent across datasets, and that derived variables that appear in multiple datasets (TRTP, TRTA, TRTPN, etc.) have identical values for the same subjects.

Implementing these checks once in a shared utilities module means they run automatically every time the pipeline generates a new dataset version — catching post-amendment inconsistencies before they reach QC review.
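One such check might be sketched as follows, assuming pandas DataFrames and standard ADaM variable names (USUBJID, SAFFL); the function name and finding format are illustrative:

```python
import pandas as pd

def check_population_consistency(adsl: pd.DataFrame, downstream: pd.DataFrame,
                                 flag: str, dataset_name: str) -> list:
    """Verify that every subject present in a downstream analysis dataset
    carries the corresponding population flag in ADSL; return findings."""
    findings = []
    flagged = set(adsl.loc[adsl[flag] == "Y", "USUBJID"])
    present = set(downstream["USUBJID"])
    extra = present - flagged
    if extra:
        findings.append(
            f"{dataset_name}: {len(extra)} subject(s) present but not "
            f"flagged {flag}='Y' in ADSL"
        )
    return findings
```

Run once per downstream dataset as the mandatory post-generation step; a non-empty findings list fails the pipeline before the datasets reach QC review.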

On Python Versus SAS for NDA Submissions

For studies targeting NDA submission, the practical answer remains that final submission datasets and statistical programs should be in SAS. FDA review divisions have deep familiarity with SAS output and annotation formats. Python-generated SAS transport files (XPT format) are technically acceptable if they meet the FDA's eSub technical standards, but submission teams regularly report slower reviewer turnaround and more information requests for Python-submitted programs compared to equivalent SAS programs.

The appropriate role for Python in the ADaM context is exploratory derivation development, QC analysis, and early-phase programs that will not result in direct NDA submissions. Organizations that standardize Python for early-phase work and SAS for late-phase and submission work benefit from the productivity of Python in early development while maintaining the regulatory track record of SAS for submissions.

MLPipeKit supports both SAS and Python statistical environments for ADaM dataset handoff. See how it works or speak with our team.
