Research Question 9

9. How can AI governance disclosure frameworks satisfy evolving FDA expectations for transparency in algorithmic decision-making?

Answer in brief

FDA's January 2025 draft guidance on AI use in drug development makes explicit what was previously implicit: sponsors are expected to document not just which AI tools were used, but how their outputs were validated, which human experts reviewed them, where the AI over‑interpreted or failed, and who ultimately took responsibility for regulatory decisions. RGDS addresses this by embedding a structured aiassistance object in every decision log. The object captures tool identity and version, task purpose, confidence metrics (e.g., an 87% F1‑score), the human review process across multiple tiers, the specific sections where humans overrode AI output (with rationale), and final sign‑off by qualified experts. This schema maps onto FDA's 7‑step credibility framework for AI models (define question, determine context of use, assess risk, develop plan, execute plan, document results, determine adequacy) and lets sponsors document alignment with the guidance at the moment of decision rather than retrospectively. In practice, when FDA inspectors ask during a pre‑approval inspection, "Show me your quality control for this AI‑generated content," organizations using RGDS can retrieve a complete governance record in minutes: which tool, what confidence, which human reviewers found issues, what they corrected, and why the final version is trustworthy. This governance does not eliminate validation obligations or make poor AI use acceptable; it only makes AI involvement transparent and bounded by human oversight. That alone, however, positions sponsors ahead of the compliance curve as FDA expectations harden.

The FDA's January 2025 AI Guidance: Context and Implications

The Regulatory Moment: On January 7, 2025, the FDA issued draft guidance on "Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products"—the first comprehensive FDA guidance specifically addressing AI use in drug and biologics development[46] [47] [48]. This guidance represents a pivotal regulatory moment: AI is no longer a discretionary innovation tool but a regulated decision-making capability subject to explicit FDA oversight[47] [49] [50].

The 7-Step Credibility Assessment Framework:

The FDA's guidance proposes a risk-based, 7-step framework for establishing AI model credibility. Importantly, this framework is directly compatible with RGDS decision governance because both share a core principle: transparency through documented decision-making[51] [52] [47].

The 7 steps:

1. Define the Question of Interest: Specify the regulatory question the AI model will address (e.g., "Will manufacturing process X meet specification for assay range?") [47] [51]

2. Determine Context of Use (COU): Specify the role, scope, and boundaries of the AI application (e.g., "AI predicts yield; human specialist validates prediction; AI cannot independently approve batch") [47] [52]

3. Assess Model Risk: Evaluate model influence (degree of autonomy) and decision consequence (severity if the model makes an error) [51] [47]

4. Develop Credibility Plan: Plan validation activities, specify success criteria, outline the monitoring approach [47] [52]

5. Execute the Plan: Conduct validation studies, gather evidence, document deviations [47]

6. Document Results: Prepare a credibility assessment report showing model description, training data sources, performance metrics, bias analysis, and limitations [51] [52]

7. Determine Adequacy: Assess whether the AI model's performance is sufficient for the intended COU [47]
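
Step 3 combines two factors, model influence and decision consequence. A minimal Python sketch of one way to combine them follows. The three-level scale, the additive scoring rule, and the names `Level` and `assess_model_risk` are illustrative assumptions, not something the guidance prescribes.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-point scale for both risk factors."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def assess_model_risk(influence: Level, consequence: Level) -> str:
    """Combine model influence and decision consequence into a risk label.

    Assumption: the two levels are summed (range 2..6) and each sum is
    mapped to a label. The guidance names the two factors but does not
    define this scoring rule.
    """
    labels = {2: "Low", 3: "Low-Medium", 4: "Medium", 5: "Medium-High", 6: "High"}
    return labels[influence + consequence]


# Example: AI drafts content (medium influence) that feeds a
# high-consequence decision.
print(assess_model_risk(Level.MEDIUM, Level.HIGH))  # Medium-High
```

A lookup like this only makes the reasoning explicit and repeatable; the levels assigned to each factor still come from expert judgment.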

Key FDA Expectations (from January 2025 guidance) [47] [52] [49] [50]:

Transparency: Sponsors must clearly articulate how AI models generate outputs; "black-box" predictions require explainability analysis
Human Oversight: AI outputs inform decisions; human experts retain final authority
Data Quality: Training data must be representative, documented, with bias analysis
Validation: Model performance tested across diverse populations and conditions
Monitoring: Ongoing performance tracking post-deployment to detect degradation
Disclosure: Sponsors must document AI involvement in regulatory submissions

RGDS as AI Governance Disclosure Framework

RGDS directly addresses FDA's AI transparency expectations through the aiassistance object in decision logs. Recall the structure from Question 2:


AI Governance Disclosure Object — Full Structure

{
  "aiassistance": {
    "used": true,
    "tool": "CoAuthor (Certara), v3.2",
    "purpose": "Draft Module 2.6.7 toxicology summary from source GLP reports",
    "disclosure": "Module 2.6.7 section drafted by CoAuthor AI; F1-score 87% vs. human baseline",

    "toolcharacteristics": {
      "modeltype": "Large language model fine-tuned on pharma nonclinical summaries",
      "trainingdata": "1,200 published GLP tox reports + 500 FDA-approved nonclinical summaries",
      "performancebenchmarks": {
        "factualaccuracy": "92% (verified against source reports)",
        "severityinterpretation": "76% (subjective; requires human override)",
        "clinicalrelevanceassertion": "71% (requires human review for scientific validity)"
      },
      "knownlimitations": [
        "Lacks access to histopathology context; may over-interpret transaminase elevations",
        "Cannot perform species-specific reference range comparison without explicit input",
        "May over-weight statistical significance without biological plausibility assessment"
      ]
    },

    "humanreview": [
      {
        "tier": "Author Review",
        "reviewer": "Senior Medical Writer (15 years experience)",
        "findings": "3 sections flagged for human override due to over-interpretation of severity"
      },
      {
        "tier": "Toxicology SME Review",
        "reviewer": "PhD Toxicologist (FDA inspection experience)",
        "findings": "Confirmed 100% factual accuracy; validated human-rewritten severity interpretations"
      }
    ],

    "humanoverride": [
      {
        "section": "Liver toxicity assessment",
        "aioutput": "Elevated liver enzymes indicate clinically significant hepatotoxicity",
        "humanoverride": "Enzymes elevated without histological damage; adaptive response, not adverse effect",
        "rationale": "AI lacked histopathology context showing no hepatocyte necrosis"
      }
    ],

    "riskassessment": {
      "modeltrust": "High for factual assertions; Medium for severity interpretation",
      "confidencelevel": "87% F1-score overall; acceptable for regulatory submission with human validation"
    }
  }
}
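
A structure like this is only useful for inspection if every log actually contains it. The Python sketch below shows one way a completeness check could look. The set of required keys and the function name `validate_aiassistance` are assumptions made for illustration; RGDS may define its own validation rules.

```python
import json

# Assumption: these top-level fields are treated as mandatory whenever AI was used.
REQUIRED_KEYS = {"used", "tool", "purpose", "disclosure", "humanreview", "riskassessment"}


def validate_aiassistance(log: dict) -> list[str]:
    """Return a list of problems found in a decision log's aiassistance object."""
    ai = log.get("aiassistance")
    if not isinstance(ai, dict):
        return ["missing aiassistance object"]
    if ai.get("used") is False:
        return []  # nothing further to disclose when no AI was used
    problems = [f"missing field: {key}" for key in sorted(REQUIRED_KEYS - set(ai))]
    if "humanreview" in ai and not ai["humanreview"]:
        problems.append("no human review tier recorded")
    # Every override must explain why the AI output was rejected.
    for i, entry in enumerate(ai.get("humanoverride", [])):
        if not entry.get("rationale"):
            problems.append(f"humanoverride[{i}] lacks a rationale")
    return problems


# A deliberately incomplete log, used to show what the check reports.
sample = json.loads("""
{"aiassistance": {"used": true, "tool": "CoAuthor (Certara), v3.2",
  "purpose": "Draft Module 2.6.7 toxicology summary from source GLP reports",
  "humanreview": [],
  "humanoverride": [{"section": "Liver toxicity assessment", "rationale": ""}]}}
""")
for problem in validate_aiassistance(sample):
    print(problem)
```

Running a check of this kind when the decision log is saved, rather than at submission time, keeps the record complete at the point of decision.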

How RGDS Satisfies FDA's 7-Step Framework:

FDA Step 1 (Define Question): RGDS Compliance

Decision log captures: "What is the regulatory question the AI is addressing?"
aiassistance.purpose field explicitly states: "Draft Module 2.6.7 toxicology summary"
Decision owner documents: "AI addresses medical writing efficiency; human experts retain scientific judgment"

FDA Step 2 (Determine COU): RGDS Compliance

aiassistance.disclosure, together with aiassistance.humanreview, specifies: "CoAuthor drafts sections; Senior Medical Writer + Toxicology SME review; final section signed by qualified expert"
Human role explicitly documented: AI is not autonomous; humans make final medical/scientific determinations

FDA Step 3 (Assess Risk): RGDS Compliance

aiassistance.riskassessment, read alongside toolcharacteristics.performancebenchmarks, documents: "High trust for factual assertions (92% accuracy); Medium trust for severity interpretation (76% accuracy)"
Risk articulation: "Sections requiring severity judgment flagged for mandatory human override"

FDA Step 4 (Develop Credibility Plan): RGDS Compliance

aiassistance.humanreview documents: "Two independent human experts reviewed all AI-generated content; specific validation approach documented"
Success criteria: "100% factual accuracy validated; all severity interpretations reviewed by toxicology SME"

FDA Step 5 (Execute Plan): RGDS Compliance

aiassistance.humanreview provides execution evidence: "Senior Medical Writer reviewed X assertions; Toxicology SME validated Y findings"
Deviations documented: "AI over-interpreted severity in 3 sections; corrected by human expert"

FDA Step 6 (Document Results): RGDS Compliance

Decision log is the credibility assessment report documenting:
AI tool characteristics
Training data sources
Performance benchmarks
Human review process
Override rationale
Known limitations

FDA Step 7 (Determine Adequacy): RGDS Compliance

aiassistance.riskassessment concludes: "87% F1-score acceptable for regulatory submission when paired with systematic human review"
Final approval: "Medical Director confirms AI-assisted content meets regulatory quality standards"
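
The step-by-step mapping above can also be expressed as data, so that a log can be checked for steps with no supporting evidence. The sketch below is illustrative only: the `STEP_FIELDS` table is a simplified reading of the mapping in this section, and `uncovered_steps` is a hypothetical helper name.

```python
# Simplified mapping from FDA credibility steps to aiassistance fields.
# Assumption: one or two fields per step are enough to show the idea.
STEP_FIELDS = {
    "1 Define question": ["purpose"],
    "2 Determine context of use": ["disclosure"],
    "3 Assess model risk": ["riskassessment"],
    "4 Develop credibility plan": ["humanreview"],
    "5 Execute plan": ["humanreview"],
    "6 Document results": ["toolcharacteristics"],
    "7 Determine adequacy": ["riskassessment"],
}


def uncovered_steps(ai: dict) -> list[str]:
    """List the steps for which at least one mapped field is absent or empty."""
    return [
        step
        for step, fields in STEP_FIELDS.items()
        if any(not ai.get(field) for field in fields)
    ]


# Example record that omits toolcharacteristics.
record = {
    "purpose": "Draft Module 2.6.7 toxicology summary from source GLP reports",
    "disclosure": "Module 2.6.7 section drafted by CoAuthor AI",
    "humanreview": [{"tier": "Author Review"}],
    "riskassessment": {"modeltrust": "High for factual assertions"},
}
print(uncovered_steps(record))  # ['6 Document results']
```

A populated field shows only that evidence was recorded for a step; whether that evidence is adequate remains a human determination.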

RGDS AI Disclosure in FDA Submissions: Module 1 Integration

Future State (anticipated 2026–2027): FDA will likely request AI disclosure documentation in Module 1 (Regional Information) of eCTD submissions[46] [48] [53]. RGDS decision logs provide this documentation at the point of decision, not retrospectively.

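Proposed eCTD Module 1 Integration (illustrative numbering: in the current CTD structure the overviews and summaries shown as 1.3–1.6 sit in Module 2, and FDA's existing Module 1 hierarchy already uses 1.7 and 1.8 for other content, so final section numbers would differ):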

Module 1: Administrative Information
├── 1.2 Summaries
├── 1.3 Quality Overall Summary  
├── 1.4 Nonclinical Overview and Summaries
├── 1.5 Clinical Overview and Summaries
├── 1.6 Clinical Summary
├── [NEW] 1.7 AI/ML Governance Documentation ← RGDS decision logs
│   ├── 1.7.1 AI Systems Used in Development
│   ├── 1.7.2 Credibility Assessment Reports per AI System
│   ├── 1.7.3 Human Review and Override Documentation
│   └── 1.7.4 Ongoing Monitoring Plan for AI Models
└── [NEW] 1.8 Decision Governance Summary ← RGDS portfolio overview
    ├── 1.8.1 Key Phase Gate Decisions (decision logs summary)
    ├── 1.8.2 Evidence Completeness Classifications
    ├── 1.8.3 Risk Posture Articulation
    └── 1.8.4 Contingency Plans

Example Module 1.7.2 (AI Credibility Assessment Report) populated directly from RGDS decision logs:

System: Medical Writing Automation (CoAuthor, Certara v3.2)

Regulatory Question Addressed: Can AI-assisted drafting of Module 2.6.7 nonclinical summary support regulatory submission timeline without compromising scientific accuracy?

Context of Use: AI generates draft sections; qualified Medical Writer reviews and validates; Toxicology SME confirms scientific accuracy; final document signed by Medical Director.

Model Risk Assessment: Model influence = Medium (drafts sections; cannot override human judgment). Decision consequence = High (nonclinical summary critical for dose justification). Overall risk = Medium-High; credibility plan proportional.

Credibility Evidence:

Training: 1,200 published GLP reports + 500 FDA nonclinical summaries

Factual accuracy: 92% (verified against source reports; 100% after human review)

Severity interpretation: 76% (requires human override)

Clinical relevance: 71% (requires subject matter expert validation)

Human Review Documentation:

Senior Medical Writer: Reviewed 100% of AI draft (8 hours). Flagged 3 sections for revision due to severity over-interpretation. All revisions completed.

Toxicology SME: Reviewed 100% of final content (4 hours). Confirmed 100% scientific accuracy.

Medical Director: Final review and approval (2 hours).

Limitations Identified:

AI lacks real-time access to detailed histopathology context; may over-interpret enzyme elevations

Cannot independently assess species-specific reference ranges

May weight statistical significance without biological plausibility evaluation

Ongoing Monitoring: Post-approval, AI-generated content quality monitored via (1) every Module 2 update reviewed by same tier structure, (2) quarterly accuracy benchmarking against new FDA guidance, (3) automated flagging if model performance degrades >5% below training baseline.

Conclusion: AI-assisted content generation appropriate for Module 2.6.7 with documented human review and oversight. Credibility adequate for regulatory submission.
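
The automated flagging described under Ongoing Monitoring could be sketched as below. The sketch assumes the 5% threshold means percentage points below baseline; the report does not say whether the drop is absolute or relative. The function name `degraded_metrics` and the current scores are hypothetical.

```python
def degraded_metrics(
    baseline: dict[str, float],
    current: dict[str, float],
    max_drop: float = 5.0,
) -> dict[str, float]:
    """Return metrics whose score fell more than max_drop points below baseline.

    Assumption: scores are percentages and the drop is measured in
    percentage points. A metric missing from `current` is scored as 0.0,
    so it is flagged rather than silently skipped.
    """
    flagged = {}
    for name, base in baseline.items():
        drop = base - current.get(name, 0.0)
        if drop > max_drop:
            flagged[name] = drop
    return flagged


# Baseline values come from the benchmarks above; current values are invented.
baseline = {"factualaccuracy": 92.0, "severityinterpretation": 76.0}
current = {"factualaccuracy": 90.0, "severityinterpretation": 69.5}
print(degraded_metrics(baseline, current))  # {'severityinterpretation': 6.5}
```

If the intended threshold is relative instead, the comparison would divide the drop by the baseline value before testing it against 5%.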

Open Research Questions on AI Governance Disclosure

How prescriptive should FDA become on explainability requirements for "black-box" AI models? (e.g., requiring SHAP value analysis, saliency maps, or interpretability surrogates for deep learning models)
Should FDA require independent validation of AI models (by third-party audit) vs. allowing sponsor self-validation?
How should FDA approach AI models that improve over time (continuous learning)? Should sponsors be required to re-validate performance quarterly?
What should be the threshold for FDA requesting full AI credibility documentation vs. simplified disclosure? (e.g., <1% model influence → simplified; >50% influence → full assessment)
How will FDA regulate commercial AI models (e.g., ChatGPT, Claude) used in drug development when training data is proprietary and not disclosed?

In sum: what this data says about Question 9

The evidence shows that FDA's AI governance expectations are crystallizing around a core principle: AI assists humans; humans decide. RGDS satisfies this principle by treating AI as a documented instrument inside the decision log, with explicit confidence bounds, human review layers, and override rationale that together create an audit trail satisfying both current draft guidance and anticipated future mandates. Organizations that adopt this framework now gain competitive advantage by demonstrating governance maturity before it becomes a compliance requirement.

Realistic, conservative conclusion: RGDS‑style AI governance (structured aiassistance object + multi‑tier human review + explicit overrides) can plausibly satisfy the expectations in FDA's January 2025 draft guidance and is positioned for the anticipated 2027–2028 Phase 2 guidance requiring AI disclosure in Module 1.7–1.8 of eCTD submissions; organizations using this framework should be less exposed to AI‑related Form 483 observations or deficiency letters, although no documentation framework can guarantee that outcome.
Main mechanisms: The aiassistance object records tool characteristics (name, version, fine‑tuning), task purpose and scope, confidence metrics (F1‑score, accuracy bands), human review findings from each tier (author, SME, QC, functional lead), explicit humanoverride entries showing what AI output was rejected and why, and final trustworthiness assessment tied to validation evidence.
Where RGDS helps vs. does not: It reliably improves AI transparency, FDA inspection readiness, and regulatory compliance for AI‑assisted content and decisions; it does not replace fundamental AI model validation, fix poor model selection for high‑risk tasks, or make general‑purpose LLMs appropriate for safety‑critical regulatory decisions without strong human validation.
FDA alignment: RGDS decision logs are designed to populate the proposed eCTD Module 1.7 (AI/ML Governance Documentation) and Module 1.8 (Decision Governance Summary) sections anticipated in FDA's Phase 2 guidance (2027–2028), meaning sponsors investing in RGDS now would have audit‑ready documentation in place if those requirements materialize.
Pragmatic next move: For a sponsor using or considering AI tools for regulatory work (medical writing, regulatory intelligence, CMC simulation, clinical data reconciliation), the highest‑leverage starting point is to introduce RGDS aiassistance logging for one or two concrete AI use cases, enforce multi‑tier human review aligned with existing QA tiers, document overrides and rationale, and use early FDA interactions (pre‑submission meetings, pre‑approval inspections) to validate that this disclosure level meets expectations; scale to additional AI tools only after initial validation.