The Kaiser Permanente settlement was announced January 14, 2026: $556 million to resolve allegations that Kaiser systematically upcoded Medicare Advantage risk-adjustment diagnoses to inflate per-member capitation payments. Combined with Amedisys ($150M, 2024) and several smaller matters, the pattern is now explicit. The DOJ-HHS 2026 Working Group's published priority list names AI-assisted clinical documentation and coding tools as a fraud enforcement target.
What the OIG actually said
The Working Group's framing is worth quoting in spirit: AI tools that "nudge clinicians toward higher-margin codes" without strong evidentiary anchoring are characterized as "digital kickbacks" or upcoding facilitators. The standard the OIG is articulating has three parts: an AI tool can suggest code changes, but those suggestions must be tied to documented evidence in the chart; the clinician must explicitly review and accept each one; and the tool must not be marketed to clinicians or to leadership as a "revenue optimization" mechanism.
Why this matters for home health specifically
Home health agencies (HHAs) sit at the intersection of two enforcement priorities: PDGM (Patient-Driven Groupings Model) case-mix coding (your HIPPS / HHRG pricing depends on accurate diagnosis coding) and OASIS-driven documentation. AI tools that help clinicians "find the right ICD-10" or "improve case-mix" are squarely in OIG scope.
The line the OIG is drawing isn't "no AI." It's: AI that improves coding accuracy (catching errors, surfacing specificity gaps that the documentation supports, flagging mismatches between assessed condition and submitted code) is on the right side. AI that improves coding economics (suggesting more lucrative codes without anchoring to the chart) is on the wrong side.
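To make that distinction concrete, here is a minimal sketch of an evidence-anchored suggestion gate. All names, codes, and chart text are hypothetical illustrations, not Carelytic's implementation: the point is simply that a suggestion without a verbatim chart excerpt never reaches the clinician, regardless of whether it raises or lowers reimbursement.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CodeSuggestion:
    """One AI coding suggestion. Field names are illustrative."""
    current_code: str                 # ICD-10 code as documented today
    suggested_code: str               # code the model proposes
    evidence_excerpt: Optional[str]   # verbatim chart text supporting the change
    evidence_location: Optional[str]  # where in the chart it was found

def is_reviewable(s: CodeSuggestion) -> bool:
    """Gate: a suggestion is shown to the clinician only if it cites
    chart evidence. No excerpt -> dropped, never shown, never applied."""
    return bool(s.evidence_excerpt and s.evidence_excerpt.strip()
                and s.evidence_location)

anchored = CodeSuggestion(
    "J18.9", "J15.9",
    "sputum culture positive for gram-negative organisms",
    "nursing note, visit 3",
)
unanchored = CodeSuggestion("J18.9", "J15.9", None, None)

print(is_reviewable(anchored))    # goes to clinician review
print(is_reviewable(unanchored))  # filtered out before review
```

The gate is deliberately symmetric: it tests for evidence, not for direction, so a downcoding suggestion passes or fails on exactly the same criterion as an upcoding one.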
What an agency should ask its AI vendor
- Does the AI cite documentation evidence for every suggestion? If a tool says "consider J18.9 instead of J18.0" without showing the chart text it found, that's a problem.
- Does the clinician explicitly review and accept each suggestion? Auto-applied coding changes are a red flag.
- Are AI runs audit-logged? Every prompt category, every input fingerprint, every accepted/rejected finding should be logged and queryable.
- How is the tool marketed internally? Sales decks that promise "lift your case-mix by X%" or "find revenue opportunities in your charts" are exactly what the DOJ Working Group called out.
- Are bidirectional flags surfaced? A tool that only ever suggests upcoding (never downcoding) is optimizing for revenue, not accuracy.
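The audit-logging and explicit-review questions above can be combined into one sketch. This is a hypothetical design, assuming a simple append-only in-memory log; a real system would persist entries to a queryable store. Note the deliberate absence of an "auto-applied" state: every logged finding carries a clinician decision.

```python
import hashlib
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []   # stand-in for a queryable, append-only store

def fingerprint(chart_text: str) -> str:
    """Stable hash of the exact input the model saw (no PHI in the log)."""
    return hashlib.sha256(chart_text.encode("utf-8")).hexdigest()[:16]

def record_review(prompt_category: str, chart_text: str,
                  suggestion: str, clinician_decision: str) -> dict:
    """Log one reviewed finding. Only 'accepted' or 'rejected' is valid:
    there is intentionally no state for an auto-applied change."""
    assert clinician_decision in ("accepted", "rejected")
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "prompt_category": prompt_category,
        "input_fingerprint": fingerprint(chart_text),
        "suggestion": suggestion,
        "decision": clinician_decision,
    }
    AUDIT_LOG.append(entry)
    return entry

record_review("specificity_gap", "chart text sample A",
              "J18.9 -> J15.9", "accepted")
record_review("code_mismatch", "chart text sample B",
              "I10 -> I11.9", "rejected")

# Queryable: e.g., what fraction of suggestions did clinicians reject?
rejected = sum(1 for e in AUDIT_LOG if e["decision"] == "rejected")
print(f"{rejected}/{len(AUDIT_LOG)} rejected")
```

A rejection rate that hovers near zero is itself a signal worth auditing: it can mean either a very accurate tool or a review step clinicians are clicking through.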
What we built for this
Carelytic's PDGM Coding Review was designed against this OIG framing on day one. Every finding cites the chart text the AI found. The clinician sees what the AI saw and explicitly accepts or rejects each suggestion before any code change is written. The tool surfaces downcoding gaps as well as upcoding risks — it's checking accuracy, not optimizing for revenue. Per-tenant prompt customization lets an agency align suggestions to its own coding manual, but we never tune prompts to "lift case mix" — that's the line.
"AI that recommends changes purely to maximize reimbursement is a fraud risk. AI that catches coding errors before they trigger an RTP is a compliance asset. The difference is documentation evidence."
The bigger picture
The OIG isn't anti-AI. The OIG is anti-AI-without-controls. Vendors that built fast and treated compliance as a downstream concern are now in the enforcement crosshairs. Vendors that built compliance-first will be the ones HHAs can adopt with confidence in CY2026.
If you're evaluating an AI coding tool — whether ours or anyone else's — the questions above are the ones to ask. "Show me how the tool ties suggestions to documentation evidence" is the one-question test that separates legitimate coding-accuracy tools from the ones the DOJ Working Group is worried about.
This post is editorial commentary on publicly reported industry news, not legal or compliance advice. For your agency's specific situation, consult counsel and your CMS regional office.