The Backup That Hung Forever on OneDrive

The backup was “running.” It had been running for nine hours.

A client’s nightly Kopia job pointed at a user’s documents folder. Normally it finished in twenty minutes. This morning it was still chewing: no error, no progress, just a cursor blinking next to a snapshot that had hashed maybe a few hundred files and then gone silent.

No CPU. No disk I/O to speak of. No network throughput. Just a process sitting there, perfectly calm, doing absolutely nothing while claiming to be busy.

That’s the worst kind of hang. A crash leaves a body. This left a process that would happily wait until the heat death of the universe.

The investigation

First instinct: blame the repository. Maybe the backend was wedged, maybe a lock was stuck. Nope. A fresh kopia snapshot create against a different folder ran clean in seconds.

So it wasn’t Kopia. It was this folder.

I watched the process with Resource Monitor and caught it: every time the scan touched a particular file, the thread parked in a wait state. Not reading. Not erroring. Waiting on the filesystem to hand back bytes that never came.
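The same "busy but idle" signature can be checked from a script instead of by eye. A minimal sketch, assuming two samples of the process's Win32_Process counters taken some seconds apart; Test-ProcessStalled is a hypothetical helper, not part of Kopia or Windows, and the process name and interval in the usage comment are assumptions:

```powershell
# Hypothetical helper: compare two samples of a process's cumulative counters.
# "Stalled" here means neither CPU time nor transferred I/O bytes advanced.
function Test-ProcessStalled {
    param(
        [Parameter(Mandatory)] $Before,
        [Parameter(Mandatory)] $After
    )
    $cpuBefore = $Before.KernelModeTime + $Before.UserModeTime
    $cpuAfter  = $After.KernelModeTime + $After.UserModeTime
    $ioBefore  = $Before.ReadTransferCount + $Before.WriteTransferCount
    $ioAfter   = $After.ReadTransferCount + $After.WriteTransferCount
    return ($cpuAfter -eq $cpuBefore) -and ($ioAfter -eq $ioBefore)
}

# Usage on Windows (process name and 30-second interval are assumptions):
# $before = Get-CimInstance Win32_Process -Filter "Name = 'kopia.exe'" | Select-Object -First 1
# Start-Sleep -Seconds 30
# $after  = Get-CimInstance Win32_Process -Filter "ProcessId = $($before.ProcessId)"
# Test-ProcessStalled -Before $before -After $after
```

Exact equality is deliberately strict. A real check would probably allow a small tolerance, since a runtime's housekeeping threads can burn a sliver of CPU even while the read is parked.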

Then I actually looked at the folder.

┌────────────────────────────────────────────────┐
│  C:\Users\user\Documents  (OneDrive-synced)     │
├──────────────┬─────────────────────────────────┤
│  Q1-report   │  ✓  green check  (on this PC)    │
│  budget.xlsx │  ✓  green check  (on this PC)    │
│  archive\    │  ☁  cloud icon   (online-only)   │ ◄── placeholder
│  photos\     │  ☁  cloud icon   (online-only)   │ ◄── placeholder
└──────────────┴─────────────────────────────────┘
        the cloud-icon files have no bytes on disk

Half the tree had little cloud icons instead of green checks. OneDrive Files On-Demand. Those files looked real — name, size, timestamp — but the actual content lived in the cloud. On disk they were placeholders.

The “aha”

OneDrive’s on-demand placeholders are implemented with the Windows cloud files filter driver, cldflt. When any process opens a placeholder and tries to read it, cldflt intercepts the read and hydrates the file — downloads it from the cloud, transparently, before returning a single byte.

A backup tool is the most naive reader on earth. It opens every file and reads it end to end. So Kopia walks into a folder full of placeholders and triggers a hydration storm: thousands of files getting pulled down on demand, throttled, retried, and in some cases just… stalling. The read call blocks. The thread parks. Forever.
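You can size the storm before walking into it, because listing a folder and reading attributes does not hydrate anything; only reading file content does. A minimal sketch, assuming online-only files show up with the FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS bit (0x400000) set in the attributes PowerShell reports. Both function names are hypothetical, and I have not checked this against every sync client:

```powershell
# Hypothetical helper: true when the attribute bits say that reading the
# file's data would trigger a recall (hydration).
function Test-OnlineOnly {
    param([Parameter(Mandatory)] [int] $Attributes)
    $recallOnDataAccess = 0x400000   # FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
    return ($Attributes -band $recallOnDataAccess) -ne 0
}

# Hypothetical helper: count online-only files under a folder and total the
# bytes a read-everything pass would ask the sync client to download.
function Measure-OnlineOnly {
    param([Parameter(Mandatory)] [string] $Path)
    $count = 0
    [long] $bytes = 0
    foreach ($file in Get-ChildItem -LiteralPath $Path -Recurse -File -Force) {
        if (Test-OnlineOnly -Attributes ([int] $file.Attributes)) {
            $count++
            $bytes += $file.Length
        }
    }
    [pscustomobject]@{ Files = $count; Bytes = $bytes }
}

# Usage (the path is the one from this post):
# Measure-OnlineOnly -Path 'C:\Users\user\Documents'
```

A large Files count on a backup source is the warning sign: every one of those is a download waiting to happen the moment something reads it.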

The backup wasn’t broken. It was politely waiting for the entire cloud to fall back onto the disk, one file at a time, through a sync client that had no idea it was under attack.

Live folder:  Kopia ──► read() ──► cldflt ──► hydrate... ──► cloud (stalls)   x blocks forever
VSS shadow:   Kopia ──► VSS shadow ──► stable view                            ok: no hydration
Reading the live folder triggers cldflt hydration and hangs; reading a VSS shadow skips it entirely.

The fix

You don’t want to read the live folder. You want to read a snapshot of it — a frozen, point-in-time view that the cloud filter doesn’t get to intercept. That’s exactly what a Volume Shadow Copy gives you. The VSS snapshot presents the volume’s state as committed bytes, with no on-demand hydration path.

Kopia has hooks for precisely this: run an action before it walks the source root, and another after. Create the shadow before, tear it down after.

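# vss-before.ps1: create a shadow copy of C: and expose it via a directory symlink (needs an elevated shell)
# Win32_ShadowCopy is used because "vssadmin create shadow" exists only on Windows Server
$id = (Invoke-CimMethod -ClassName Win32_ShadowCopy -MethodName Create -Arguments @{ Volume = 'C:\' }).ShadowID
$vol = (Get-CimInstance -ClassName Win32_ShadowCopy | Where-Object { $_.ID -eq $id }).DeviceObject
if (-not $vol) { exit 1 }   # no shadow, so fail the run instead of reading the live tree
$id | Set-Content C:\scripts\kopia-shadow.id   # remember which shadow is ours (file name is an assumption)
cmd /c mklink /d C:\kopia-shadow "$vol\" | Out-Null
# Kopia picks up the redirect from the action's stdout; an $env: assignment would not leave this process
Write-Output "KOPIA_SNAPSHOT_PATH=C:\kopia-shadow$(Split-Path -NoQualifier $env:KOPIA_SOURCE_PATH)"

# vss-after.ps1: clean up the link and drop the shadow this run created (not just the oldest one)
cmd /c rmdir C:\kopia-shadow
Get-CimInstance -ClassName Win32_ShadowCopy | Where-Object { $_.ID -eq (Get-Content C:\scripts\kopia-shadow.id) } | Remove-CimInstance

Then leave the source pointed at the live path and attach the hooks to its policy. The KOPIA_SNAPSHOT_PATH line printed by the before-action is what redirects the read to the shadow, and the action flags belong to kopia policy set, not kopia snapshot create. Actions are opt-in, so enable them at connect time with --enable-actions or per run:

kopia policy set C:\Users\user\Documents `
  --before-snapshot-root-action "powershell -File C:\scripts\vss-before.ps1" `
  --after-snapshot-root-action "powershell -File C:\scripts\vss-after.ps1"
kopia snapshot create C:\Users\user\Documents --force-enable-actions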
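The path arithmetic behind the redirect is simple: keep everything after the drive letter and re-root it under the directory where the shadow is linked. A small sketch of that mapping, with Get-ShadowPath as a hypothetical helper name and C:\kopia-shadow as the link location used in this post:

```powershell
# Hypothetical helper: map a live path on a drive to the same path under a
# directory where that drive's shadow copy is linked.
function Get-ShadowPath {
    param(
        [Parameter(Mandatory)] [string] $SourcePath,
        [Parameter(Mandatory)] [string] $MountPoint
    )
    if ($SourcePath -notmatch '^[A-Za-z]:\\') {
        throw "Expected a drive-qualified path, got '$SourcePath'"
    }
    # Drop the two-character drive qualifier ("C:") and keep the rest,
    # leading backslash included.
    return $MountPoint.TrimEnd('\') + $SourcePath.Substring(2)
}

Get-ShadowPath -SourcePath 'C:\Users\user\Documents' -MountPoint 'C:\kopia-shadow'
# C:\kopia-shadow\Users\user\Documents
```

The sketch assumes the source and the shadow are on the same volume. A source that spans drives would need one shadow and one link per volume.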

The next run finished in eighteen minutes. Every placeholder read off the shadow returned instantly, because the snapshot view doesn’t route through cldflt. No hydration, no stalls, no nine-hour ghost.

Why it happened

OneDrive Files On-Demand exists so users don’t sync 400GB of vacation photos onto a 256GB laptop. Great feature for humans. Terrible surprise for any tool that assumes “the file is on this disk.”

Kopia did nothing wrong. It read files. The trap is that on a cloud-synced volume, “open and read” is a network operation in disguise, gated by a filter driver that will block your thread while it phones home. Backups that walk thousands of placeholders turn into a self-inflicted denial of service.

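Snapshotting through VSS sidesteps the whole mess, and as a bonus you get a consistent point-in-time copy instead of backing up a folder that’s mutating under you. One limit to keep in mind: a shadow can only hold what was on disk when it was taken, so the contents of files that were online-only at that moment are not in it.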

Takeaways

  • Cloud on-demand files are placeholders, not files. OneDrive implements them with the Windows cloud files filter driver (cldflt), which hydrates on read; other sync clients with online-only files, such as Dropbox, behave similarly.
  • A naive reader hangs on hydration. Backups, indexers, and AV scanners that open every file will trigger downloads — and stall when the cloud throttles or times out.
  • Back up from a VSS shadow, not the live tree. The snapshot view bypasses the cloud filter and returns committed bytes instantly.
  • Use Kopia’s before/after root-action hooks to create and tear down the shadow around each run — no manual babysitting.
  • A backup at “0% for hours” with no I/O isn’t slow, it’s blocked. Watch the thread state; a parked read on a cloud volume is the tell.