# Storage layout

Object storage holds every byte of user content Scryon ever touches. The layout is provider-agnostic — keys look the same whether the backing store is AWS S3, Cloudflare R2, MinIO, or a local filesystem.

## Key layout

```
users/{userId}/calls/{callId}/
├── temp/                                  # TEMP_AUDIO (ephemeral)
│   └── audio-{originalName}
├── diarization/
│   └── diarization.json                   # DIARIZATION_JSON
├── transcripts/
│   ├── raw.json                            # RAW_TRANSCRIPT_JSON
│   └── normalized.json                     # NORMALIZED_TRANSCRIPT_JSON
└── analysis/
    └── analysis.json                       # ANALYSIS_JSON
```

All keys are generated by `StorageKeys`, the single source of truth for this layout. No other code should assemble key strings by hand.
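The layout above can be sketched as a small set of static builders. This is only an illustration: the class name `StorageKeysSketch` and its method names are hypothetical, and the real `StorageKeys` API may look different. File-name sanitization for `originalName` is not shown.

```java
import java.util.UUID;

// Hypothetical sketch of a key builder; the real StorageKeys class may differ.
final class StorageKeysSketch {
    private StorageKeysSketch() {}

    // Shared prefix for every artifact of one call: users/{userId}/calls/{callId}/
    static String callPrefix(UUID userId, UUID callId) {
        return "users/" + userId + "/calls/" + callId + "/";
    }

    // TEMP_AUDIO (ephemeral). originalName is used as given; sanitizing it is out of scope here.
    static String tempAudio(UUID userId, UUID callId, String originalName) {
        return callPrefix(userId, callId) + "temp/audio-" + originalName;
    }

    // DIARIZATION_JSON
    static String diarization(UUID userId, UUID callId) {
        return callPrefix(userId, callId) + "diarization/diarization.json";
    }

    // RAW_TRANSCRIPT_JSON
    static String rawTranscript(UUID userId, UUID callId) {
        return callPrefix(userId, callId) + "transcripts/raw.json";
    }

    // NORMALIZED_TRANSCRIPT_JSON
    static String normalizedTranscript(UUID userId, UUID callId) {
        return callPrefix(userId, callId) + "transcripts/normalized.json";
    }

    // ANALYSIS_JSON
    static String analysis(UUID userId, UUID callId) {
        return callPrefix(userId, callId) + "analysis/analysis.json";
    }
}
```

Keeping every builder on top of one `callPrefix` helper means a change to the prefix shape only has to be made in one place.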

## Lifecycle

| Artifact                     | Lifetime                                            | Sweep                   |
| ---------------------------- | --------------------------------------------------- | ----------------------- |
| `TEMP_AUDIO`                 | `OBJECT_STORAGE_TEMP_AUDIO_TTL_HOURS` (default 24h) | `StaleTempAudioSweeper` |
| `DIARIZATION_JSON`           | Persistent                                          | —                       |
| `RAW_TRANSCRIPT_JSON`        | Persistent                                          | —                       |
| `NORMALIZED_TRANSCRIPT_JSON` | Persistent                                          | —                       |
| `ANALYSIS_JSON`              | Persistent                                          | —                       |

> Raw audio (`TEMP_AUDIO`) is the only privacy-sensitive blob and is **the only artifact ever deleted by sweep**. Everything else is durable and may be re-read on demand by `/api/calls/{id}/transcript`, `/api/calls/{id}/analysis`, etc.
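The sweep rule can be illustrated with a small eligibility check. This is a sketch under assumptions, not the actual `StaleTempAudioSweeper`: it assumes an object's age is measured from its last-modified timestamp, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the sweep decision; the real StaleTempAudioSweeper may differ.
final class TempAudioExpirySketch {
    private TempAudioExpirySketch() {}

    // Only keys under .../temp/audio-* are ever eligible for deletion.
    static boolean isTempAudioKey(String key) {
        return key.matches("users/[^/]+/calls/[^/]+/temp/audio-.+");
    }

    // An object is expired once lastModified + ttl lies strictly before "now".
    static boolean isExpired(Instant lastModified, Instant now, Duration ttl) {
        return lastModified.plus(ttl).isBefore(now);
    }

    // ttlHours stands in for OBJECT_STORAGE_TEMP_AUDIO_TTL_HOURS (default 24).
    static boolean shouldDelete(String key, Instant lastModified, Instant now, long ttlHours) {
        return isTempAudioKey(key) && isExpired(lastModified, now, Duration.ofHours(ttlHours));
    }
}
```

Checking the key shape before the age means a persistent artifact is never deleted, however old it is.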

## Provider abstraction

Implementations sit behind `ObjectStorageService`:

| Bean                            | Selected when                   | Behavior                                                       |
| ------------------------------- | ------------------------------- | -------------------------------------------------------------- |
| `LocalFileObjectStorageService` | `OBJECT_STORAGE_PROVIDER=local` | Writes under `OBJECT_STORAGE_LOCAL_PATH`.                      |
| `S3ObjectStorageService`        | `OBJECT_STORAGE_PROVIDER=s3`    | Uses the AWS SDK v2; works against any S3-compatible endpoint. |

For S3, the endpoint, region, credentials, and path-style addressing (`pathStyleAccess`) are configurable. Cloudflare R2, MinIO, Wasabi, and Backblaze B2 all work with `OBJECT_STORAGE_PATH_STYLE_ACCESS=true`.
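Path-style access determines where the bucket name appears in the request URL: in the path (`https://endpoint/bucket/key`) rather than in the host name (`https://bucket.endpoint/key`). The sketch below only illustrates that difference. The class name, the bucket name, and the endpoint used in the test are hypothetical, and in practice the S3 client builds these URLs itself.

```java
import java.net.URI;

// Illustration of S3 addressing styles; not part of the Scryon codebase.
final class S3AddressingSketch {
    private S3AddressingSketch() {}

    // Builds the object URL for a key. Keys are assumed to need no URL encoding here.
    static URI objectUrl(URI endpoint, String bucket, String key, boolean pathStyle) {
        String port = endpoint.getPort() == -1 ? "" : ":" + endpoint.getPort();
        String scheme = endpoint.getScheme() + "://";
        String base = pathStyle
                // Path-style: bucket is the first path segment.
                ? scheme + endpoint.getHost() + port + "/" + bucket
                // Virtual-hosted-style: bucket is a subdomain of the endpoint host.
                : scheme + bucket + "." + endpoint.getHost() + port;
        return URI.create(base + "/" + key);
    }
}
```

Virtual-hosted-style addressing needs DNS to resolve a per-bucket subdomain of the endpoint, which is why self-hosted and third-party endpoints are often run in path-style mode.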

## Privacy

* **No public keys.** Nothing in the bucket is publicly readable. Clients fetch transcripts and analysis through the REST API, which enforces ownership.
* **Presigned URLs are short-lived.** When pyannoteAI uploads, it writes through a presigned PUT URL that we generate just in time and discard afterwards.
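* **No phone numbers or names in keys.** Path segments are derived from UUIDs; the only non-UUID component is the original file name in the ephemeral `temp/audio-{originalName}` key.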

## Local dev

When running locally with `OBJECT_STORAGE_PROVIDER=local`, the directory layout under `./var/storage/` is identical to the S3 key layout, so you can run `ls -R var/storage` to inspect what would be stored in production.

```
var/storage/users/449b4cd2-.../calls/f0a1d2e3-.../
├── temp/audio-call.m4a
├── diarization/diarization.json
├── transcripts/raw.json
├── transcripts/normalized.json
└── analysis/analysis.json
```
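Because local paths mirror object keys one-to-one, mapping a key to a file is a single resolve against the storage root. The following is a minimal sketch, assuming the root comes from `OBJECT_STORAGE_LOCAL_PATH`; the class name is hypothetical, and the traversal guard is an addition for illustration rather than a documented behavior of `LocalFileObjectStorageService`.

```java
import java.nio.file.Path;

// Hypothetical sketch of key-to-path resolution for the local provider.
final class LocalKeyResolverSketch {
    private LocalKeyResolverSketch() {}

    // Maps an object key to a file under the storage root, rejecting keys that escape it.
    static Path resolve(Path root, String key) {
        Path normalizedRoot = root.toAbsolutePath().normalize();
        Path resolved = normalizedRoot.resolve(key).normalize();
        if (!resolved.startsWith(normalizedRoot)) {
            throw new IllegalArgumentException("Key escapes storage root: " + key);
        }
        return resolved;
    }
}
```

For example, resolving `users/{userId}/calls/{callId}/analysis/analysis.json` against `var/storage` yields the same relative path under that directory.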

