Designing a Rust Workspace for Production AI: When to Split and When to Keep It Simple
My narrator-tts project started as a single crate. One Cargo.toml, one src/ directory, everything in one place. It worked fine for the early days — write some code, cargo run, generate some audio, see what happens.
Then the project grew. I needed benchmarks that could run independently of the main binary. I needed a GUI that didn't link against the entire TTS backend. I needed test utilities that could be shared across both. The single-crate structure started to feel like everything was tangled into everything else.
This is the story of how I split narrator-tts into a four-member workspace, what I got right, and what I wish I had known before I started.
The Workspace Members
The final structure ended up as:
- qbench — the core library. Audio processing, TTS model abstraction, QA scoring, benchmark harness. No GUI, no CLI, no async runtime. Pure library code with all the domain logic.
- narrator-tts — the CLI binary. Depends on qbench. Handles argument parsing, file I/O, progress reporting, manifest management. This is what you run from the terminal.
- narrator-gui — the desktop GUI. Does NOT link qbench as a library. Instead, it launches narrator-tts as a subprocess and communicates via the manifest files on disk.
- narrator-slideshow — a separate tool for generating slideshow audio from a different input format. Shares qbench for the audio generation parts.
The key insight is the relationship between the CLI and the GUI. They don't share code at compile time. They share data at runtime.
Why the GUI Spawns the CLI Instead of Linking the Library
This was the most counterintuitive decision, and it was also the best one.
The obvious approach is to have both the CLI and GUI depend on the qbench library directly. Link it in both, call the same functions, share the same types. This is how most Rust GUI applications work — the binary and the GUI both depend on a shared core crate.
I tried this. It worked. Then it became a nightmare.
Feature flags don't compose across binaries. The CLI needs the multi-worker and backend-tch features. The GUI needs neither; it just launches a process. But when both binaries depend on the same library and are built together, Cargo unifies that library's features: it compiles qbench once with the union of everything requested, and the GUI ends up dragging in CUDA dependencies it doesn't need. The GUI binary was 200MB because it was statically linking libtorch. That is absurd for something that shows a progress bar.
CUDA initialization doesn't play nice with GUI frameworks. The GUI uses iced (an Elm-inspired GUI library for Rust). iced wants to own the main thread. The TTS backend wants to initialize a CUDA context. When both happen in the same process, you get threading conflicts, initialization order issues, and crashes that only happen on Tuesdays. Separating them into different processes eliminated an entire class of bugs.
Crashes in one don't kill the other. When the TTS pipeline hits a CUDA out-of-memory error, the CLI process crashes. If the GUI were linked to the same library, the GUI would crash too. With process isolation, the CLI crashes, the manifest file records where it stopped, and the GUI detects the exit code and shows a "restart from checkpoint" button. The user never loses their place.
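The exit-code handling can be sketched with nothing but std::process. This is a minimal sketch under stated assumptions: the RunOutcome enum, the classify_exit and run_cli names, and the rule that any non-zero exit means "offer a restart" are all hypothetical, not the project's actual code.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// What the GUI should show once the CLI child process has ended.
/// (Hypothetical type, not taken from narrator-gui.)
#[derive(Debug, PartialEq)]
pub enum RunOutcome {
    /// Exit code 0: generation finished.
    Completed,
    /// Non-zero exit code: offer "restart from checkpoint".
    Failed(i32),
    /// No exit code at all, e.g. the child was killed by a signal on Unix.
    Killed,
}

/// Map a raw exit code to an outcome. Kept separate from the spawn
/// call so it can be tested without launching a process.
pub fn classify_exit(code: Option<i32>) -> RunOutcome {
    match code {
        Some(0) => RunOutcome::Completed,
        Some(n) => RunOutcome::Failed(n),
        None => RunOutcome::Killed,
    }
}

/// Run the CLI as a child process and block until it exits.
/// `program` and `args` are supplied by the caller; a real GUI would
/// call this off the UI thread.
pub fn run_cli(program: &str, args: &[&str]) -> io::Result<RunOutcome> {
    let status: ExitStatus = Command::new(program).args(args).status()?;
    Ok(classify_exit(status.code()))
}
```

Because the child is a separate process, a crash there surfaces in the GUI as an ordinary value to match on rather than as a crash of the GUI itself.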
The manifest file is the contract between the CLI and the GUI. It's a JSON file on disk that both processes read and write. The CLI writes progress as it generates audio. The GUI polls the manifest for updates. Beyond the GUI launching the CLI, neither process knows or cares how the other one works. They just agree on a file format.
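One way to make a polled file safe to read is to replace it in a single step instead of rewriting it in place. The sketch below assumes that approach; the article does not say how narrator-tts writes its manifest. It treats the JSON as an opaque string and, for brevity, shows only the CLI writing and the GUI reading. The function names are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writer side (the CLI): write a temporary sibling file, then rename it
/// over the real manifest, so a polling reader is unlikely to observe a
/// half-written file.
pub fn write_manifest(path: &Path, json: &str) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, json)?;
    fs::rename(&tmp, path)
}

/// Reader side (the GUI): return the manifest text only when it differs
/// from what was seen on the previous poll.
pub fn poll_manifest(
    path: &Path,
    last_seen: &mut Option<String>,
) -> io::Result<Option<String>> {
    let current = match fs::read_to_string(path) {
        Ok(text) => text,
        // The CLI may not have written anything yet.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(None),
        Err(e) => return Err(e),
    };
    if last_seen.as_deref() == Some(current.as_str()) {
        return Ok(None); // unchanged since the last poll
    }
    *last_seen = Some(current.clone());
    Ok(Some(current))
}
```

A GUI would call poll_manifest on a timer and parse the returned text with whatever JSON library it already uses.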
The qbench Core: Keep It Pure
The qbench library has a strict rule: no side effects that touch the outside world.
It can process audio samples. It can score quality metrics. It can run benchmark loops against a model interface. But it doesn't read files, it doesn't write files, and it doesn't spawn processes. All of that lives in the binaries that depend on it.
This separation paid off in testing. qbench has over 500 tests that run in milliseconds because they don't need a GPU, don't need audio files, and don't need a filesystem. They test the logic (chunking strategies, QA scoring thresholds, silence detection) in isolation.
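To show what "pure" means here, consider a silence check that takes samples in and returns a value, with no I/O anywhere. An RMS threshold is assumed as the detection method; the article does not say how qbench detects silence, and the function names are hypothetical.

```rust
/// Root-mean-square level of a block of samples, nominally in [-1.0, 1.0].
/// Returns 0.0 for an empty slice.
pub fn rms(samples: &[f32]) -> f32 {
    if samples.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = samples.iter().map(|s| s * s).sum();
    (sum_sq / samples.len() as f32).sqrt()
}

/// A block counts as silent when its RMS level is below `threshold`.
pub fn is_silent(samples: &[f32], threshold: f32) -> bool {
    rms(samples) < threshold
}
```

A test for this is a couple of slices and an assertion. Loading the audio that produces those slices is the calling binary's job.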
When I added the backend-tch feature for actual CUDA inference, I could gate all GPU-dependent tests behind a feature flag. CI runs the CPU-only suite in seconds. The GPU tests run locally when I need them. Nothing in between.
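Feature gating of this kind uses Rust's cfg attributes. The sketch below assumes a backend-tch feature is declared in the crate's Cargo.toml; backend_name and the test names are hypothetical stand-ins.

```rust
/// Reports which backend this build was compiled with.
/// `backend-tch` is the feature name from the article; the function
/// itself is a hypothetical stand-in for real backend selection.
pub fn backend_name() -> &'static str {
    if cfg!(feature = "backend-tch") {
        "tch"
    } else {
        "cpu-only"
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Always compiled: needs no GPU.
    #[test]
    fn backend_name_is_never_empty() {
        assert!(!backend_name().is_empty());
    }

    // Compiled only when the feature is on, e.g.
    // `cargo test --features backend-tch`.
    #[cfg(feature = "backend-tch")]
    #[test]
    fn gpu_smoke_test() {
        // Real GPU-backed assertions would go here.
        assert_eq!(backend_name(), "tch");
    }
}
```

With the feature off, the gated test is not compiled at all, so a CI machine without a GPU never sees it.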
A library crate should not know how it's being used. If your core library is reaching for std::fs::read, ask yourself whether that I/O should live in the binary that calls it instead.
What I Got Wrong
The workspace structure isn't perfect. Here are the things I'd do differently.
I split too early. The single-crate structure was fine for the first three months. Splitting into a workspace added compile-time overhead, made dependency management more complex, and introduced workspace-level Cargo.toml coordination that didn't exist before. I should have waited until the pain of the single crate was real and daily, not theoretical.
Shared types across workspace members are awkward. The manifest format — the JSON contract between CLI and GUI — lives in qbench because both sides need to read it. But the manifest schema also includes fields that only the CLI cares about (restart commands, retry counts). The types are either in the core library where the GUI doesn't need them, or duplicated. I ended up putting them in qbench and accepting the minor pollution.
Benchmarks should have been a separate member from the start. The benchmark harness ended up as a module inside qbench, which means running benchmarks requires compiling the entire library. A separate qbench-bench crate that depends on qbench as a dev-dependency would have been cleaner. The benchmark code and the library code want different build profiles: benchmarks want release mode with debug assertions, while the library wants a strict debug build for tests.
When to Use a Workspace
Not every Rust project needs a workspace. The overhead of coordinating multiple Cargo.toml files, managing feature gates across members, and keeping workspace-level dependencies in sync is real.
You probably need a workspace when:
- You have multiple binaries with different dependency profiles. A CLI that needs CUDA and a GUI that doesn't is the classic case.
- Tests need to run without heavy dependencies. If your test suite takes 30 seconds because it's building and linking against libtorch, split the core logic into a library crate.
- You need to publish some crates but not others. The qbench library might be useful to others. The GUI is not.
You probably don't need a workspace when:
- You have one binary and one library. Just keep them in the same package with [[bin]] and [lib] sections.
- The project is under 10,000 lines. The overhead of workspace management outweighs the benefits.
- All your binaries share the same dependency profile. If everyone needs the same features, there's no compile-time savings from splitting.
The Lesson
Good architecture is about drawing boundaries in the right places. The workspace boundary in narrator-tts works because it follows real dependency lines — the GUI genuinely doesn't need CUDA, the core genuinely doesn't need file I/O, and the benchmarks genuinely don't need either.
When boundaries are drawn at the right places, the structure disappears. You stop thinking about workspace coordination and start thinking about the code. That's when you know the architecture is working.