Your Encrypted Backups Are Slow Because Encryption Isn't the Bottleneck
If you encrypt files before pushing them to backup storage, you've probably assumed the encryption step is what makes it slow. That's what I assumed too. Then I looked at the numbers. On any modern x86 chip with AES-NI, AES-256-GCM runs at 4-8 GB/s on a single core. ChaCha20-Poly1305 isn't far behind. The CPU is not the problem. The problem is that your encryption tool reads a chunk of data, encrypts it, writes it out, then reads the next chunk. It's serial. The disk sits idle while the CPU works, and the CPU sits idle while the disk works.
One person decided to fix that by applying the same async I/O technique that powers modern databases to file encryption. The result hits GB/s throughput on commodity NVMe hardware, and the whole thing is about 900 lines of Rust.
What Is Concryptor?
Concryptor is a multi-threaded AEAD file encryption CLI built by FrogSnot. It encrypts and decrypts files using AES-256-GCM or ChaCha20-Poly1305 with Argon2id key derivation, and it does it fast by overlapping disk I/O with CPU crypto using Linux's io_uring interface. It handles single files and directories (packed via tar), runs entirely in the terminal, and installs with cargo install concryptor.
73 stars. One month of focused development. A six-file core with 67 tests. It deserves more.
The Snapshot
| Project | Concryptor |
| --- | --- |
| Stars | 73 |
| Maintainer | Solo (FrogSnot) |
| Code health | Clean architecture, 67 tests, clippy and fmt checks added to CI in PR #10 (open as of this writing) |
| Docs | Excellent README with honest perf analysis and full format spec |
| Contributor UX | Fresh templates and CI, small codebase, easy to navigate |
| Worth using | Not yet for production (author's own disclaimer), but the architecture is real |
Under the Hood
The centerpiece is a triple-buffered io_uring pipeline in engine.rs. The idea is simple: keep three sets of buffers rotating through three stages. While buffer A's encrypted contents are being written to disk by the kernel, buffer B is being encrypted in parallel by Rayon worker threads, and buffer C's plaintext is being read from disk. Every component stays busy. Nothing waits.
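To make the rotation concrete, here is a minimal sketch of the schedule only: no I/O, no crypto, and not Concryptor's actual code. It assumes three slots and a read, encrypt, write ordering, with two extra iterations at the end so the last batches can drain. `Step`, `slot_for`, and `schedule` are names I made up for illustration.

```rust
/// Which batch (if any) occupies each stage during one loop iteration.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct Step {
    read: Option<usize>,
    encrypt: Option<usize>,
    write: Option<usize>,
}

/// Number of rotating buffer slots (triple buffering).
const SLOTS: usize = 3;

/// Buffer slot a batch lives in for its whole lifetime.
fn slot_for(batch: usize) -> usize {
    batch % SLOTS
}

/// Build the full schedule: batch i is read in iteration i,
/// encrypted in iteration i + 1, and written in iteration i + 2.
fn schedule(num_batches: usize) -> Vec<Step> {
    if num_batches == 0 {
        return Vec::new();
    }
    // Drop batch numbers that fall past the end of the input.
    let in_range = |b: Option<usize>| b.filter(|&b| b < num_batches);
    (0..num_batches + 2)
        .map(|i| Step {
            read: in_range(Some(i)),
            encrypt: in_range(i.checked_sub(1)),
            write: in_range(i.checked_sub(2)),
        })
        .collect()
}

fn main() {
    for (i, s) in schedule(4).iter().enumerate() {
        println!(
            "iter {i}: read={:?} encrypt={:?} write={:?} (read slot {:?})",
            s.read,
            s.encrypt,
            s.write,
            s.read.map(slot_for)
        );
    }
}
```

In the steady state all three stages hold a batch, and because consecutive batch numbers map to different slots modulo 3, no two stages ever touch the same buffer.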
The implementation is tighter than you'd expect from a month-old project. Each io_uring submission queue entry carries bit-packed metadata in its user_data field: the low two bits identify which buffer slot, bit 2 flags read vs. write, and the upper bits store the expected byte count for short-I/O detection. When completion queue entries come back, the pipeline routes them to per-slot counters without any hash lookups or allocations. The whole loop runs num_batches + 2 iterations to let the pipeline drain cleanly at the end.
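The bit layout is easy to sketch. What follows is my own illustration of the idea, not the project's code: the constant and function names are hypothetical, and I'm assuming that a set bit 2 means "write" and that the expected length fits in 32 bits, neither of which the description above pins down.

```rust
/// Low two bits: buffer slot (0..=2).
const SLOT_MASK: u64 = 0b11;
/// Bit 2: assumed here to be set for writes and clear for reads.
const WRITE_FLAG: u64 = 1 << 2;
/// Remaining upper bits: expected byte count.
const LEN_SHIFT: u32 = 3;

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Op {
    Read,
    Write,
}

/// Pack slot, direction, and expected length into one u64 tag.
fn pack(slot: u8, op: Op, expected_len: u32) -> u64 {
    debug_assert!(slot < 3, "only three buffer slots exist");
    let flag = match op {
        Op::Read => 0,
        Op::Write => WRITE_FLAG,
    };
    (u64::from(expected_len) << LEN_SHIFT) | flag | u64::from(slot)
}

/// Recover the three fields from a completion's tag.
fn unpack(user_data: u64) -> (u8, Op, u32) {
    let slot = (user_data & SLOT_MASK) as u8;
    let op = if (user_data & WRITE_FLAG) != 0 {
        Op::Write
    } else {
        Op::Read
    };
    let expected_len = (user_data >> LEN_SHIFT) as u32;
    (slot, op, expected_len)
}

/// A completion is short (or failed) if the kernel's result code
/// is negative or differs from the byte count stored in the tag.
fn is_short_io(user_data: u64, result: i32) -> bool {
    let (_, _, expected) = unpack(user_data);
    result < 0 || result as u32 != expected
}

fn main() {
    let tag = pack(1, Op::Write, 4096);
    println!("tag = {tag:#x}, fields = {:?}", unpack(tag));
    println!("short write? {}", is_short_io(tag, 2048));
}
```

Because everything needed to route a completion lives in the 64-bit tag itself, the completion handler only needs a few shifts and masks plus a fixed-size array indexed by slot.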
The file format is designed around O_DIRECT. Every encrypted chunk is padded to a 4 KiB boundary. The header is exactly 4096 bytes (52 bytes of data plus KDF parameters plus zero padding). Buffers are allocated with explicit 4096-byte alignment via std::alloc. This lets Concryptor bypass the kernel's page cache entirely, talking directly to NVMe storage via DMA. It's the same technique databases use to avoid double-buffering, and it's a big part of why the throughput numbers are real.
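Here's a rough sketch of the two pieces O_DIRECT needs on the user-space side: rounding lengths up to a 4 KiB boundary and allocating buffers at a 4 KiB-aligned address. It uses only `std::alloc`, as the post describes, but `AlignedBuf` and `pad_to_block` are hypothetical names of mine and the real code may be organized differently.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};

/// Block size and alignment that O_DIRECT I/O is assumed to need here.
const ALIGN: usize = 4096;

/// Round a length up to the next multiple of 4 KiB.
fn pad_to_block(len: usize) -> usize {
    len.div_ceil(ALIGN) * ALIGN
}

/// A zeroed heap buffer whose address and size are both 4 KiB multiples.
struct AlignedBuf {
    ptr: *mut u8,
    layout: Layout,
}

impl AlignedBuf {
    fn new(len: usize) -> Self {
        assert!(len > 0, "zero-sized allocations are not supported");
        let layout =
            Layout::from_size_align(pad_to_block(len), ALIGN).expect("valid layout");
        // SAFETY: the layout has a non-zero size.
        let ptr = unsafe { alloc_zeroed(layout) };
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        Self { ptr, layout }
    }

    fn as_mut_slice(&mut self) -> &mut [u8] {
        // SAFETY: ptr is valid for layout.size() bytes and uniquely borrowed.
        unsafe { std::slice::from_raw_parts_mut(self.ptr, self.layout.size()) }
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // SAFETY: ptr was allocated with exactly this layout.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}

fn main() {
    let mut buf = AlignedBuf::new(5000);
    let addr = buf.as_mut_slice().as_ptr() as usize;
    println!("len = {}, aligned = {}", buf.as_mut_slice().len(), addr % ALIGN == 0);
}
```

The padding is what makes the on-disk format O_DIRECT-friendly: every chunk starts and ends on a block boundary, so reads and writes never straddle one.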
The security model is more careful than I expected from a solo hobby project. The full 4 KiB header is passed as associated data to every chunk's AEAD operation, so modifying any header byte invalidates every chunk's tag. There's a TLS 1.3-style nonce derivation scheme where each chunk's nonce is the base nonce XOR'd with the chunk index, so parallel workers get unique nonces without having to coordinate. A final-chunk flag in the AAD prevents truncation and append attacks. The 4032 reserved bytes in the header are authenticated too, so you can't smuggle data into them. The test suite covers chunk swapping, truncation (two variants), header field manipulation, reserved byte tampering, KDF parameter tampering, and cipher mismatch. These aren't afterthought tests. Someone thought about the threat model.
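The nonce derivation fits in a few lines. This sketch follows the TLS 1.3 convention (the 64-bit counter is left-padded to the nonce length in big-endian order, then XOR'd into the base); whether Concryptor uses the same byte order is an assumption on my part, and `chunk_nonce` is a hypothetical name.

```rust
/// Both AES-256-GCM and ChaCha20-Poly1305 take 96-bit nonces.
const NONCE_LEN: usize = 12;

/// Derive a per-chunk nonce by XOR-ing the chunk index into the
/// last eight bytes of the base nonce (big-endian, as in TLS 1.3).
fn chunk_nonce(base: &[u8; NONCE_LEN], chunk_index: u64) -> [u8; NONCE_LEN] {
    let mut nonce = *base;
    for (n, i) in nonce[NONCE_LEN - 8..]
        .iter_mut()
        .zip(chunk_index.to_be_bytes())
    {
        *n ^= i;
    }
    nonce
}

fn main() {
    // A fixed base nonce for illustration only; a real one must be random per file.
    let base = [0xA5u8; NONCE_LEN];
    for idx in 0..3u64 {
        println!("chunk {idx}: {:02x?}", chunk_nonce(&base, idx));
    }
}
```

XOR with a fixed base is a bijection on the index, so distinct chunk indices always give distinct nonces under the same key. Each worker can compute its own nonce from the index alone.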
What's rough? The project is Linux-only. io_uring doesn't exist on macOS or Windows, and there's no fallback backend. If you try to build it on a Mac you'll get errors that don't explain why. The README is upfront about the experimental status, which is honest and appreciated, but it does mean you shouldn't point this at anything you can't afford to lose yet. The rand dependency is still on 0.8 (0.10 is current), and until recently clippy warnings and formatting drift had been accumulating unchecked. None of these are architectural problems. They're the kind of rough edges you get when one person is focused on making the core work first.
The Contribution
CONTRIBUTING.md asks you to run clippy and cargo fmt before submitting, but CI only ran cargo test. No enforcement. The result was predictable: seven clippy warnings had accumulated across engine.rs and header.rs, and formatting had drifted in almost every source file.
I fixed all seven lints. Three were manual div_ceil reimplementations (the (a + b - 1) / b pattern that Rust now has a method for), one was a min/max chain that should have been .clamp(), one was a manual range check, and two were too_many_arguments warnings on internal pipeline functions where every parameter is essential and restructuring would just add noise. I also wired up KdfParams::DEFAULT via struct update syntax to eliminate a dead-code warning, ran cargo fmt --all, and added clippy and fmt checks to the CI workflow so they stay clean going forward.
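For anyone who hasn't hit these lints, here's roughly what the first three patterns look like before and after. The function names and the bounds (64, 1..=4) are invented for illustration; they are not taken from the PR.

```rust
// Manual ceiling division vs. the built-in method.
fn chunks_needed_old(len: usize, chunk: usize) -> usize {
    (len + chunk - 1) / chunk
}
fn chunks_needed(len: usize, chunk: usize) -> usize {
    len.div_ceil(chunk)
}

// A min/max chain vs. clamp.
fn thread_count_old(requested: usize) -> usize {
    requested.max(1).min(64)
}
fn thread_count(requested: usize) -> usize {
    requested.clamp(1, 64)
}

// A hand-written range check vs. RangeInclusive::contains.
fn version_ok_old(v: u32) -> bool {
    v >= 1 && v <= 4
}
fn version_ok(v: u32) -> bool {
    (1..=4).contains(&v)
}

fn main() {
    println!("{} == {}", chunks_needed_old(10_000, 4096), chunks_needed(10_000, 4096));
    println!("{} == {}", thread_count_old(0), thread_count(0));
    println!("{} == {}", version_ok_old(5), version_ok(5));
}
```

Each pair behaves identically for these inputs; the rewritten versions just state the intent directly, and `div_ceil` also avoids the overflow that `len + chunk - 1` can hit near `usize::MAX`.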
Getting into the codebase was straightforward. Six files, clear responsibilities: engine.rs handles the pipeline, crypto.rs handles primitives, header.rs handles the format, archive.rs handles tar packing. The code is dense but not clever. You can follow the pipeline loop without needing to hold too much in your head at once. I had the PR ready in under an hour.
PR #10 is open as of this writing.
The Verdict
Concryptor is for people who encrypt files regularly and want it to be fast. If you're backing up to cloud storage, encrypting disk images, or just moving sensitive data between machines, the throughput difference between a serial encryption tool and a pipelined one is real. On NVMe, it's the difference between saturating your drive and leaving most of its bandwidth on the table.
The project is early. One maintainer, one month old, Linux-only, self-labeled experimental. It could stall. But the commit history tells a story of deliberate progression: the initial mmap approach was replaced with io_uring on the same day, security hardening followed within a week, the format was upgraded to v4 with full header authentication, and directory support landed before the first month was out. That's not hobby-project pacing. That's someone building something they intend to use.
What would push Concryptor to the next level? A fallback I/O backend for macOS and Windows would be the single biggest improvement. Even a plain pread/pwrite loop, slower than io_uring but functional, would open the project to most Rust developers who want to try it. Stdin/stdout streaming for pipe composability would help too. And the rand 0.8 to 0.10 migration is a real breaking change that Dependabot can't auto-fix. That's a contribution waiting to happen.
Go Look At This
If you care about I/O performance, encryption, or io_uring, Concryptor is worth reading. The codebase is small enough to understand in an afternoon, and the pipeline implementation is one of the cleaner io_uring examples I've seen in the wild.
Star the repo. Try encrypting a large file and watch the throughput. If you want to contribute, the rand 0.8 to 0.10 migration is sitting there waiting for someone to pick it up.
This is Review Bomb #11, a series where I find under-the-radar projects on GitHub, read the code, contribute something, and write it up. If you know a project that deserves more eyeballs, drop it in the comments.