A JSON File as a Task Queue: When Simplicity Beats RabbitMQ

When you need a task queue, the standard advice is to reach for a message broker.

RabbitMQ. Redis queues. Amazon SQS. Celery with a broker backend.

That advice is correct when you have distributed workers, network boundaries, and throughput requirements that demand a purpose-built system.

It is overkill when you have one machine, two workers, and fifty-two tasks.

The Situation

Karl, my AI assistant, manages a rotating set of tasks across several projects: text-to-speech (TTS) pipeline runs, article drafting, system monitoring, memory maintenance, and code reviews. At any given time, the task queue holds around fifty-two items.

The requirements are modest:

  • Order tasks by priority. Some work matters today. Some can wait a week.
  • Detect stale tasks. A task that has been in progress for four hours is probably stuck.
  • Limit concurrency. Two workers maximum. The GPU cannot handle more than that.
  • Survive restarts. If Karl restarts, the queue needs to be right where it was.

That is the entire requirements list. No network distribution. No multi-tenant isolation. No need for a dedicated queue server consuming 200 MB of RAM and requiring its own monitoring dashboard.

The File

The queue is a JSON file called TASK_QUEUE.json. It sits in the workspace directory alongside everything else.

Each task is an object with a handful of fields: an ID, a priority number, a status (pending, in_progress, completed), a title, a created timestamp, and a started timestamp. That is enough to do everything the requirements demand.
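To make the shape concrete, here is a minimal sketch of one task record in Python. The field names (id, priority, status, title, created_at, started_at), the ISO 8601 timestamp format, and the idea that the file holds a JSON array of tasks are assumptions for illustration; the actual layout of TASK_QUEUE.json may differ.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Task:
    id: str
    priority: int                     # assumption: lower number = more urgent
    status: str                       # "pending", "in_progress", or "completed"
    title: str
    created_at: str                   # assumption: ISO 8601 timestamp in UTC
    started_at: Optional[str] = None  # filled in when a worker claims the task


task = Task(
    id="task-001",
    priority=1,
    status="pending",
    title="Example task",
    created_at="2024-01-01T09:00:00+00:00",
)

# Assumption: the file holds a JSON array of task objects.
serialized = json.dumps([asdict(task)], indent=2)

# Reading the file back is the reverse: parse, then rebuild the records.
restored = [Task(**item) for item in json.loads(serialized)]
```

A dataclass is optional here. Plain dictionaries work just as well, and the remaining sketches in this post use them.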

Priority ordering is just sorting by the priority field. Stale detection is comparing the started timestamp to the current time. Concurrency limiting is counting how many tasks have status in_progress and refusing to claim more if the count is at two. Persistence is free because it is a file on disk.
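The ordering and concurrency rules fit in one small function. This is a sketch rather than the real code: claim_next and MAX_IN_PROGRESS are hypothetical names, tasks are assumed to be plain dictionaries, and a lower priority number is assumed to mean more urgent.

```python
from datetime import datetime, timezone

MAX_IN_PROGRESS = 2  # the two-worker limit from the requirements


def claim_next(tasks, now=None):
    """Mark the most urgent pending task as in_progress and return it.

    Returns None when the concurrency limit is reached or nothing is pending.
    Assumes a lower priority number means more urgent.
    """
    now = now or datetime.now(timezone.utc)

    # Concurrency limiting: count what is already running.
    running = sum(1 for t in tasks if t["status"] == "in_progress")
    if running >= MAX_IN_PROGRESS:
        return None

    pending = [t for t in tasks if t["status"] == "pending"]
    if not pending:
        return None

    # Priority ordering: ties fall back to the oldest created timestamp.
    # Comparing the strings works only if every timestamp uses the same format.
    task = min(pending, key=lambda t: (t["priority"], t["created_at"]))
    task["status"] = "in_progress"
    task["started_at"] = now.isoformat()
    return task
```

The function only mutates the in-memory list. Writing the result back to disk is a separate step.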

The whole implementation is roughly two hundred lines of code including comments.

Why Not a Real Queue?

I asked myself this question several times. The instinct to use proper infrastructure is strong.

But every time I mapped the requirements against what a message broker provides, the broker was solving problems I did not have.

Network distribution? All workers run on the same machine.

High throughput? Tasks take minutes or hours. The queue processes maybe twenty items per day.

Fault-tolerant delivery? A JSON file on an SSD is durable enough for this workload.

Message ordering and priority? Once the file is parsed, sorting the task list by its priority field is a one-liner. That is a solved problem in any language.

The one thing a message broker gives you that a JSON file does not is atomicity under concurrent access. If two workers try to claim the same task simultaneously, you need some form of locking. But with a maximum of two workers and a heartbeat that runs every five minutes, the window for a race condition is small. A simple file lock handles it, as long as it is held across the whole read-modify-write cycle and not only during the write. Otherwise both workers can read the same pending task before either one records the claim.
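One way to close that race is to take an exclusive lock before reading and release it only after the updated file is on disk. The sketch below assumes a POSIX system (fcntl is not available on Windows); locked_queue, the ".lock" sidecar file, and the ".tmp" suffix are hypothetical choices for illustration, not a description of the actual implementation.

```python
import fcntl
import json
import os
from contextlib import contextmanager


@contextmanager
def locked_queue(path):
    """Yield the task list under an exclusive lock and save it on clean exit."""
    # A separate lock file is used so the queue file itself can be swapped out.
    with open(path + ".lock", "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the other worker is done
        try:
            if os.path.exists(path):
                with open(path) as f:
                    tasks = json.load(f)
            else:
                tasks = []

            yield tasks  # the caller reads and modifies the list while locked

            # Write to a temp file, then rename, so a crash mid-write
            # cannot leave a half-written queue behind.
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(tasks, f, indent=2)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp_path, path)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A worker would wrap each claim or status update in "with locked_queue(path) as tasks:" and mutate the list inside the block. If the block raises, the save step is skipped and the file on disk stays as it was. The lock should be held only for the update itself, never for the duration of a task.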

The right tool for the job is sometimes the tool you already have.

The Stale Detection Pattern

The most useful feature of this setup is the stale detection.

Every task records when it was started. If a task has been in_progress for more than four hours, it gets flagged as stale and reset to pending. The next worker that becomes available picks it up.

This catches the silent failure problem I wrote about in my recovery system post. A worker crashes. The task never gets marked as completed. Without stale detection, that task sits in_progress forever, invisible to the system.

Four hours is the threshold because that is longer than any legitimate task should take, but short enough that a stuck task gets retried within the same working day. The specific number is arbitrary. The pattern is not.
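The check itself is a few lines. This is a minimal sketch under the same assumptions as before: tasks are dictionaries, started_at is a timezone-aware ISO 8601 string, and reset_stale and STALE_AFTER are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(hours=4)  # the threshold discussed above


def reset_stale(tasks, now=None):
    """Reset in_progress tasks older than the threshold; return their IDs."""
    now = now or datetime.now(timezone.utc)
    reset_ids = []
    for task in tasks:
        if task["status"] != "in_progress" or not task.get("started_at"):
            continue
        # Assumes timezone-aware ISO 8601 strings, e.g. "2024-01-01T07:00:00+00:00".
        started = datetime.fromisoformat(task["started_at"])
        if now - started > STALE_AFTER:
            task["status"] = "pending"   # make it claimable again
            task["started_at"] = None
            reset_ids.append(task["id"])
    return reset_ids
```

Returning the reset IDs makes it easy to log which tasks were recovered, which is often the first clue that a worker is crashing.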

Any task queue without stale detection will eventually lie to you about what is in progress.

The General Principle

The principle here is not "JSON files are better than message brokers." They are not. If you are running a distributed system with hundreds of workers across multiple machines, use a real queue. Use RabbitMQ. Use SQS. Use Kafka if your problem genuinely demands it.

The principle is: match your infrastructure to your actual scale, not your imagined scale.

A lot of engineering teams build for a future that never arrives. They add Redis queues to projects that will never have more than one worker. They introduce Kubernetes for a single-container application. They build microservices for a team of three.

Each piece of infrastructure you add has a cost. Not just in resources, but in complexity, debugging surface area, and things that can break at 3 AM. A JSON file does not crash. A JSON file does not need a connection pool. A JSON file does not have a configuration file of its own that you need to version and maintain.

When the JSON file stops being sufficient, you will know. The queue will get too large to scan efficiently. Concurrent access will become a real problem. Tasks will need to span machines. At that point, migrate to a broker.

But by then you will know exactly what you need from the broker, because your JSON file taught you the requirements.

What This Looks Like in Practice

In practice, the queue runs quietly. Karl checks it during heartbeat polls, picks up the highest-priority pending task, works on it, marks it complete. If a task stalls, the stale detector catches it within four hours. If Karl restarts, the queue is intact because it is just a file.
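One heartbeat cycle might look roughly like the sketch below. The names heartbeat and handler are hypothetical, a lower priority number is assumed to be more urgent, and locking, the two-worker limit, and stale detection are left out to keep the example short. The point it illustrates is the ordering: the claim is written to disk before the work starts, and the file is reloaded before the task is marked complete.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def heartbeat(queue_path, handler):
    """Run one poll: claim the top pending task, do the work, mark it done."""
    path = Path(queue_path)
    tasks = json.loads(path.read_text())

    pending = [t for t in tasks if t["status"] == "pending"]
    if not pending:
        return None

    task = min(pending, key=lambda t: t["priority"])
    task["status"] = "in_progress"
    task["started_at"] = datetime.now(timezone.utc).isoformat()
    # The state change is on disk before any work begins.
    path.write_text(json.dumps(tasks, indent=2))

    handler(task)  # may take minutes or hours

    # Reload instead of reusing the old list: the file may have
    # changed while the task was running.
    tasks = json.loads(path.read_text())
    for t in tasks:
        if t["id"] == task["id"]:
            t["status"] = "completed"
    path.write_text(json.dumps(tasks, indent=2))
    return task["id"]
```

If the process dies inside handler, the file still says in_progress with a started timestamp, which is exactly the state the stale detector looks for.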

The queue has processed hundreds of tasks across TTS pipeline runs, article drafts, code reviews, and maintenance work. It has not lost a single one.

That reliability did not come from sophisticated infrastructure. It came from a simple format, a stale detection threshold, and the discipline to write state changes before acting on them.

Fifty-two tasks. Two workers. One file. No broker required.