Filespooler

What is Filespooler?

  1. Filespooler lets you request the remote execution of programs, including stdin and environment. It can use tools such as S3, Dropbox, Syncthing, NNCP, ssh, UUCP, USB drives, CDs, etc. as transport; basically, a filesystem is the network for Filespooler. Filespooler is particularly suited to distributed and Asynchronous Communication.

  2. Filespooler is a tool in the Unix tradition of “do one thing and do it well.” It is designed to integrate nicely with decoders (to handle compressed or Encrypted packets, for instance). It can send and receive packets by pipes. Its on-disk format is simple and is designed to interface well with other tools.

  3. Filespooler is strictly ordered by default; that is, it executes jobs in the order they were created, even if they arrive out of order. However, it also supports looser operation modes for scenarios such as certain Many-To-One setups.

  4. Filespooler is an example of scalable Small Technology:

    • The file format is lightweight; less than 100 bytes overhead in most cases.
    • The queue format is lightweight; even a Raspberry Pi could easily process thousands of different queues if needed.
    • The main CLI tool, fspl, uses less than 10MB RAM on x86_64.

  5. Filespooler processes packets as streams, and can easily accommodate multi-terabyte payloads.

  6. Filespooler is extremely versatile. In addition to the various transports it works with, it also works with encoders/decoders such as compressors and encryption tools. Basically, if you can pipe stuff to or from it, Filespooler can integrate with it. Thanks to its flexible design (it’s the “find of command execution”), Filespooler also supports advanced queue topologies such as One-To-Many, Many-To-One, Feeding Queues from Other Queues, and Parallel Processing, all with very little effort.
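
To make the strict-ordering idea in point 3 concrete, here is a minimal in-memory sketch in Python. It is not Filespooler's implementation: Filespooler keeps its queue on disk, and the function name `process_in_order` and the `(seq, payload)` tuples are assumptions made purely for illustration. The sketch only shows the general technique of holding back jobs that arrive early until every lower sequence number has run.

```python
def process_in_order(arrivals, next_seq=1):
    """Yield (seq, payload) pairs strictly by sequence number.

    `arrivals` is any iterable of (seq, payload) tuples in arrival order.
    Jobs that arrive early wait in `pending` until their turn comes.
    (Hypothetical helper for illustration; not part of Filespooler.)
    """
    pending = {}
    for seq, payload in arrivals:
        pending[seq] = payload
        # Run every job that has become runnable, lowest sequence first.
        while next_seq in pending:
            yield next_seq, pending.pop(next_seq)
            next_seq += 1


# Jobs 2 and 3 arrive before job 1; job 4 never arrives.
arrived = [(2, "second"), (3, "third"), (1, "first"), (5, "fifth")]
executed = list(process_in_order(arrived))
print(executed)
```

In this model, jobs 1 through 3 run in creation order even though they arrived out of order, while job 5 stays held back because job 4 has not shown up yet. A looser mode, such as the ones mentioned for Many-To-One setups, would relax that last constraint.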

Learning about Filespooler

Once installed, learn about using it in different situations.

  • The Using Filespooler over Syncthing page is one of the more comprehensive tutorials for Filespooler, and is a great place to start (even if you don’t use Syncthing).

Installation

The Filespooler Reference discusses this. You can install with Rust with a one-line command, but binaries are also available from the releases page. They are built for these platforms:

These binaries are built using the trusted infrastructure maintained by Debian, using the official Rust docker images, and the build logic contained within the Filespooler repo.

Integrations

Transports

These pages are introductions that explain how to use Filespooler with different transports:

Encoders and Decoders

Programs to Execute

Security

By default, Filespooler packets are unencrypted and unsigned. But Filespooler is designed to integrate nicely with encryption tools, thanks to the --decoder option. Here are some examples.

Management

Tips and Tricks


Sometimes we want better-than-firewall security for things. For instance:

In my writing about dar, I recently made the point that dar is a filesystem differ and patcher.

I loaded up this title with buzzwords. The basic idea is that IM systems shouldn’t have to only use the Internet. Why not let them be carried across LoRa radios, USB sticks, local Wifi networks, and yes, the Internet? I’ll first discuss how, and then why.

“OK,” you’re probably thinking. “John, you talk a lot about things like Gopher and personal radios, and now you want to talk about building a reliable network out of… USB drives?”

One frustration people sometimes have with ssh or NNCP is that they’d like to pass along a lot of metadata to the receiving end. Both ssh and nncp-exec allow you to pass along command-line parameters, but neither of them permit passing along more than that. What if you have a whole host of data to pass? Maybe a dozen things, some of them optional? It would be very nice if you could pass along the environment.

dar is a Backup and archiving tool. You can think of it as a more modern tar. It supports both streaming and random-access modes, supports correct incrementals (unlike GNU tar’s incremental mode), Encryption, various forms of compression, even integrated rdiff deltas.

GnuPG (also known by its command name, gpg) is a tool primarily for public key Encryption and cryptographic authentication.

All of the Filespooler examples so far have focused on using fspl queue-process to process queue items.

Filespooler provides the fspl queue-write command to easily add files to a queue. However, the design of Filespooler intentionally makes it easy to add files to the queue by some other command. For instance, Using Filespooler over Syncthing has Syncthing do the final write, the nncp-file (but not the nncp-exec) method in Using Filespooler over NNCP had NNCP do it, and so forth.

gitsync-nncp is a tool for using Asynchronous Communication tools such as NNCP or Filespooler, or even (with some more work) Syncthing to synchronize git repositories.

Since Filespooler is an ordered queue processor by default, it normally insists on a tight mapping between the sequence numbers in job files and execution order in a queue.

By default, Filespooler doesn’t do anything special with the output from the commands that fspl queue-process executes. If they write to stdout or stderr, you’ll see this on the controlling terminal or wherever you have piped or redirected it.

Filespooler is designed to work well in automated situations, including when started from cron or systemd. It is a fairly standard program in that way. I’ll discuss a few thoughts here that may help you architect your system.

You can use gitsync-nncp (a tool for Asynchronous syncing of git repositories) atop Filespooler. This page shows how. Please consult the links in this paragraph for background on gitsync-nncp and Filespooler.

Filespooler makes an excellent tool for handling Backups. In fact, this was the case that prompted me to write it in the first place.

You’ll notice that Filespooler’s fspl queue-process command takes a single command. What if you want to permit the sender to select any of several commands to run?

Filespooler has a powerful concept called a decoder. A decoder is a special command that any Filespooler command that reads a queue needs to use to decode the files within the queue. This concept is a generic one that can support compression, encryption, cryptographic authentication, and so forth.
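
As a rough model of the decoder idea, the Python sketch below treats a decoder as a transformation applied to the stored bytes before anything reads the job. In Filespooler itself the decoder is an external command; here a plain Python function stands in for it, and `read_job` is a hypothetical helper invented for this illustration, not a Filespooler API.

```python
import gzip


def read_job(stored: bytes, decoder=None) -> bytes:
    """Return the job bytes, running them through `decoder` first if given.

    `decoder` is any callable taking bytes and returning bytes. It stands
    in for an external decoder command. (Hypothetical helper.)
    """
    return decoder(stored) if decoder else stored


plain_job = b"job payload"
stored_job = gzip.compress(plain_job)  # what the sender placed in the queue

# The reader never sees the compressed form; the decoder undoes it first.
decoded = read_job(stored_job, decoder=gzip.decompress)
print(decoded)
```

The same shape covers decryption or signature verification: swap in a different callable, and the code that consumes the job stays unchanged.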

The reference documentation for Filespooler is here:

It seems that lately I’ve written several shell implementations of a simple queue that enforces ordered execution of jobs that may arrive out of order. After writing this for the nth time in bash, I decided it was time to do it properly. But first, a word on the why of it all.

In some cases, you may want to use Filespooler to send the data from one machine to many others. An example of this could be using gitsync-nncp over Filespooler where you would like to propagate the changes to many computers.

Sometimes with Filespooler, you may wish for your queue processing to effectively re-queue your jobs into other queues. Examples may be:

Filespooler is designed around careful sequential processing of jobs. It doesn’t have native support for parallel processing; those tasks may be best left to the queue managers that specialize in them. However, there are some strategies you can consider to achieve something of this effect even in Filespooler.

Sometimes, one wants to verify the integrity and authenticity of a Filespooler job file before processing it.

Like the process described in Encrypting Filespooler Jobs with GPG, Filespooler can handle packets Encrypted with Age (Encryption). Age may be easier than GnuPG in a number of cases, particularly because it can use a person’s existing SSH keypairs for encryption.

Thanks to Filespooler’s support for decoders, data for Filespooler can be Encrypted at rest and only decrypted when Filespooler needs to scan or process a queue.

NNCP is a powerful tool for building Asynchronous Communication networks. It features end-to-end Encryption as well as all sorts of other features; see my NNCP Concepts page for some more ideas.

Filespooler is a way to execute commands in strict order on a remote machine, and its communication method is by files. This is a perfect mix for Syncthing (and others, but this page is about Filespooler and Syncthing).

Syncthing is a serverless, peer-to-peer file synchronization tool. It is often compared to Dropbox. However, unlike Dropbox, there is no central server with Syncthing; your devices talk directly to each other to sync data. Syncthing has various effective methods for firewall traversal, including public relays for the worst case. All Syncthing traffic is fully encrypted and authenticated.

Old technology is any tech that’s, well… old.

This page gives you references to software by John Goerzen.

Inspired by several others (such as Alex Schroeder’s post and Szczeżuja’s prompt), as well as a desire to get this down for my kids, I figure it’s time to write a bit about living through the PC and Internet revolution where I did: outside a tiny town in rural Kansas. And, as I’ve been back in that same area for the past 15 years, I reflect some on the challenges that continue to play out.

Here is a comparison of various data backup and archiving tools. For background, see my blog post in which I discuss the difference between backup and archiving. In a nutshell, backups are designed to recover from a disaster that you can fairly rapidly detect. Archives are designed to survive for many years, protecting against disaster not only impacting the original equipment but also the original person that created them. That blog post goes into a lot of detail on what makes a good backup or archiving tool.