Back in my 2019 article “The Desktop Security Nightmare”, I noted that on most of our desktops, we don’t have good control of what data a program can access and when.
I noted that we have things like AppArmor, which is something, but not the entire picture. SELinux is so extremely complicated that even Ted Ts’o had a comment about never getting some of his life back.
I don’t like complexity, especially when it comes to security.
One of my goals is what I’m going to call context-sensitive security. For instance, I would like the PDF of my taxes to be unavailable to all software… except when I’m working on my taxes. So, the okular PDF viewer shouldn’t be able to access my tax files, except when I explicitly say it’s OK.
One way to accomplish this, of course, would be to just mount the filesystem containing my taxes when I’m working on taxes, and leave it unmounted otherwise. However, besides the obvious convenience drawback, this has another one: either the files are inaccessible entirely, or they’re accessible to the 5000 programs I have in /usr/bin, the untold number of npm packages a person may have installed, and so forth.
What I really want is to be able to say: “make this directory tree available only to this process and its children.” And that’s what I’m going to lay out in this article.
Background on Linux namespaces
You are probably already familiar with containers in the sense that they’re behind Docker and LXC. A container uses a bunch of Linux namespaces to give the illusion of a separate machine. The namespace types include cgroup, IPC, network, mount, PID, time, user, and UTS. So if, for instance, a process has a separate PID namespace, then the process IDs within it may not show the entire system’s PID table, may map to other “real” PIDs, etc. Likewise, with a distinct mount namespace, it may have different filesystems mounted.
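To make the idea a bit more concrete: on Linux, each process exposes the namespaces it belongs to as symlinks under /proc/PID/ns. The following is a minimal sketch (the function name show_ns is just something I made up for illustration) that prints a few of them for the current shell. Two processes that print the same identifier for mnt share a mount namespace.

```shell
#!/bin/bash
# show_ns [PID]
# Print some namespace identifiers of a process (default: this shell).
# show_ns is a hypothetical helper name, not a standard tool.
show_ns() {
    local pid="${1:-$$}" ns
    for ns in mnt user pid net; do
        # Each entry is a symlink whose target looks like "mnt:[4026531840]".
        printf '%s -> %s\n' "$ns" "$(readlink "/proc/$pid/ns/$ns")"
    done
}

show_ns
```

Running this once in a normal shell and once inside a shell started by unshare should show different identifiers for the namespaces that were unshared.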
The trick I’m going to use here is this: you don’t have to use all of these as separate namespaces. You can just use a couple, and achieve some nice separation without having a fully-isolated container! And it can be done entirely without root permissions.
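One caveat on "without root": some systems restrict unprivileged user namespaces, in which case none of this will work as a normal user. Here is a small check, written as a sketch; the function name userns_available is my own, and it only uses the -U and -m flags of unshare that appear later in this article.

```shell
#!/bin/bash
# userns_available
# Try to create a user + mount namespace as the current user and run
# the no-op command "true" inside it. Hypothetical helper name.
userns_available() {
    if unshare -Um true 2>/dev/null; then
        echo "unprivileged user+mount namespaces: available"
    else
        echo "unprivileged user+mount namespaces: not available"
        return 1
    fi
}

userns_available || true
```

If this reports "not available", the cause may be a missing unshare binary or a system policy, so the message is a hint rather than a diagnosis.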
A first demonstration
For this demonstration, I’m going to use gocryptfs. It is an encrypted filesystem in FUSE, which means no root is necessary. You could use anything, though, from a traditional filesystem to other FUSEs, or even bind mounts.
I should note, however, that the in-kernel keyring (used by fscrypt and e4crypt) is not separated out by namespaces, so you can’t just unlock a certain tree with e4crypt and expect it to be only unlocked in one namespace.
First, we’re going to enter a different namespace. The unshare command will create a separate user namespace (-U, necessary for the mount namespace), a separate mount namespace with -m, and populate the user namespace with our current user with -c. Since I don’t give it an explicit command to run, it will run a shell. Here we go:
$ echo $$
873411
$ unshare --keep-caps -Umc
$ echo $$
887896
So you can see we’re in a different process (the PID changed), at least. Now let’s set up gocryptfs:
$ mkdir crypt plain
$ gocryptfs -init crypt
Choose a password for protecting your files.
Password:
Repeat:
...
The gocryptfs filesystem has been created successfully.
You can now mount it using: gocryptfs crypt MOUNTPOINT
OK. We’ve made two directories: crypt, which holds the encrypted data, and plain, which holds the plaintext (decrypted) view. We also initialized crypt. Now let’s mount it – remember, we’re still in the new namespace:
$ gocryptfs crypt plain
Password:
Decrypting master key
Filesystem mounted and ready.
OK! Now how about creating a file in plain:
$ echo Testing > plain/test
Now, we can see that there’s an encrypted file representing it in crypt:
$ ls -l crypt
total 6
-rw-r--r-- 1 jgoerzen jgoerzen 58 Dec 10 06:24 C1kX7S1Lq423tp7QVwdNfA
-r-------- 1 jgoerzen jgoerzen 385 Dec 10 06:24 gocryptfs.conf
-r--r----- 1 jgoerzen jgoerzen 16 Dec 10 06:24 gocryptfs.diriv
And in plain, we have the file:
$ ls -l plain
total 1
-rw-r--r-- 1 jgoerzen jgoerzen 8 Dec 10 06:24 test
$ cat plain/test
Testing
Now, keep this terminal open. Open another one (but not by starting it from this shell). From the other terminal, you can see:
$ ls -l plain
total 0
Yes! The plain directory was completely empty here, because it was mounted only in the other namespace!
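If you want to check this without relying on an empty directory listing, the mount table is also per-namespace. The sketch below reads /proc/self/mounts, which lists the mounts visible in the mount namespace of whoever reads it. The function name is_mounted is hypothetical, and the comparison is deliberately simple: paths containing spaces are escaped in that file and would not match.

```shell
#!/bin/bash
# is_mounted DIR
# Succeed if DIR is a mount point in the mount namespace this script
# runs in. Hypothetical helper; does not handle paths with spaces.
is_mounted() {
    local dir
    dir="$(realpath -- "$1")" || return 2
    # Field 2 of each line in /proc/self/mounts is the mount point.
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' /proc/self/mounts
}

if is_mounted plain; then
    echo "plain is mounted in this namespace"
else
    echo "plain is not mounted in this namespace"
fi
```

Run from the directory containing plain, this should report "mounted" in the terminal inside the namespace and "not mounted" in the other one.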
Now, back in the namespace, let’s clean up:
$ fusermount -u plain
$ exit
It’s important to unmount plain before exiting the namespace. If you don’t, you can’t directly unmount it from the parent namespace. You would have to kill the gocryptfs process instead.
A simple script
Let’s create a script called nsrun to make this easier.
#!/bin/bash
# Pass the command to run in the namespace,
# and any parameters, on the command-line.
if [ -z "$1" ]; then
    echo "Syntax: $0 command [args]"
    exit 5
fi
gocryptfs crypt plain || exit "$?"
"$@"
RETVAL="$?"
fusermount -u plain
exit "$RETVAL"
Now, run it:
$ chmod a+x nsrun
$ unshare --keep-caps -Umc ./nsrun ls -l plain
Password:
Decrypting master key
Filesystem mounted and ready.
total 1
-rw-r--r-- 1 jgoerzen jgoerzen 8 Dec 10 06:24 test
Excellent! And our script makes sure to unmount the plaintext view before exiting. So now, I could type unshare --keep-caps -Umc ./nsrun okular plain/taxes.pdf or something similar to view a file that’s otherwise unavailable, and it will be available only to the okular process started this way (and any of its child processes)! No other process on the system can see it.
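The nsrun script unmounts only if it gets as far as the line after the command. A variation, sketched here and not part of the original script, is to run the command in a subshell with an EXIT trap so the cleanup also runs when the command fails or the subshell exits early. The function name run_with_cleanup is hypothetical.

```shell
#!/bin/bash
# run_with_cleanup CLEANUP_CMD COMMAND [ARGS...]
# Run COMMAND in a subshell, then always run CLEANUP_CMD when that
# subshell exits. The return status is that of COMMAND.
# Hypothetical helper name, shown only to illustrate the trap pattern.
run_with_cleanup() {
    local cleanup="$1"
    shift
    (
        trap "$cleanup" EXIT
        "$@"
    )
}

# How it might be used inside nsrun, after the argument check:
#   gocryptfs crypt plain || exit "$?"
#   run_with_cleanup 'fusermount -u plain' "$@"
#   exit "$?"
```

The design choice here is to keep the trap inside a subshell, so it does not replace any EXIT trap the calling script may already have.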
Simultaneous access
What if we want multiple programs to have access to the data at the same time? Note that most filesystems, including gocryptfs, don’t really like to have the same data mounted multiple times. There are a couple of options.
- We could run something like unshare --keep-caps -Umc ./nsrun bash and launch them all from that shell.
- Or, we can simultaneously enter the same namespace multiple times. Here is a second script, nsrunenter, that does this:
#!/bin/bash
# Pass the command to run in the namespace,
# and any parameters, on the command-line.
if [ -z "$1" ]; then
    echo "Syntax: $0 command [args]"
    exit 5
fi
IDENTIFIER="BLOGDEMO"
until TARGETPID=$(pgrep -u "$(id -u)" -n -f "^/usr/bin/gocryptfs.* -fsname $IDENTIFIER "); do
    echo "$IDENTIFIER not mounted; mounting."
    unshare --keep-caps -Umc /usr/bin/gocryptfs -fsname "$IDENTIFIER" crypt plain
done
echo "Entering namespace at PID $TARGETPID"
# gocryptfs likes to see at least one read before it permits writes, so do that here.
nsenter --preserve-credentials -U -m -t "$TARGETPID" ls "$(pwd)/plain" > /dev/null
exec nsenter --preserve-credentials -U -m -t "$TARGETPID" /usr/bin/env "--chdir=$(pwd)" "$@"
So this is working a bit differently. It’s going to first mount the filesystem in its own namespace, then just let it hang there.
Then, we figure out the PID of the gocryptfs command, using a (presumably-unique) identifier to differentiate it from other potential gocryptfs instances. Now, by using nsenter, we can launch a new command in the namespace we created earlier, which is the only way we can access the files.
In this case, we keep reusing the existing mount until we’re done with it. Note that it will be necessary to kill the gocryptfs process in the end when we’re done, since nothing here is going to unmount it.
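As an alternative to killing the process, the same pgrep and nsenter calls could be reused to unmount from inside the namespace. This is an untested sketch under the assumption that fusermount -u works when entered this way, as it did in the interactive demonstration; the function names find_mount_pid and unmount_in_ns are hypothetical, and it must be run from the directory that contains plain.

```shell
#!/bin/bash
IDENTIFIER="BLOGDEMO"   # must match the -fsname used when mounting

# find_mount_pid ID
# Print the PID of the newest gocryptfs process started with -fsname ID.
# Uses the same pgrep pattern as the script above.
find_mount_pid() {
    pgrep -u "$(id -u)" -n -f "^/usr/bin/gocryptfs.* -fsname $1 "
}

# unmount_in_ns ID
# Enter the namespace of that gocryptfs process and unmount ./plain.
# Assumption: fusermount -u is permitted from within the namespace.
unmount_in_ns() {
    local pid
    pid="$(find_mount_pid "$1")" || { echo "$1 not mounted" >&2; return 1; }
    nsenter --preserve-credentials -U -m -t "$pid" fusermount -u "$(pwd)/plain"
}

# Usage: unmount_in_ns "$IDENTIFIER"
```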
Watch how it works:
$ ./nsrunenter bash
BLOGDEMO not mounted; mounting.
Password:
Decrypting master key
Filesystem mounted and ready.
Entering namespace at PID 919929
$ cat plain/test
Testing
$ exit
exit
$ cat plain/test
cat: plain/test: No such file or directory
$ ./nsrunenter bash
Entering namespace at PID 919929
$ cat plain/test
Testing
$ exit
exit
$ cat plain/test
cat: plain/test: No such file or directory
So here, the first time we called our new script, it mounted the gocryptfs filesystem, and then ran bash inside the namespace we created for it. After exiting from that namespace, of course we couldn’t see our test file.
The second time we called the script, it detected the existing namespace and joined it. Again, the command worked.
A word on security
You might be thinking, “well, if I can just nsenter the namespace, what good is this?” One of the principles of computer security is defense in depth; that is, multiple lines of defense.
The premise of this whole post is to add protections in case malicious code is executed in your account. That is, one of your lines of defense has already failed. Here’s what we’re adding:
- Protection of data at rest via encryption. Or, if the underlying filesystem was already encrypted, a second key is introduced such that an attacker would have to know both to decrypt the data at rest.
- Having the relevant data only mounted at times when it’s needed.
- Keeping it invisible from other processes, unless those processes specifically know about the scheme in use and which process to nsenter.
You could bolster this further by running the unshare and nsenter under sudo, so that the local user wouldn’t be able to enter the namespace without authenticating. This has some tradeoffs (greater complexity for sure), but it raises the bar: an attacker would have to fool the user into authenticating to sudo.
So, while this approach isn’t absolutely perfect, it is another line of defense to add to the ones you should already have.
More things you can do
- You can set up the namespaces with a different user (though note that just sudo unshare will expose all of root’s files - probably not what you want! Do this carefully!)
- You can have a graphical password prompt, for instance with gocryptfs -extpass ssh-askpass