Mutexes in Linux shells
Linux shells do not have built-in mutexes. However, it is possible to simulate them using the following approaches:
1. Implementation With mkdir
The mkdir command in Linux is atomic on POSIX-compliant local file systems.
This means that creation of a new directory either completes successfully or fails, and there is no intermediate state.
If two or more processes attempt to create the same directory simultaneously, only one will succeed, and the rest will receive an error.
This makes the mkdir command a suitable and portable way to create locks.
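The "only one will succeed" property can be observed with a small experiment. The sketch below is illustrative only: the scratch directory, the `demo.lock` name, and the number of jobs are assumptions made for the demo, not part of any real setup.

```shell
# Hypothetical demo: several background jobs race to create one lock directory.
work_dir=$(mktemp -d)            # scratch area so the demo leaves nothing behind
lock_dir="$work_dir/demo.lock"   # illustrative lock directory name

for i in 1 2 3 4 5 6 7 8; do
    (
        # mkdir fails when the directory already exists,
        # so only the job that created it records itself as a winner
        if mkdir "$lock_dir" 2>/dev/null; then
            echo "job $i" >> "$work_dir/winners"
        fi
    ) &
done
wait   # let every contender finish

winners=$(wc -l < "$work_dir/winners")
echo "jobs that acquired the lock: $winners"
rm -rf "$work_dir"
```

Because the winning job never removes the directory in this demo, every other job should see `mkdir` fail, whatever order the jobs happen to run in.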
Before entering the critical section, the process attempts to create the lock directory. No separate existence check is needed; checking first and creating afterwards would only leave a window in which another process could take the lock.
If the directory is created successfully, then the lock is obtained and the process can enter the critical section. If mkdir fails because the directory already exists, then another process is entering or has entered the critical section. The process can then choose to wait or abort execution.
Once the process is done, it can delete the lock directory to release the lock. A trap on EXIT can also be set so that the directory is removed whenever the script exits, whether normally or because of an error.
    LOCK_DIR="/tmp/lock_file"

    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        # resource is already locked
        # exit or wait for it to be unlocked
        echo "Another process has the lock."
        exit 1
    fi

    trap 'rmdir "$LOCK_DIR"' EXIT
    # lock acquired on resource
    # enter your critical section here...

mkdir caveats
- The biggest problem with using the mkdir command is stale locks. Stale locks result from the lock directory not being successfully deleted. This can happen in the following situations:
  - When the script is killed by a SIGKILL signal (signal 9) before the lock directory is cleaned up. This signal cannot be trapped by a shell script.
  - When the script's execution is abruptly stopped due to a system crash or a power loss.

  When a stale lock results, subsequent executions fail to create the lock directory.
- On network file systems, the atomicity of the mkdir command cannot be guaranteed due to issues such as caching.
- Use of the -p flag changes the behaviour of the command. The mkdir approach relies on a non-zero exit code if the lock directory exists and a zero exit code if it is created successfully. With the -p flag, a non-zero exit code is not returned when the lock directory exists, which breaks the mutex implementation.
- This approach also lacks lock management features such as lock timeouts, lock information, and lock cleanup. To use this approach, these features have to be manually implemented.
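As a rough idea of what implementing these features by hand could look like, here is a minimal sketch. The function names `acquire_lock` and `release_lock`, the `pid` file inside the lock directory, and the scratch location are all assumptions made for illustration. The sketch covers a timeout, basic lock information, and cleanup; it deliberately does not try to remove stale locks automatically, because doing that safely is harder than it looks.

```shell
# Hypothetical helpers; in real use every script must agree on one fixed LOCK_DIR.
demo_root=$(mktemp -d)
LOCK_DIR="$demo_root/resource.lock"

# Try to take the lock, retrying once per second for up to $1 seconds (default 5).
acquire_lock() {
    local timeout=${1:-5} waited=0 owner
    until mkdir "$LOCK_DIR" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            # lock information: report the recorded owner, if any
            owner=$(cat "$LOCK_DIR/pid" 2>/dev/null)
            echo "lock still held by PID ${owner:-unknown} after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "$$" > "$LOCK_DIR/pid"   # record who holds the lock
}

# Lock cleanup: remove the owner record and the lock directory.
release_lock() {
    rm -f "$LOCK_DIR/pid"
    rmdir "$LOCK_DIR" 2>/dev/null
}

# Demo: the first attempt succeeds, a second attempt times out while the lock is held.
if acquire_lock 2; then first=acquired; else first=timed_out; fi
if acquire_lock 1; then second=acquired; else second=timed_out; fi
release_lock
if [ -d "$LOCK_DIR" ]; then released=no; else released=yes; fi
rmdir "$demo_root"
echo "first=$first second=$second released=$released"
```

In a real script, a `trap release_lock EXIT` set right after a successful `acquire_lock` would cover the usual exit paths, though not SIGKILL or a crash, as noted in the caveats above.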
2. Implementation With flock
flock is a Linux utility used to manage advisory locks on open files from within a shell script or the command line.
The utility helps to achieve synchronized access to a file, protecting its contents from accidental corruption.
It relies on the underlying flock() system call, which associates the lock with the file descriptor a process holds for an open file.
The lock is an advisory lock and thus it is not enforced by the kernel.
This means that a process can still have access to the file even if it is locked.
All processes that access your resource must use flock and respect the lock for it to work.
It can be used to obtain shared or exclusive access to a file, wait on a lock on a file, or release a lock on a file. The default behavior of the utility is an exclusive lock. Any locks not released are automatically released when the process ends.
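The advisory nature of the lock can be seen in a short experiment. This is a sketch under assumptions: the temporary file stands in for a real data file, and the half-second pause is only there to let the background job take the lock first.

```shell
data_file=$(mktemp)   # hypothetical data file for the demo

# Hold an exclusive lock on the file for two seconds in the background.
flock "$data_file" -c 'sleep 2' &
sleep 0.5             # give the background job time to take the lock

# A writer that ignores flock is not stopped by the lock.
echo "written without the lock" >> "$data_file"
unlocked_write=$(cat "$data_file")

# A cooperating writer that asks for the lock with -n is refused while it is held.
if flock -n "$data_file" -c 'true'; then
    cooperative=acquired
else
    cooperative=refused
fi

wait                  # let the background lock holder finish
rm -f "$data_file"
echo "unlocked write: $unlocked_write; cooperating writer: $cooperative"
```

The plain `echo` goes through even though another process holds the lock, which is why every writer has to go through flock for the mutex to mean anything.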
NOTE: The lock file can be either a separate lock file or the actual data file to be protected by the mutex.
A personal recommendation is to use the data file as the lock file unless your requirements need otherwise, such as when protecting a directory. This way, you leave a smaller file system footprint and also simplify your script logic.
flock flags
| flag | description |
|---|---|
| -w seconds | If the lock is not immediately available, wait up to the given number of seconds for it to become available. If the timeout elapses and the lock is still not available, flock fails and exits. |
| -s | Obtain a shared lock used to read the file contents safely. Multiple processes can obtain a shared lock on the same file. |
| -x | Obtain an exclusive lock used to modify the file contents safely. Prevents any other process from acquiring a shared or exclusive lock. |
| -u | Removes an existing lock held by the current process. |
| -n | If the lock cannot be acquired, flock fails immediately without blocking. |
| -o | Close the file descriptor on which the lock is held before executing the command. Useful when the command may fork a child process that should not inherit the file descriptor and the lock. |
| -c cmd | Execute the command string cmd with the lock held. |
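To see how a few of these flags interact, the following sketch holds a shared lock in the background and then tries other lock requests against it. The temporary lock file and the timings are assumptions chosen for the demo.

```shell
lock_file=$(mktemp)   # hypothetical lock file for the demo

# Hold a shared lock for two seconds in the background.
flock -s "$lock_file" -c 'sleep 2' &
sleep 0.5             # give the background job time to take the lock

# -s -n: a second shared lock can be granted while the first is held.
flock -s -n "$lock_file" -c 'true' && shared_rc=0 || shared_rc=$?

# -x -n: an exclusive lock is refused immediately while a shared lock is held.
flock -x -n "$lock_file" -c 'true' && exclusive_rc=0 || exclusive_rc=$?

# -x -w 5: wait up to five seconds; the background holder exits well before that.
flock -x -w 5 "$lock_file" -c 'true' && wait_rc=0 || wait_rc=$?

wait
rm -f "$lock_file"
echo "shared=$shared_rc exclusive=$exclusive_rc waited=$wait_rc"
```

The shared request and the waiting exclusive request are expected to exit with 0, while the non-blocking exclusive request is expected to exit with a non-zero code.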
Simple flock usage
    # Exclusively locks the file data.txt for writing
    flock data.txt -c 'DAT=$(cat data.txt); echo $((DAT+1)) > data.txt; cat data.txt'

    # Obtain a shared lock on data.txt for reading
    flock -s data.txt -c 'cat data.txt'

flock with file descriptors
A file descriptor is usually a non-negative integer used to uniquely identify a system resource in the context of a running process. The system resource can be a file, directory, pipe, device, or network socket.
You may already be familiar with the 3 default file descriptors that a process starts with:
| FD No. | Abbreviation | Name | Default resource |
|---|---|---|---|
| 0 | stdin | Standard Input | Keyboard |
| 1 | stdout | Standard Output | Console Screen |
| 2 | stderr | Standard Error | Console Screen |
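A script can also open descriptors of its own. As a small illustration, the sketch below opens a temporary file on descriptor 9, writes to it, and closes it again; the descriptor number and the file are arbitrary choices for the demo.

```shell
log_file=$(mktemp)       # hypothetical file for the demo

exec 9>"$log_file"       # open the file for writing as descriptor 9
echo "first line"  >&9   # write through the descriptor
echo "second line" >&9   # the descriptor keeps its position between writes
exec 9>&-                # close descriptor 9

contents=$(cat "$log_file")
rm -f "$log_file"
echo "$contents"
```

Note that there is no space between the descriptor number and the redirection operator; `exec 9 > file` would be read as running a command named `9`.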
flock can be used with file descriptors as well.
    # In this example the data file is also the lock file

    # open the data file for reading and writing (without truncating it)
    # and assign it the file descriptor number 300
    DATA_FILE='data.txt'
    exec 300<>"$DATA_FILE"

    # obtain an exclusive lock on the file for writing. We use -n to override the default blocking behaviour
    if ! flock -n 300; then
        echo "The DATA FILE is already locked. Exiting."
        exit 1
    fi

    # The exclusive lock was obtained so we can enter the critical section
    echo "In critical section"

    # ---------------------
    # if the script terminates, the file descriptor will be closed and the lock released automatically.
    # ---------------------

    # if the script continues with further processing after the critical section,
    # you can manually unlock the file or close the FD as follows:

    ### This only releases the lock on the file. The file descriptor is still open and can be used later.
    flock -u 300

    ### This closes the file descriptor, in which case the lock is automatically released.
    exec 300>&-

Example Use for Mutexes
We’ll create a simple script to demonstrate mutexes in use in Linux.
Scenario
Say we have a file X containing a list of directories. We also have a worker script that processes the directories in our file X. The worker script picks a directory in the file X and marks it as pending, P. When done processing it, it marks the directory as complete, C. We can assume it takes 10s to process a single directory.
Solution
If we have 10 directories in the file X, it will take 100s for all directories to be processed.
We can launch multiple instances of our worker script to speed this up. This will, however, introduce a concurrency problem. Multiple workers might pick the same unprocessed directory before it is marked as pending, P. This will potentially corrupt the data in our directory. This is where the mutex comes in.
We can lock the file X exclusively when picking a directory and unlock it once we’ve updated the status. This will serialize reads and writes on our file X, ensuring our workers are synchronized.
Solution implementation
The data file, dirs.txt:

    /data/dir_0
    /data/dir_1
    /data/dir_2
    /data/dir_3
    /data/dir_4

The worker script:

    #!/bin/bash

    DATA_FILE="dirs.txt"
    DATA_FILE_FD=200

    # open the data file for reading and writing and assign it a file descriptor
    eval "exec $DATA_FILE_FD<>\"$DATA_FILE\""

    # function to attempt to lock the data file up to 3 times
    function Lock_Data_File {
        local max_attempts=3
        local attempt=1

        while [ $attempt -le $max_attempts ]; do
            # wait up to 2 seconds for the lock to become available
            if flock -w 2 $DATA_FILE_FD; then
                return 0
            fi
            ((attempt++))
        done

        echo "Failed to lock data file after $max_attempts attempts..!"
        return 1
    }

    # function to apply a sed expression to the data file in place.
    # sed -i is avoided because it replaces the file with a new one,
    # which would no longer be the file the lock is held on.
    function Update_Data_File {
        local updated
        updated=$(sed "$1" "$DATA_FILE") && printf '%s\n' "$updated" > "$DATA_FILE"
    }

    # get the initial directory to process
    if ! Lock_Data_File; then
        exit 1
    fi

    LINE_INDEX_TEXT=$(awk '!/.+:.+/ {printf "%d:%s\n", NR, $0; exit}' "$DATA_FILE")

    while true; do
        # if there is no more data to process, exit; this releases the lock and the file descriptor as well
        if [ -z "$LINE_INDEX_TEXT" ]; then
            echo "No more directories to process"
            exit 0
        fi

        LINE_INDEX=$(echo "$LINE_INDEX_TEXT" | cut -d: -f1)
        LINE_TEXT=$(echo "$LINE_INDEX_TEXT" | cut -d: -f2)

        # mark the dir as pending and unlock the data file
        Update_Data_File "${LINE_INDEX}s/^/P:/"
        flock -u $DATA_FILE_FD

        # process the dir; sleep for 10s to simulate a long process
        echo "Processing $LINE_TEXT"
        sleep 10
        echo "Done processing $LINE_TEXT"

        # lock the data file, mark the dir as complete and get the next directory to process
        if ! Lock_Data_File; then
            exit 1
        fi

        Update_Data_File "${LINE_INDEX}s/^P:/C:/"
        LINE_INDEX_TEXT=$(awk '!/.+:.+/ {printf "%d:%s\n", NR, $0; exit}' "$DATA_FILE")
    done

WARNING: This is a sample script you can use as a reference or as a base for your worker logic, not a final script. There are unhandled errors, such as failures when updating the data file, which you may have to address as well.
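One way to gain some confidence that concurrent workers never pick the same directory is a small stress test of the claiming step alone. The sketch below is not the worker script itself: `claim_next`, `worker`, the temporary queue file, and descriptor 9 are hypothetical names and choices made for the demo, and the 10s processing step is left out so the test finishes quickly.

```shell
queue=$(mktemp)    # hypothetical stand-in for the data file
claims=$(mktemp)   # records which worker claimed which directory
printf '/data/dir_%s\n' 0 1 2 3 4 > "$queue"

# Claim the first unmarked line under an exclusive lock and print it.
claim_next() {
    (
        flock -x 9 || exit 1
        n=$(awk '!/^[PC]:/ {print NR; exit}' "$queue")
        [ -n "$n" ] || exit 1                 # nothing left to claim
        dir=$(sed -n "${n}p" "$queue")
        # rewrite the file in place (no sed -i) so the locked file stays the same file
        updated=$(sed "${n}s/^/P:/" "$queue") && printf '%s\n' "$updated" > "$queue"
        printf '%s\n' "$dir"
    ) 9<>"$queue"   # the lock is released when the subshell closes descriptor 9
}

# Each worker keeps claiming directories until none are left.
worker() {
    while dir=$(claim_next); do
        echo "$1 $dir" >> "$claims"
    done
}

for w in 1 2 3; do
    worker "$w" &
done
wait

claimed=$(wc -l < "$claims")
unique=$(cut -d' ' -f2 "$claims" | sort -u | wc -l)
rm -f "$queue" "$claims"
echo "claimed=$claimed unique=$unique"
```

If the locking works as intended, both counts should equal the number of directories in the queue, meaning every directory was claimed once and only once.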
Summary
- Linux shells don't provide built-in mutex implementations.
- It is possible to simulate mutex behavior using file operations, specifically the mkdir and flock commands.
- The mkdir approach can leave stale locks behind and is not guaranteed to be atomic on all file systems, such as network file systems.
- The flock command is more robust, since its locks are released automatically when the process ends.
- The flock command and its associated flock() system call use advisory locks. These are not enforced by the kernel.
- With the flock command, the worker processes must respect the advisory locks for the mutex behavior to work.
Conclusion
Unless you have a good reason not to, prefer the flock command for mutex implementations in Linux shell scripts.
It is more robust than the mkdir approach, because its locks are released automatically when the process ends.