Ubuntu – Somehow I created a nondeterministic sh script

bash, command line, scripts, sed, text processing

I created the following script:

cat > Top10 <<EOF
Linux Mint 17.2
Ubuntu 15.10
Debian GNU/Linux 8.2
Mageria 5
Fedora 23
openSUSE Leap 42.1
Arch Linux
CentOS 7.2-1511
PCLinuxOS 2014.12
Slackware Linux 14.1
EOF
sed -ri "s/^[^0-9]*$//" Top10
sed -r "s/(.*)([[:space:]][[:digit:]]*.*)$/\2\1/" Top10 | sed -r "s/([[:space:]])([[:digit:]])/\2/" | sed -r "s/([[:digit:]])([[:alpha:]])/\1 \2/" > Top10
sed -r -i "s/(.*)/\L\1/" Top10
sed -r -i "y/[aeiou]/[AEIOU]/" Top10
sort Top10 -g -o Top10
cat Top10

When I run it a few times, the Top10 file sometimes turns out empty and sometimes turns out the way I needed it to be. I know that the command which moves the extensions from the end to the front of a line is poorly written. I ran this script on a VMware virtual machine. Could that be the reason?

Best Answer

  • More specifically, the order in which the commands of a pipeline start running is not deterministic.

    I.e. in a pipe such as this one:

    command1 file | command2 | command3 >file

    it's not guaranteed that command1 file will read the file before the redirection in command3 >file opens (and truncates) it.

    So there is a race condition between command1 file and command3 >file: sometimes the file is first read by command1 file, giving the expected output, and sometimes the file is first truncated by command3 >file, giving an empty output.
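The race can be observed with a small loop. This is only a sketch: the scratch file, the run count, and the cat/tr pipeline are assumptions chosen for illustration, and the number of emptied runs will vary from machine to machine and from one invocation to the next (it may well be 0 or equal to the number of runs).

```shell
#!/bin/sh
# Count how often a pipeline that reads and overwrites the same file
# ends up emptying it. All names here are hypothetical demo values.
file=$(mktemp)   # scratch file standing in for Top10
runs=200
empty=0
i=0
while [ "$i" -lt "$runs" ]; do
    printf 'one\ntwo\nthree\n' > "$file"
    # cat opens "$file" for reading while, concurrently, the >"$file"
    # redirection set up for tr truncates it. Whichever happens first wins.
    cat "$file" | tr a-z A-Z > "$file"
    # -s is true only if the file exists and is not empty
    [ -s "$file" ] || empty=$((empty + 1))
    i=$((i + 1))
done
rm -f "$file"
echo "file was emptied in $empty of $runs runs"
```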

    This can be fixed by using sponge (in the moreutils package) to write the output to the file, to make sure that the output gets written to the file only after the rest of the pipe has finished executing:

    command1 file | command2 | command3 | sponge file
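If sponge is not installed, a common alternative is to write to a temporary file and replace the original only once the pipeline has finished. The following is a minimal sketch of that idea, not the answer's own method: the sample data and the sed/tr/sort commands are placeholders for command1, command2 and command3.

```shell
#!/bin/sh
# Hypothetical input file standing in for "file" in the pipeline above.
file=$(mktemp)
printf 'b 2\na 1\n' > "$file"

# The pipeline reads "$file" but writes to "$tmp", so nothing in the
# pipeline truncates the input while it is still being read.
tmp=$(mktemp)
# Note: the status tested by "if" is that of the last command (sort) only.
if sed 's/ / - /' "$file" | tr a-z A-Z | sort > "$tmp"; then
    # Replace the original only after the whole pipeline has exited.
    mv "$tmp" "$file"
else
    rm -f "$tmp"
fi

result=$(cat "$file")
rm -f "$file"
printf '%s\n' "$result"
```

Note that mv replaces the original file with the temporary one, so the result carries the temporary file's permissions. Using cat "$tmp" > "$file" followed by rm "$tmp" instead keeps the original file's permissions and inode.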