Linux – With Bash variables, what is the difference between $thevar and "$thevar"? (Specific odd behavior)

bash, command-line, linux

General Question:

In Bash, I know that using the variable myvar can be done in two ways:

# Define a variable:
bash$ myvar="two words"

# Method one to dereference:
bash$ echo $myvar
two words

# Method two to dereference:
bash$ echo "$myvar"
two words

In the case above, the output is identical. This is because echo joins its arguments with single spaces, so two separate words print the same as one two-word argument. With other Unix utilities, whether the words are grouped together with double quotes makes a huge difference:

bash$ myfile="Cool Song.mp3"
bash$ rm "$myfile"            # Deletes "Cool Song.mp3".
bash$ rm $myfile              # Tries to delete "Cool" and "Song.mp3".

I am wondering what the deeper significance of this difference is. Most importantly, how can I view exactly what will be passed to the command, so that I can see if it is quoted properly?

Specific Odd Example:

I will just write the code with the observed behavior:

bash$ mydate="--date=format:\"%Y-%m-%d T%H\""
bash$ git log "$mydate"    # This works great.
bash$ git log $mydate
fatal: ambiguous argument 'T%H"': unknown revision or path not in the working tree.

Why do I need the double-quotes? What exactly is git-log seeing after the variable is dereferenced without double-quotes?

But now see this:

bash$ nospace="--date=format:\"%Y-%m-%d\""
bash$ git log $nospace        # Now THIS works great.
bash$ git log "$nospace"      # This kind of works, here is a snippet:

# From git-log output:
Date:   "2018-04-12"

Yuck, why does the printed output have double quotes in it now? It looks like when the double quotes are unnecessary they do not get stripped; they are treated as literal quote characters if and only if they were not needed.

What is Git being passed as arguments? I wish I knew how to find out.

To make matters more complex, I wrote a Python script using argparse that just prints all of the arguments (as Bash interpreted them, so with double-quote literals where Bash thinks they are part of the argument, and with words grouped or not grouped as Bash sees fit), and the Python argparse script behaves very rationally. Sadly, I think argparse may be silently fixing a known problem with Bash and thus obscuring the messed up stuff that Bash is passing to it. That's just a guess, I have no idea. Maybe git-log is secretly screwing up what Bash is passing to it.

Or maybe I just don't know what is going on at all.

Thanks.

Edit: Let me say this now, before there are any answers: I know that I can maybe use single quotes around the whole thing and then not escape the double quotes. This actually does work somewhat better for my initial problem using git-log, but I tested it in some other contexts and it is just about equally unpredictable and unreliable. Something weird is afoot with quoting inside variables. I'm not even going to post all the weird things that happened with single quotes.

Edit 2 – This also doesn't work: I just had this wonderful idea, but it doesn't work at all:

bash$ mydate="--date=format:%Y-%m-%d\ T%H"
bash$ git log "$mydate"

# Git log output has this:
Date:   2018-04-12\ T23

So it doesn't have quotes wrapping it, but it has a literal backslash character in the date string. Also, git log $mydate with no quotes errors out, with the backslash-space in the variable.

Best Answer

Different approach:

When you run git log --format="foo bar", those quotes aren't interpreted by git – they're removed by the shell (and protect the quoted text from splitting). This results in a single arg:

  • --format=foo bar

However, when unquoted variables are expanded, the results go through word-splitting, but not through unquoting. So if your variable contains --format="foo bar", it is expanded into these args:

  • --format="foo
  • bar"

This can be verified using:

  • printf '%s\n' $variable

...as well as any simple script which prints its received arguments.

  • #!/usr/bin/env perl
    # Print each argument with its 1-based position.
    for my $i (0..$#ARGV) {
        print(($i+1) . " = " . $ARGV[$i] . "\n");
    }
    
  • #!/usr/bin/env python3
    import sys
    # Skip sys.argv[0] (the script name) and number the arguments from 1.
    for i, arg in enumerate(sys.argv[1:], start=1):
        print(i, "=", arg)
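
The same check can be done without a separate script. The following is a minimal sketch, assuming bash with the default IFS; show_args is a hypothetical helper name, and the value of mydate is the one from the question, written with single quotes instead of backslash escapes:

```shell
#!/usr/bin/env bash

# show_args (hypothetical helper): print each received argument
# on its own line, wrapped in angle brackets.
show_args() {
    printf '<%s>\n' "$@"
}

# Same contents as the question's mydate variable.
mydate='--date=format:"%Y-%m-%d T%H"'

# Quoted expansion: exactly one argument. The inner double quotes
# are ordinary characters inside the value, so they are passed along.
show_args "$mydate"
# <--date=format:"%Y-%m-%d T%H">

# Unquoted expansion: the value is split on whitespace into two
# arguments, and the double quotes are still literal characters.
show_args $mydate
# <--date=format:"%Y-%m-%d>
# <T%H">
```

The second argument of the unquoted case, T%H", is the same string that git reports as an ambiguous argument in the question.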
    

If you always have bash available, the preferred workaround is to use array variables:

myvar=( --format="foo bar" )

With this, the usual parsing is done during assignment, not during expansion. You use this syntax to expand the variable's contents, each element getting its own arg:

git log "${myvar[@]}"
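
To see what the array hands to a command, here is a small sketch under the same assumptions (bash, default IFS). show_args is a hypothetical helper, and the second element is an illustrative addition built from the question's date format:

```shell
#!/usr/bin/env bash

# show_args (hypothetical helper): print each argument in angle brackets.
show_args() { printf '<%s>\n' "$@"; }

# Quotes are processed once, at assignment time, and then removed.
myvar=( --format="foo bar" --date=format:"%Y-%m-%d T%H" )

echo "${#myvar[@]} elements"
# 2 elements

# Quoted [@] expansion: one argument per element, spaces preserved.
show_args "${myvar[@]}"
# <--format=foo bar>
# <--date=format:%Y-%m-%d T%H>

# Without the outer double quotes, each element is word-split again.
show_args ${myvar[@]}
# <--format=foo>
# <bar>
# <--date=format:%Y-%m-%d>
# <T%H>
```

The last call shows that the double quotes around "${myvar[@]}" are still required; the array only helps when it is expanded that way.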