reported nasty bash bug -- arithmetic problem, incorrect return code with ((var++)) when var is 0 before operation

Configuration Information [Automatically generated, do not change]: Machine: x86_64 OS: linux-gnu Compiler: gcc Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../. -I.././include -I.././lib -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wall uname output: Linux andrewm-Satellite-P50-A 4.4.0-72-generic #93-Ubuntu SMP Fri Mar 31 14:07:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux Machine Type: x86_64-pc-linux-gnu Bash Version: 4.3 Patch Level: 42 Release Status: release Description: The return code from ((i++)) operation is different when i has an initial value of 0. This problem was discovered when running run a bash script with "set -ue" and having "((i++))" as the last command in a loop. Repeat-By: The following script will quit with an error after the first iteration due to the return code errantly being 1. #!/bin/bash set -eu i=0 for a in a b c do echo "${a}" ((i++)) done NB: If i starts as 1, the script works as expected. Also, using ((i+=1)) works fine, but I prefer the ((i++)) format. Proof outside a loop: $ c=0;((c++));echo $? 1 $ c=1;((c++));echo $? 0 Only the above was reported, but this also yeilds more proof: $ for c in {-3..3};do ((c++));echo $? "${c}";done 0 -2 0 -1 0 0 1 1 0 2 0 3 0 4 Cheers AndrewM

On 23 April 2017 at 03:59, Andrew McGlashan via luv-main <luv-main@luv.asn.au> wrote:
The return code from ((i++)) operation is different when i has an initial value of 0.
Well, that is by consistent design, so I wouldn't consider that a bug, myself. Bash provides pre- and post- increment and decrement operators, and you seem to have chosen the wrong one. $ i=0; ((i++)); echo $? 1 $ i=0; ((++i)); echo $? 0 Search 'man bash' for the string "pre-increment".

On 23/04/17 10:32, David via luv-main wrote:
On 23 April 2017 at 03:59, Andrew McGlashan via luv-main <luv-main@luv.asn.au> wrote:
The return code from ((i++)) operation is different when i has an initial value of 0.
Well, that is by consistent design, so I wouldn't consider that a bug, myself.
Bash provides pre- and post- increment and decrement operators, and you seem to have chosen the wrong one.
$ i=0; ((i++)); echo $? 1 $ i=0; ((++i)); echo $? 0
Search 'man bash' for the string "pre-increment".
Pre or post makes no real difference. The problem is that "expression", in this case "i" ... causes the return value and not the operation of incrementing the expression. Using ((--i)) gives the problem when i was 1 before the increment and using ((i++)) gives the problem when i was 0 before the increment. Anyway, I got the following response to my report via bashbu, to which I still think is questionable. <quote> On Sat, Apr 22, 2017 at 12:49 PM, Andrew McGlashan <andrew.mcglashan@affinityvision.com.au> wrote: [...]
The return code from ((i++)) operation is different when i has an initial value of 0.
This is not a bug. Please read the Bash reference manual section on conditional constructs: https://www.gnu.org/software/bash/manual/bash.html#index-commands_002c-condi... | The arithmetic expression is evaluated according to the rules described below (see Shell Arithmetic). If the value of the expression is non-zero, the return status is 0; otherwise the return status is 1. This is exactly equivalent to | | let "expression" This is also described in other sources: 1. http://wiki.bash-hackers.org/commands/builtin/let#description 2. http://mywiki.wooledge.org/ArithmeticExpression#Arithmetic_Commands Also see the 1st example in http://mywiki.wooledge.org/BashFAQ/105 ("Why doesn't set -e (or set -o errexit, or trap ERR) do what I expected?"), which mentions this pitfall. You can also find extensive discussions in the bug-bash archives on why you shouldn't be using set -e, regardless of how many "sources" on the web claim that it helps write "robust" or "correct" shell scripts. </quote> Kind Regards AndrewM

Andrew McGlashan via luv-main <luv-main@luv.asn.au> writes:
You can also find extensive discussions in the bug-bash archives on why you shouldn't be using set -e, regardless of how many "sources" on the web claim that it helps write "robust" or "correct" shell scripts.
If this sort of thing matters to you, it is possibly time for a rewrite in a "real" programming language, such as Python, Perl, Ruby, JavaScript, etc. bash/sh scripts are good for very simple things - e.g. running an identical sequence of commands exactly the same every time. -e even works very well in such cases - I prefer using -e when possible - if one of the commands in the sequence fails, I don't want the sequence to continue. However, things tend to get very arcade very quickly with anything slightly more complicated. As this example clearly shows, with return codes sometimes being used to return non-error information (e.g. test), and sometimes being used to convey error/exception information. -e cannot tell the difference. Or (another common problem I have had) if you want to deal with optional parameters that might contain spaces. Yes, bash arrays will help, but I dislike bash arrays. -- Brian May <brian@linuxpenguins.xyz> https://linuxpenguins.xyz/brian/

Quoting Andrew McGlashan (andrew.mcglashan@affinityvision.com.au):
Also see the 1st example in http://mywiki.wooledge.org/BashFAQ/105 ("Why doesn't set -e (or set -o errexit, or trap ERR) do what I expected?"), which mentions this pitfall.
This is yet another subtle shell pitfall, and I heartily recommend pre-emptively reading the Bash FAQ to learn from other people's errors. ;-> (It's slightly cringe-worthy that Debian's /etc/cron.weekly/sysklogd logfile rotation script has 'set -e' right up near the top, right under some wiser 'set -x' lines to verify that needed utilities exist and are executable. Fortunately, it doesn't attempt anything complex.) -- Rick Moen "Use 'bologna' for the meat, 'baloney' for rick@linuxmafia.com anything creationists say." -- @FakeAPStylebook McQ! (4x80)

On 23 April 2017 at 03:59, Andrew McGlashan via luv-main <luv-main@luv.asn.au> wrote:
This problem was discovered when running run a bash script with "set -ue" and having "((i++))" as the last command in a loop.
The following script will quit with an error after the first iteration due to the return code errantly being 1.
#!/bin/bash set -eu i=0 for a in a b c do echo "${a}" ((i++)) done
The "bug" you have discovered is expected behaviour for someone more experienced in this aspect of bash. On 23 April 2017 at 13:17, Andrew McGlashan via luv-main <luv-main@luv.asn.au> wrote:
Pre or post makes no real difference.
But, doesn't the below script exhibit exactly the behaviour you wanted? set -eu i=0 for a in a b c do echo "${a}" ((++i)) done Also, the advice you have been given is good, dont use 'set -e' (for the reasons given in the other 2 replies, it causes trouble and is unecessary). It takes perseverance to master bash scripting because there are many unexpected non-obvious pitfalls and much of the information found by web searching encourages bad practice. Lurk on irc.freenode.net #bash for a while if you want to get better at it.

If you insist on using 'set -e', you could do this to disable exitstatus checking on arithmetic contexts: set -eu i=0 for a in a b c do echo "${a}" ((i++)) || true done

Hi, On 23/04/17 14:42, David via luv-main wrote:
If you insist on using 'set -e', you could do this to disable exitstatus checking on arithmetic contexts:
set -eu i=0 for a in a b c do echo "${a}" ((i++)) || true done
Yes, that would be the cleanest way to solve this. Thanks. I solved it a different way though: for ((i=0;i<${#arrayx[@]};i++)) do .... done The reason why I need the index is because there is a different array that I rely upon that has corresponding (1 to 1) values relative to the first array. I concede that there are multiple "correct" answers, but can only agree to disagree on ones that I believe are at odds with being sane. My view is that: "I am asking bash to increment a number, ANY number" The number is incremented successfully (in every case), but the value of $? (return status) is different depending on the value of the number given for bash to increment. That's where I see it as wrong. The only situation where I think it should return a failure status is when incrementing the number by 1 causes the result to be a negative number due to wrap around of the value in the variable such as this one: https://en.wikipedia.org/wiki/9223372036854775807 As I understand it, bash should only ever give a return status that indicates failure if it was not able to increment the given variable; there is no valid reason to do otherwise. Cheers A.

On 23 April 2017 at 15:35, Andrew McGlashan via luv-main <luv-main@luv.asn.au> wrote:
As I understand it, bash should only ever give a return status that indicates failure if it was not able to increment the given variable; there is no valid reason to do otherwise.
Hi One very valid reason is so that constructs like this are possible: if (( variable_or_expression )) ; then do this else do that fi while (( variable_or_expression )) ; do something done until (( variable_or_expression )) ; do something done It all works as documented in 'man bash' :) You have encounterd not so much a bug as a conflict between two incompatible features that both detect nonzero exitstatus. 'set -e' is a well intentioned feature that turns out to be incompatible with useful applications of exitstatus. 'set -e' is less useful than the above, and has other deficiencies. 'set -e' should only be used in very dumb scripts, and experts recommend is best avoided altogether. Read 'man bash' description of 'set -e' to see how fiddly it is. Effective error handling requires more effort that just lobbing 'set -e' at the top of the script and expecting everything to just work by itself. If you really must, consider using 'set -e' locally around a statement: set -e some_statement set +e But it is far preferable and not particularly difficult to do this instead: some_statement || error_handling_goes_here Don't take my word for it, read the references you've been given :) Also, if you are using two indexed arrays that share a common index, it might be worth considering using an associative array instead, if you only need a unidirectional mapping?

On 23 April 2017 at 15:35, Andrew McGlashan via luv-main <luv-main@luv.asn.au> wrote:
Hi,
On 23/04/17 14:42, David via luv-main wrote:
If you insist on using 'set -e', you could do this to disable exitstatus checking on arithmetic contexts:
set -eu i=0 for a in a b c do echo "${a}" ((i++)) || true done
Yes, that would be the cleanest way to solve this. Thanks.
I disagree, I think it is a bit hackish, but it is a universal solution to the collision of trying to use 'set -e' together with unknown arithmetic results that might include zero, so I thought it was worth adding to the discussion. But in your example use-case, the arithmetic results are entirely predictable. In the message I sent before the one above, I gave what seemed to me the cleanest way to do what you originally asked, using pre-increment as I suggested in my first reply: On 23 April 2017 at 14:14, David <bouncingcats@gmail.com> wrote:
But, doesn't the below script exhibit exactly the behaviour you wanted?
set -eu i=0 for a in a b c do echo "${a}" ((++i)) done
I am writing again because I can't see anywhere you have responded to this suggestion. I get the impression you insist on using the post-increment "i++". Is that because it is more familiar from other languages? The problem is that in bash, this means "check the exitstatus and then increment i". That's not what you need, so why do it? If you use the pre-increment operator, "++i", it increments i and then checks the exitstatus. Which is exactly what you need and want in your original script. It is the simple and appropriate solution. $ i=0; ((i++)); echo "i=$i exitstatus=$?" i=1 exitstatus=1 $ i=0; ((++i)); echo "i=$i exitstatus=$?" i=1 exitstatus=0 Unless I am missing something? :) The 'for' loop solution you showed is fine if you want to do it that way, but it does not solve the original issue you put, which did not use any array. It does let you use "i++" though :)

Hi, On 24/04/17 10:24, David via luv-main wrote:
On 23 April 2017 at 15:35, Andrew McGlashan via luv-main <luv-main@luv.asn.au> wrote:
Hi,
On 23/04/17 14:42, David via luv-main wrote:
If you insist on using 'set -e', you could do this to disable exitstatus checking on arithmetic contexts:
set -eu i=0 for a in a b c do echo "${a}" ((i++)) || true done
Yes, that would be the cleanest way to solve this. Thanks.
I disagree, I think it is a bit hackish, but it is a universal solution to the collision of trying to use 'set -e' together with unknown arithmetic results that might include zero, so I thought it was worth adding to the discussion.
But in your example use-case, the arithmetic results are entirely predictable.
In the message I sent before the one above, I gave what seemed to me the cleanest way to do what you originally asked, using pre-increment as I suggested in my first reply:
Yes, I did see that and I tried it out; pre-increment gives problems just the same as post-increment. So that wasn't a solution. There is obvious debate on both sides and I can understand the various arguments for both points -- using set -x or not AND using the ((i++)) construct or not, To me, arithmetic math with the use of the double round brackets is, due to the other use of the construct, a mistake to have been implemented that way. Using "((i++)) || true" allows me to say, I don't care if the arithmetic result is zero, I am handling the condition for loop exit outside of that result and the actual return from the ((i++)) is totally irrelevant. The other alternative that was good was this one: for ((i=0; i<${#array[@]}, i++)) do ... done Anyway, I am using the C construct with the for loop now as it is slightly better IMHO. But I would have no problem using the other solution of: ((i++)) || true - when it suits... Thanks and Cheers AndrewM

Hi, Below is the actual code that I am using FWIW, I don't claim it is perfect, but it does the job of collecting the WinSCP files of a release, as well as compiling checksum files from the release's readme file. The format of the readme file has been constant for quite some time (if not since it was first devised). The files I want haven't changed either. Of course it would be easier if they released proper md5sum, sha1sum and sha256sum files so that I didn't need to construct them. The download version is within the file name of the shell script. I could just as easily use the directory name to determine the version though. Basically I create a directory, copy in or hard link the file from "last time", adjust it if necessary (name at least) and run it. The script seems to be quite effective in getting all the files and confirming they are valid according to the checksums provided in the readme. I actually haven't used WinSCP itself for a while now as I only use Windows for very few things these days (aside from when I am working on other people's machines providing IT support services). Cheers A. $ cat get-files--5.9.5.sh #!/bin/bash set -eu get_files() { for TARGET_FILE in ${files_to_download} do while [ ! -f "${TARGET_FILE}" ] do FILE_URL="${URL_BASE}/download/${TARGET_FILE}" CURL_SAVE_FILE="${TARGET_FILE}--curl-ILs.out" curl -ILs "${FILE_URL}" | \ tr -d "\r" | \ tee "${CURL_SAVE_FILE}" FINAL_DOWNLOAD_URL=$( grep -i ^Location: "${CURL_SAVE_FILE}" | \ tail -1 | \ cut -d" " -f2 ) [[ -z "${FINAL_DOWNLOAD_URL}" ]] && \ FINAL_DOWNLOAD_URL="${FILE_URL}" echo " -------------------------------------------" echo " File URL: ${FILE_URL}" echo "Final Download URL: ${FINAL_DOWNLOAD_URL}" echo wget -c "${FINAL_DOWNLOAD_URL}" echo done done return 0 } make_checksum_files() { declare -a files_with_checksums checksums files_with_checksums=( $( grep ^W "${README_FILE}" | tr -d "\r" ) ) for sum_type in MD5 SHA-1 SHA-256 do checksums=( $( grep "${sum_type}" "${README_FILE}" | \ tr -d "\r" | \ cut -d" " -f4 ) ) case "${sum_type}" in MD5) OUT_FILE=md5sums ;; SHA-1) OUT_FILE=sha1sums ;; SHA-256) OUT_FILE=sha256sums ;; esac for ((i=0;i<${#files_with_checksums[@]};i++)) do printf "%s %s\n" \ "${checksums[${i}]}" \ "${files_with_checksums[${i}]}" done | tee "${OUT_FILE}" touch -r "${README_FILE}" "${OUT_FILE}" echo done return 0 } FN=$0 echo "${FN}" # Strip prefix up to double dash VER=${FN#*--} URL_BASE=https://winscp.net/ # Strip suffix .sh VER=${VER%%.sh} files_to_download=" WinSCP-${VER}-Automation.zip WinSCP-${VER}-Portable.zip WinSCP-${VER}-ReadMe.txt WinSCP-${VER}-Setup.exe WinSCP-${VER}-Source.zip " ( echo echo "WinSCP Website: ${URL_BASE}" echo echo "-- download version: $VER" echo get_files echo " -------------------------------------------" README_FILE=WinSCP-${VER}-ReadMe.txt [ -f "${README_FILE}" ] || { echo Missing "${README_FILE}" exit } echo cat "${README_FILE}" echo echo make_checksum_files echo echo for sum_file in md5sums sha1sums sha256sums do [ -f "${sum_file}" ] || { echo missing "${sum_file}" exit } echo checking "${sum_file}" .... ${sum_file:0:(-1)} -c "${sum_file}" echo done echo chown -R andrewm:andrewm . ls -lart ) 2>&1 | tee "$(basename "$0")".out

On 27 April 2017 at 02:45, Andrew McGlashan via luv-main <luv-main@luv.asn.au> wrote:
On 24/04/17 10:24, David via luv-main wrote:
In the message I sent before the one above, I gave what seemed to me the cleanest way to do what you originally asked, using pre-increment as I suggested in my first reply:
[...]
Yes, I did see that and I tried it out; pre-increment gives problems just the same as post-increment. So that wasn't a solution.
Thanks for the update. [...]
Using "((i++)) || true" allows me to say, I don't care if the arithmetic result is zero, I am handling the condition for loop exit outside of that result and the actual return from the ((i++)) is totally irrelevant.
Thanks for the feedback, I'm glad you found that useful.

On 23.04.17 13:17, Andrew McGlashan via luv-main wrote:
Anyway, I got the following response to my report via bashbu, to which I still think is questionable.
<quote> ... | The arithmetic expression is evaluated according to the rules described below (see Shell Arithmetic). If the value of the expression is non-zero, the return status is 0; otherwise the return status is 1. This is exactly equivalent to | | let "expression"
Andrew, that makes crystal clear that the only problem is a failure to read the manual before forming a firm opinion. Interpreting a documented feature as a "problem" isn't entirely useful. ;-) ISTM that the feature usefully returns two values simultaneously; the incremented variable, and zero detection. That could be used for exiting a simple loop, detecting convergence of an iterative function, etc. Given that any modulo 2^n binary counter can increment/decrement infinitely, merely overflowing/underflowing at 0, it can never ever "fail to increment/decrement", if implemented as a simple CPU ALU operation, or multiple thereof. The documented feature is a simple way to communicate that overflow/underflow. That's useful. ISTM, then, that it is the use of "set -e" which is the problem. It is not documented to detect errors, but merely "exits with a non-zero status". If it is needed elsewhere in the script, have you considered setting it later, or using an unset/set around the arithmetic? Erik
participants (5)
-
Andrew McGlashan
-
Brian May
-
David
-
Erik Christiansen
-
Rick Moen