AWK BEGIN and END
AWK has several built-in variables, which AWK sets when the program runs. We have already seen the NR, $0, and RSTART variables.
$ awk 'BEGIN { print ARGC, ARGV[0], ARGV[1]}' mywords
2 awk mywords
The program prints the number of command-line arguments, followed by the first two of them. ARGC is the number of command-line arguments; in our case there are two, counting the awk command itself. ARGV is the array of command-line arguments, indexed from 0 to ARGC - 1.
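To see the whole ARGV array rather than just its first two elements, a loop from 0 to ARGC - 1 can be used. The following is a minimal sketch; list_args is a helper name introduced here for illustration, and one, two, three are placeholder arguments. Because the program consists only of a BEGIN block, AWK does not try to open the arguments as input files.

```shell
# list_args is a hypothetical helper used only for this sketch.
# It passes its arguments to awk and prints every element of ARGV.
list_args() {
    awk 'BEGIN {
        # ARGV is indexed from 0 (the awk command itself) to ARGC - 1
        for (i = 0; i < ARGC; i++)
            printf "ARGV[%d] = %s\n", i, ARGV[i]
    }' "$@"
}

list_args one two three
```

The first line shows ARGV[0], the name of the AWK interpreter, which may differ between implementations; the remaining lines show the three placeholder arguments.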
FS is the input field separator. It defaults to a single space, which AWK treats as splitting on runs of spaces and tabs. NF is the number of fields in the current input record.
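The interaction between FS and NF can be checked with a short experiment. This is a sketch only; count_fields is a name made up for this example, and it takes the separator as its single argument.

```shell
# count_fields is a hypothetical helper: it sets FS to its first
# argument and prints NF for every line read from standard input.
count_fields() {
    awk -v FS="$1" '{ print NF }'
}

# Default-style separator (a single space): "a b c" has 3 fields, "d,e" has 1.
printf 'a b c\nd,e\n' | count_fields ' '

# Comma separator: "a b c" is now 1 field, "d,e" has 2.
printf 'a b c\nd,e\n' | count_fields ','
```

The same line yields a different NF depending on FS, which is why the separator has to be set before the first record is read.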
$ cat values
2, 53, 4, 16, 4, 23, 2, 7, 88
4, 5, 16, 42, 3, 7, 8, 39, 21
23, 43, 67, 12, 11, 33, 3, 6
We have three lines of comma-separated values.
BEGIN {
    FS=","
    max = 0
    min = 10**10
    sum = 0
    avg = 0
}
{
    for (i=1; i<=NF; i++) {
        sum += $i
        if (max < $i) {
            max = $i
        }
        if (min > $i) {
            min = $i
        }
        printf("%d ", $i)
    }
}
END {
    avg = sum / NF
    printf("\n")
    printf("Min: %d, Max: %d, Sum: %d, Average: %d\n", min, max, sum, avg)
}
The program computes basic statistics from the provided values.
FS=","
The values in the file are separated by commas; therefore, we set the FS variable to the comma character.
max = 0
min = 10**10
sum = 0
avg = 0
We define initial values for the maximum, minimum, sum, and average. The starting points 0 and 10**10 assume that every input value lies between them. AWK variables are dynamic; their values are either floating-point numbers or strings or both, depending upon how they are used.
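The dynamic typing described above can be observed directly: the same value behaves as a number or as a string depending on the operation applied to it. The following is an illustrative sketch; type_demo and the variable names are invented for this example.

```shell
# type_demo is a hypothetical helper that shows how one AWK value
# is treated as a number or a string depending on context.
type_demo() {
    awk 'BEGIN {
        x = "10"                 # assigned as a string
        print x + 5              # arithmetic forces a number: 15
        print x "5"              # concatenation forces a string: 105
        s = ("10" < "9")         # two string constants: compared as strings
        n = (10 < 9)             # two numbers: compared numerically
        print s                  # 1, because "1" sorts before "9"
        print n                  # 0
    }'
}

type_demo
```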
{
    for (i=1; i<=NF; i++) {
        sum += $i
        if (max < $i) {
            max = $i
        }
        if (min > $i) {
            min = $i
        }
        printf("%d ", $i)
    }
}
In the main part of the script, we go through each line and update the maximum, minimum, and sum with every value. NF gives the number of values on the current line.
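The loop relies on the sentinels 0 and 10**10, so it only works for values inside that range. One possible variant, shown here as a sketch rather than as the tutorial's method, seeds both min and max from the first value read. The names minmax, seen, and v are introduced for this example.

```shell
# minmax is a hypothetical helper: it reads comma-separated numbers
# from standard input and reports the smallest and largest value.
minmax() {
    awk -F, '
    {
        for (i = 1; i <= NF; i++) {
            v = $i + 0                        # force numeric interpretation
            if (!seen++) { min = max = v }    # the first value seeds both
            if (v > max) max = v
            if (v < min) min = v
        }
    }
    END {
        # print nothing if no values were read
        if (seen) printf "Min: %d, Max: %d\n", min, max
    }'
}

printf '%s\n' '-5, -2, -9' | minmax
```

With only negative input, this prints Min: -9, Max: -2, whereas a maximum initialized to 0 would never be replaced.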
END {
    avg = sum / NF
    printf("\n")
    printf("Min: %d, Max: %d, Sum: %d, Average: %d\n", min, max, sum, avg)
}
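In the END part of the script, we calculate the average and print the results to the console. Note that inside END, NF still holds the field count of the last record read (8 for this file), so sum / NF divides the total by 8 rather than by the number of values in the whole file (26).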
$ awk -f stats.awk values
2 53 4 16 4 23 2 7 88 4 5 16 42 3 7 8 39 21 23 43 67 12 11 33 3 6
Min: 2, Max: 88, Sum: 542, Average: 67
This is the output of the stats.awk program.
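For a mean over all values rather than sum / NF, one option is to count the values as they are read. The following is a minimal sketch under that assumption; mean and count are names introduced here, and the data is fed on standard input instead of from the values file.

```shell
# mean is a hypothetical helper: it sums comma-separated numbers from
# standard input and divides by the number of values actually read.
mean() {
    awk -F, '
    {
        for (i = 1; i <= NF; i++) {
            sum += $i
            count++              # running count across all lines
        }
    }
    END {
        # guard against division by zero on empty input
        if (count > 0)
            printf "Sum: %d, Count: %d, Average: %.2f\n", sum, count, sum / count
    }'
}

mean <<'EOF'
2, 53, 4, 16, 4, 23, 2, 7, 88
4, 5, 16, 42, 3, 7, 8, 39, 21
23, 43, 67, 12, 11, 33, 3, 6
EOF
```

For the sample data this prints Sum: 542, Count: 26, Average: 20.85.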