awk Marking keywords
In the following example, we mark Java keywords in a source file.
# the program adds tags around Java keywords
# it works on keywords that are separate words
BEGIN {
    # load java keywords
    # getline returns 1 on success, 0 at end of file, -1 on error
    i = 0
    while ((getline kwd < "javakeywords2") > 0) {
        keywords[i] = kwd
        i++
    }
}
{
    mtch = 0
    ln = ""
    space = ""
    # calculate the beginning space
    if (match($0, /[^[:space:]]/)) {
        if (RSTART > 1) {
            # the indentation is RSTART - 1 characters wide
            space = sprintf("%*s", RSTART - 1, "")
        }
    }
    # add the space to the line
    ln = ln space
    for (i=1; i <= NF; i++) {
        field = $i
        # go through keywords
        for (w_i in keywords) {
            kwd = keywords[w_i]
            # check if a field is a keyword
            if (field == kwd) {
                mtch = 1
            }
        }
        # add tags to the line
        if (mtch == 1) {
            ln = ln "<kwd>" field "</kwd> "
        } else {
            ln = ln field " "
        }
        mtch = 0
    }
    print ln
}
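The program adds <kwd> and </kwd> tags around each keyword that it recognizes. This is a basic example: it only works on keywords that are separate, whitespace-delimited words. It does not handle more complicated cases, such as a keyword that is attached to punctuation or nested inside a longer expression.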
# load java keywords
# getline returns 1 on success, 0 at end of file, -1 on error
i = 0
while ((getline kwd < "javakeywords2") > 0) {
    keywords[i] = kwd
    i++
}
We load the Java keywords from a file; each keyword is on a separate line. The keywords are stored in the keywords array.
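The loading step can be tried in isolation. The following is a minimal sketch, not the author's code: the temporary keyword file and the count_keywords helper are hypothetical stand-ins for javakeywords2, and the AWK program is wrapped in a shell function so that it can be run as is. It compares the return value of getline with 0, which keeps the loop from running forever when the file cannot be opened.

```shell
# Hypothetical keyword file, one keyword per line (a stand-in for javakeywords2).
kwfile=$(mktemp)
printf '%s\n' int class public > "$kwfile"

# count_keywords is an illustrative helper, not part of the original program.
count_keywords() {
    awk -v kwfile="$1" 'BEGIN {
        n = 0
        # getline returns 1 for each line read, 0 at end of file
        # and -1 when the file cannot be opened
        while ((getline kwd < kwfile) > 0) {
            keywords[n++] = kwd
        }
        close(kwfile)
        print n
    }'
}

count_keywords "$kwfile"            # prints 3
count_keywords "$kwfile.missing"    # prints 0, the loop body never runs
```

Without the comparison, a return value of -1 would count as true and the loop would not terminate on a missing file.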
# calculate the beginning space
if (match($0, /[^[:space:]]/)) {
    if (RSTART > 1) {
        # the indentation is RSTART - 1 characters wide
        space = sprintf("%*s", RSTART - 1, "")
    }
}
Using a regular expression, we locate the first non-whitespace character of the line, if any; match() stores its position in RSTART. The space variable is a string of spaces that stands in for the leading whitespace of the current line. It is calculated in order to keep the indentation of the program.
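As a variation on this step, the leading whitespace can also be copied verbatim with substr() instead of being rebuilt with sprintf(). This is only a sketch under stated assumptions: leading_ws is a hypothetical helper, and the pattern /[^ \t]/ treats just spaces and tabs as indentation. Because match() sets RSTART to the 1-based position of the first non-blank character, the indentation is the first RSTART - 1 characters of the line.

```shell
# leading_ws is an illustrative helper, not part of the original program.
leading_ws() {
    awk '{
        indent = ""
        # RSTART is the position of the first character
        # that is neither a space nor a tab
        if (match($0, /[^ \t]/)) {
            indent = substr($0, 1, RSTART - 1)
        }
        # show the indentation in brackets, followed by its width
        printf "[%s]%d\n", indent, length(indent)
    }'
}

printf '    int x;\nclass A {\n' | leading_ws
# prints "[    ]4" for the first line and "[]0" for the second
```

Copying the characters keeps tabs as tabs, whereas a string built from spaces changes them.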
# add the space to the line
ln = ln space
The space is added to the ln variable. In AWK, strings are concatenated by writing them next to each other, separated by a space; there is no dedicated concatenation operator.
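A short illustration of concatenation by juxtaposition, wrapped in a hypothetical concat_demo shell function so that it runs on its own; the two-space indentation and the int keyword are example values only.

```shell
# concat_demo is an illustrative helper, not part of the original program.
concat_demo() {
    awk 'BEGIN {
        ln = ""
        space = "  "
        ln = ln space                    # append the indentation
        ln = ln "<kwd>" "int" "</kwd>"   # append three more strings
        print ln
    }'
}

concat_demo   # prints "  <kwd>int</kwd>"
```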
for (i=1; i <= NF; i++) {
    field = $i
    ...
}
We go through the fields of the current line; the field in question is stored in the field variable.
# go through keywords
for (w_i in keywords) {
    kwd = keywords[w_i]
    # check if a field is a keyword
    if (field == kwd) {
        mtch = 1
    }
}
In a for loop, we go through the Java keywords and check if a field is a Java keyword.
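The inner loop compares the field with every keyword in turn. An alternative, shown here only as a sketch and not as the method used above, is to store the keywords as array indices and test membership with the in operator, which replaces the loop with a single lookup. The is_keyword helper and the inline keyword list are hypothetical.

```shell
# is_keyword is an illustrative helper, not part of the original program.
is_keyword() {
    awk -v word="$1" 'BEGIN {
        # hypothetical inline keyword list, for illustration only
        n = split("int class public void", list, " ")
        for (i = 1; i <= n; i++) {
            iskw[list[i]] = 1
        }
        # "in" tests for an index without creating the element
        result = (word in iskw) ? "yes" : "no"
        print result
    }'
}

is_keyword void   # prints yes
is_keyword x      # prints no
```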
# add tags to the line
if (mtch == 1) {
    ln = ln "<kwd>" field "</kwd> "
} else {
    ln = ln field " "
}
If there is a keyword, we attach the tags around the keyword; otherwise we just append the field to the line.
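Note that the code above appends a space after every field, so each output line ends with a trailing space. The sketch below shows one possible way to build the line without it; the wrap() function, the tag_fields helper and the two-keyword set are assumptions made for the example, not part of the original program.

```shell
# tag_fields is an illustrative helper, not part of the original program.
tag_fields() {
    awk '
    # wrap() is a hypothetical helper that adds the tags
    function wrap(s) {
        return "<kwd>" s "</kwd>"
    }
    BEGIN {
        # tiny inline keyword set, for illustration only
        iskw["int"] = 1
        iskw["new"] = 1
    }
    {
        ln = ""
        for (i = 1; i <= NF; i++) {
            piece = ($i in iskw) ? wrap($i) : $i
            # one space between fields, none after the last one
            sep = (i < NF) ? " " : ""
            ln = ln piece sep
        }
        print ln
    }'
}

printf 'Test ts = new Test();\n' | tag_fields
# prints "Test ts = <kwd>new</kwd> Test();"
```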
print ln
The constructed line is printed to the console.
$ awk -f markkeywords2.awk program.java
<kwd>package</kwd> com.zetcode;
<kwd>class</kwd> Test {
<kwd>int</kwd> x = 1;
<kwd>public</kwd> <kwd>void</kwd> exec1() {
System.out.println(this.x);
System.out.println(x);
}
<kwd>public</kwd> <kwd>void</kwd> exec2() {
<kwd>int</kwd> z = 5;
System.out.println(x);
System.out.println(z);
}
}
<kwd>public</kwd> <kwd>class</kwd> MethodScope {
<kwd>public</kwd> <kwd>static</kwd> <kwd>void</kwd> main(String[] args) {
Test ts = <kwd>new</kwd> Test();
ts.exec1();
ts.exec2();
}
}
A sample run on a small Java program.