Skip to content
  • About
  • CoursesExpand
    • Problem Solving using C Language
    • Mastering Database Management
    • Linux System Administration
    • Linux and Shell Programming
  • Publications
  • Professional Certificates
  • BooksExpand
    • Books Authored
  • Patents
Download CV
Unix

awk Marking keywords

Marking keywords

In the following example, we mark Java keywords in a source file.

# the program adds tags around Java keywords
# it works on keywords that are separate words

BEGIN {

    # load java keywords
    i = 0
    while (getline kwd <"javakeywords2") {
        keywords[i] = kwd
        i++
    }
}

{
    mtch = 0
    ln = ""
    space = ""
    
    # calculate the beginning space
    if (match($0, /[^[:space:]]/)) {
        if (RSTART > 1) {
            space = sprintf("%*s", RSTART, "") 
        }
    }     
    
    # add the space to the line
    ln = ln space
    
    for (i=1; i <= NF; i++) {
    
        field = $i
         
        # go through keywords   
        for (w_i in keywords) { 
        
            kwd = keywords[w_i]
            
            # check if a field is a keyword
            if (field == kwd) {
                mtch = 1     
            } 
        }
        
        # add tags to the line        
        if (mtch == 1) {
            ln = ln  "<kwd>" field  "</kwd> "   
        } else {
            ln = ln field " " 
        }
        
        mtch = 0
            
    }
    
    print ln
}

The program adds and tags around each of the keywords that it recognizes. This is a basic example; it works on keywords that are separate words. It does not address the more complicated structures.

# load java keywords
i = 0
while (getline kwd <"javakeywords2") {
    keywords[i] = kwd
    i++
}

We load Java keywords from a file; each keyword is on a separate line. The keywords are stored in the keywords array.

# calculate the beginning space
if (match($0, /[^[:space:]]/)) {
    if (RSTART > 1) {
        space = sprintf("%*s", RSTART, "") 
    }
}        

Using regular expression, we calculate the space at the beginning of the line if any. The space is a string variable equaling to the width of the space at the current line. The space is calculated in order to keep the indentation of the program.

# add the space to the line
ln = ln space   

The space is added to the ln variable. In AWK, we use a space to add strings.

for (i=1; i <= NF; i++) {

field = $i
...
}

We go through the fields of the current line; the field in question is stored in the field variable.

# go through keywords   
for (w_i in keywords) { 

    kwd = keywords[w_i]
    
    # check if a field is a keyword
    if (field == kwd) {
        mtch = 1     
    } 
}

In a for loop, we go through the Java keywords and check if a field is a Java keyword.

# add tags to the line        
if (mtch == 1) {
    ln = ln  "<kwd>" field  "</kwd> "   
} else {
    ln = ln field " " 
}

If there is a keyword, we attach the tags around the keyword; otherwise we just append the field to the line.

print ln

The constructed line is printed to the console.

$ awk -f markkeywords2.awk program.java 
<kwd>package</kwd> com.zetcode; 

<kwd>class</kwd> Test { 

     <kwd>int</kwd> x = 1; 

     <kwd>public</kwd> <kwd>void</kwd> exec1() { 

         System.out.println(this.x); 
         System.out.println(x); 
     } 

     <kwd>public</kwd> <kwd>void</kwd> exec2() { 

         <kwd>int</kwd> z = 5; 

         System.out.println(x); 
         System.out.println(z); 
     } 
} 

<kwd>public</kwd> <kwd>class</kwd> MethodScope { 

     <kwd>public</kwd> <kwd>static</kwd> <kwd>void</kwd> main(String[] args) { 

         Test ts = <kwd>new</kwd> Test(); 
         ts.exec1(); 
         ts.exec2(); 
     } 
} 

A sample run on a small Java program.

Post navigation

Previous Previous
awk Rock-Paper-Scissors
NextContinue
‘in’ statement in String
Latest

Advance AI PPT

Read More Advance AI PPTContinue

Latest

Prompts for Image Descriptions

Describe the scene using three vivid sensory details — one for sight, one for sound, and one for touch. Summarize the mood of the image…

Read More Prompts for Image DescriptionsContinue

Latest

Dimensionality Reduction

Dimensionality reduction is the process of reducing the number of features (variables) in a dataset while preserving important information. It helps in: ✅ Reducing computational…

Read More Dimensionality ReductionContinue

Artificial Intelligence

Tanh Function in Neural Network

The tanh function, short for hyperbolic tangent function, is another commonly used activation function in neural networks. It maps any real-valued number into a value…

Read More Tanh Function in Neural NetworkContinue

Latest

Why Initialize Weights in Neural Network

Initializing weights and biases is a crucial step in building a neural network. Proper initialization helps ensure that the network converges to a good solution…

Read More Why Initialize Weights in Neural NetworkContinue

Nishant Munjal

Coding Humanity’s Future </>

Facebook Twitter Linkedin YouTube Github Email

Tools

  • SIP Calculator
  • Write with AI
  • SamplePHP
  • Image Converter

Resources

  • Blog
  • Contact
  • Refund and Returns

Legal

  • Disclaimer
  • Privacy Policy
  • Terms and Conditions

© 2025 - All Rights Reserved

  • About
  • Courses
    • Problem Solving using C Language
    • Mastering Database Management
    • Linux System Administration
    • Linux and Shell Programming
  • Publications
  • Professional Certificates
  • Books
    • Books Authored
  • Patents
Download CV
We use cookies to ensure that we give you the best experience on our website. If you continue to use this site we will assume that you are happy with it.Ok