Nishant Munjal

awk Marking keywords

Marking keywords

In the following example, we mark Java keywords in a source file.

# the program adds tags around Java keywords
# it works on keywords that are separate words

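BEGIN {

    # load java keywords, one per line
    # getline returns 1 on success, 0 at end of file and -1 on error;
    # testing for > 0 avoids an endless loop when the file is missing
    i = 0
    while ((getline kwd < "javakeywords2") > 0) {
        keywords[i] = kwd
        i++
    }
    close("javakeywords2")
}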

{
    mtch = 0
    ln = ""
    space = ""
    
    # calculate the beginning space
    if (match($0, /[^[:space:]]/)) {
        if (RSTART > 1) {
            space = sprintf("%*s", RSTART, "") 
        }
    }     
    
    # add the space to the line
    ln = ln space
    
    for (i=1; i <= NF; i++) {
    
        field = $i
         
        # go through keywords   
        for (w_i in keywords) { 
        
            kwd = keywords[w_i]
            
            # check if a field is a keyword
            if (field == kwd) {
                mtch = 1     
            } 
        }
        
        # add tags to the line        
        if (mtch == 1) {
            ln = ln  "<kwd>" field  "</kwd> "   
        } else {
            ln = ln field " " 
        }
        
        mtch = 0
            
    }
    
    print ln
}

The program wraps each keyword that it recognizes in <kwd> and </kwd> tags. This is a basic example: it marks a keyword only when it is a whole whitespace-separated field, so a keyword attached to other characters, such as this in this.x, is left unmarked.

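# load java keywords, one per line
# getline returns 1 on success, 0 at end of file and -1 on error
i = 0
while ((getline kwd < "javakeywords2") > 0) {
    keywords[i] = kwd
    i++
}
close("javakeywords2")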

We load Java keywords from a file; each keyword is on a separate line. The keywords are stored in the keywords array.
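The loading step can be tried on its own with the minimal sketch below. The temporary keyword file, the kwfile variable and the load_keywords function are assumptions made for the demo; the article reads a file named javakeywords2. The loop compares the getline result with 0, so a missing file ends the loop instead of repeating forever.

```shell
# Hypothetical keyword file for the demo (the article uses javakeywords2)
kwfile=$(mktemp)
printf '%s\n' class int public void > "$kwfile"

load_keywords() {
    awk -v kwfile="$1" '
    BEGIN {
        n = 0
        # getline returns 1 on success, 0 at end of file, -1 on error
        while ((getline kwd < kwfile) > 0) {
            keywords[n] = kwd
            n++
        }
        close(kwfile)
        print n " keywords loaded"
    }'
}

load_keywords "$kwfile"                 # file with four keywords
load_keywords /nonexistent/keywords     # unreadable file: the loop body never runs
```

The first call reports 4 keywords and the second reports 0.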

# calculate the beginning space
if (match($0, /[^[:space:]]/)) {
    if (RSTART > 1) {
        space = sprintf("%*s", RSTART, "") 
    }
}        

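Using a regular expression, we find the first non-whitespace character on the line; match sets RSTART to its position. If RSTART is greater than 1, the line is indented, and space becomes a string of RSTART blanks. Because RSTART is 1-based, that is one character more than the original indentation; use RSTART - 1 to reproduce the indentation exactly. The padding is needed because the line is rebuilt from its fields, which discards the leading whitespace.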
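As an illustration of the indentation step, the following sketch copies the leading whitespace with substr instead of building it with sprintf. This is a variation, not the article's code; the indent_demo function and the sample lines are made up for the demo. It prints the width of the captured indentation, then the indentation itself followed by a bar.

```shell
indent_demo() {
    awk '{
        space = ""
        # RSTART is the 1-based position of the first non-whitespace
        # character, so the indentation occupies positions 1 .. RSTART - 1
        if (match($0, /[^[:space:]]/) && RSTART > 1) {
            space = substr($0, 1, RSTART - 1)
        }
        print length(space) ":" space "|"
    }'
}

printf '%s\n' 'class Test {' '    int x = 1;' '        return x;' | indent_demo
```

For the three sample lines the reported widths are 0, 4 and 8. Copying the original characters also keeps tabs as tabs, which a string of blanks would not.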

# add the space to the line
ln = ln space   

The space is appended to the ln variable. In AWK, strings are concatenated by writing them next to each other; there is no concatenation operator.
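A small sketch of concatenation by juxtaposition, building a line the same way the program does. The concat_demo function and the literal values are hypothetical and serve only the demo.

```shell
concat_demo() {
    awk 'BEGIN {
        space = "    "
        ln = ""
        # adjacent expressions are joined; no operator symbol is written
        ln = ln space
        ln = ln "<kwd>" "int" "</kwd> "
        ln = ln "x" " "
        # brackets make the leading and trailing blanks visible
        print "[" ln "]"
    }'
}

concat_demo
```

The brackets show that the indentation and the trailing blank appended after the last field both end up in the result.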

for (i=1; i <= NF; i++) {

    field = $i
    ...
}

We go through the fields of the current line; the field in question is stored in the field variable.

# go through keywords   
for (w_i in keywords) { 

    kwd = keywords[w_i]
    
    # check if a field is a keyword
    if (field == kwd) {
        mtch = 1     
    } 
}

In a for loop, we go through the Java keywords and check if a field is a Java keyword.
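The inner loop compares the field with every keyword. An alternative, sketched here under assumptions and not taken from the article, stores each keyword as an array index so that the in operator can test membership directly. The is_keyword array, the mark_fields function and the inline keyword list are hypothetical; the article loads the keywords from a file.

```shell
mark_fields() {
    awk '
    BEGIN {
        # hypothetical inline keyword list for the demo
        n = split("class int public void", list, " ")
        for (k = 1; k <= n; k++) {
            is_keyword[list[k]] = 1    # the keyword is the array index
        }
    }
    {
        ln = ""
        for (i = 1; i <= NF; i++) {
            # "in" checks whether the index exists; no scan over the array
            if ($i in is_keyword) {
                ln = ln "<kwd>" $i "</kwd> "
            } else {
                ln = ln $i " "
            }
        }
        print ln
    }'
}

printf '%s\n' 'public void exec1() {' | mark_fields
```

With this layout the mtch flag is no longer needed, because the lookup result is used directly in the condition.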

# add tags to the line        
if (mtch == 1) {
    ln = ln  "<kwd>" field  "</kwd> "   
} else {
    ln = ln field " " 
}

If there is a keyword, we attach the tags around the keyword; otherwise we just append the field to the line.

print ln

The constructed line is printed to the console.

$ awk -f markkeywords2.awk program.java 
<kwd>package</kwd> com.zetcode; 

<kwd>class</kwd> Test { 

     <kwd>int</kwd> x = 1; 

     <kwd>public</kwd> <kwd>void</kwd> exec1() { 

         System.out.println(this.x); 
         System.out.println(x); 
     } 

     <kwd>public</kwd> <kwd>void</kwd> exec2() { 

         <kwd>int</kwd> z = 5; 

         System.out.println(x); 
         System.out.println(z); 
     } 
} 

<kwd>public</kwd> <kwd>class</kwd> MethodScope { 

     <kwd>public</kwd> <kwd>static</kwd> <kwd>void</kwd> main(String[] args) { 

         Test ts = <kwd>new</kwd> Test(); 
         ts.exec1(); 
         ts.exec2(); 
     } 
} 

A sample run on a small Java program.
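In the sample run, this in System.out.println(this.x) stays unmarked because the whole field is not equal to a keyword. One possible way to handle such cases, shown only as a rough sketch and not as the article's method, is to scan each line for identifier-like runs with match and test every run against the keyword set. The mark_inside function, the is_keyword array and the inline keyword list are assumptions for the demo. The sketch would also mark keywords inside string literals and comments.

```shell
mark_inside() {
    awk '
    BEGIN {
        # hypothetical inline keyword list for the demo
        n = split("this new int", list, " ")
        for (k = 1; k <= n; k++) is_keyword[list[k]] = 1
    }
    {
        rest = $0
        out = ""
        # repeatedly take the next identifier-like run from the line
        while (match(rest, /[A-Za-z_][A-Za-z0-9_]*/)) {
            word = substr(rest, RSTART, RLENGTH)
            # copy the text before the run unchanged (spaces, punctuation)
            out = out substr(rest, 1, RSTART - 1)
            if (word in is_keyword) {
                out = out "<kwd>" word "</kwd>"
            } else {
                out = out word
            }
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}

printf '%s\n' '    System.out.println(this.x);' | mark_inside
```

Because the text between the runs is copied as-is, the original indentation and spacing are kept without a separate indentation step.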

© 2026 Nishant Munjal - All Rights Reserved
