
awk Spell Checking

Spell checking

We create an AWK program for spell checking.

BEGIN {
    count = 0
    
    i = 0
    # Compare with > 0 so the loop also ends if the file cannot be read
    while ((getline myword < "/usr/share/dict/words") > 0) {
        dict[i] = myword
        i++
    }
}

{
    for (i=1; i<=NF; i++) {
    
        field = $i
    
        if (match(field, /[[:punct:]]$/)) {
            # awk strings are indexed from 1
            field = substr(field, 1, RSTART-1)
        }
    
        mywords[count] = field
        count++
    }
}

END {

    for (w_i in mywords) { 
        for (w_j in dict) { 
            if (mywords[w_i] == dict[w_j] || 
                        tolower(mywords[w_i]) == dict[w_j]) {
                delete mywords[w_i]
                break    # word found, stop scanning the dictionary
            }
        }
    }

    for (w_i in mywords) { 
        if (mywords[w_i] != "") {
            print mywords[w_i]        
        }
    }
}

The script compares the words of the provided text file against a dictionary. On many Unix-like systems, /usr/share/dict/words holds an English word list with one word per line; on some distributions the file has to be installed as a separate package.

BEGIN {
    count = 0
    
    i = 0
    # Compare with > 0 so the loop also ends if the file cannot be read
    while ((getline myword < "/usr/share/dict/words") > 0) {
        dict[i] = myword
        i++
    }
}

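Inside the BEGIN block, we read the words from the dictionary into the dict array. The form getline myword < file reads the next record from the named file into the myword variable; $0, NF, and NR are left unchanged. It returns 1 when a record was read, 0 at end of file, and -1 on error, so a loop condition should compare the result with > 0; a bare while (getline ...) never terminates if the file cannot be opened.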
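The return values of getline can be observed with a small experiment. The following is only a sketch: the temporary word list, the variable names, and the /nonexistent/wordlist path are made up for illustration and are not part of the spell checker.

```shell
# Create a small throwaway word list (hypothetical sample data).
wordlist=$(mktemp)
printf 'apple\nbanana\ncherry\n' > "$wordlist"

loaded=$(awk -v file="$wordlist" 'BEGIN {
    n = 0
    # getline var < file returns 1 for each record read, 0 at end of
    # file, and -1 on error, so the loop compares the result with > 0.
    while ((getline word < file) > 0) {
        n++
    }
    close(file)
    # This form of getline sets only the variable, so NR stays at 0.
    print n, NR
}')

missing=$(awk 'BEGIN {
    # A file that cannot be opened yields -1 rather than 0.
    status = (getline word < "/nonexistent/wordlist")
    print status
}')

echo "$loaded"     # number of records read, followed by NR
echo "$missing"    # getline status for the unreadable file
rm -f "$wordlist"
```

The first command prints the record count and an untouched NR; the second prints the error status that a bare while (getline ...) would treat as true.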

{
    for (i=1; i<=NF; i++) {
    
        field = $i
    
        if (match(field, /[[:punct:]]$/)) {
            # awk strings are indexed from 1
            field = substr(field, 1, RSTART-1)
        }
    
        mywords[count] = field
        count++
    }
}

In the main part of the program, we place the words of the file that we are spell checking into the mywords array. A single trailing punctuation mark (such as a comma or a period) is removed from each word; leading punctuation and runs of several marks are left in place.
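If words wrapped in parentheses or followed by several marks should also be cleaned, one possible variation (not the script's own approach) is to call sub() with a + quantifier at both ends of the field. The sample sentence below is invented for the demonstration.

```shell
stripped=$(printf 'Finaly, we (almost) agree...\n' | awk '{
    for (i = 1; i <= NF; i++) {
        field = $i
        # Remove every leading and every trailing punctuation character.
        sub(/^[[:punct:]]+/, "", field)
        sub(/[[:punct:]]+$/, "", field)
        # Skip tokens that consisted of punctuation only.
        if (field == "")
            continue
        out = out (out == "" ? "" : " ") field
    }
}
END { print out }')

echo "$stripped"
```

Punctuation inside a word, such as the apostrophe in a contraction, is left alone because both patterns are anchored to the ends of the field.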

END {

    for (w_i in mywords) { 
        for (w_j in dict) { 
            if (mywords[w_i] == dict[w_j] || 
                        tolower(mywords[w_i]) == dict[w_j]) {
                delete mywords[w_i]
                break    # word found, stop scanning the dictionary
            }
        }
    }
...
}    

We compare the words from the mywords array against the dictionary array. If a word is found in the dictionary, it is removed with the delete statement. Words that begin a sentence start with an uppercase letter, so we also compare a lowercase version of the word, produced by the tolower() function.

for (w_i in mywords) { 
    if (mywords[w_i] != "") {
        print mywords[w_i]        
    }
}

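The remaining words were not found in the dictionary and are printed to the console. The test against the empty string skips entries that hold no text, such as a token that consisted of a single punctuation mark. Because for (... in ...) visits array elements in an unspecified order, the words are not necessarily printed in the order in which they appear in the text.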

$ awk -f spellcheck.awk text
consciosness
finaly

We ran the program on a text file and found two misspelled words. Note that the program takes some time to finish, because the words of the text are compared with the dictionary entries one at a time.
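The run time comes from scanning the dictionary for each word. A possible alternative, sketched here and not taken from the script above, stores every dictionary word as an array key so that membership can be tested directly with the in operator. To keep the example self-contained it uses a tiny made-up word list and text in temporary files instead of /usr/share/dict/words; the names dictfile and seen are illustrative.

```shell
# Hypothetical sample data: a tiny dictionary and a text to check.
dict=$(mktemp)
text=$(mktemp)
printf 'the\nquick\nbrown\nfox\nfinally\njumps\nof\n' > "$dict"
printf 'The consciosness of the fox!\nThe quick brown fox finaly jumps.\n' > "$text"

misspelled=$(awk -v dictfile="$dict" '
BEGIN {
    # Use each dictionary word as an array key; the value is irrelevant.
    while ((getline word < dictfile) > 0)
        dict[word] = 1
    close(dictfile)
}
{
    for (i = 1; i <= NF; i++) {
        field = $i
        sub(/[[:punct:]]+$/, "", field)
        if (field == "")
            continue
        # One key lookup per word replaces the loop over all entries.
        # The seen array reports each unknown word only once.
        if (!(field in dict) && !(tolower(field) in dict) && !(field in seen)) {
            seen[field] = 1
            print field
        }
    }
}' "$text")

echo "$misspelled"
rm -f "$dict" "$text"
```

Since words are reported as they are read, the output follows the order of the text, and no END block is needed.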


Nishant Munjal

© 2025 - All Rights Reserved
