Class Driver

java.lang.Object
textprocessing.Driver

public class Driver extends Object
A Driver class for processing text from a file.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private static final String
     
  • Constructor Summary

    Constructors
    Constructor
    Description
    Driver()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    private static void
    addBigrams(List<Word> bigrams, List<BasicWord> words)
    A helper method that generates Bigrams from the ordered List of BasicWords and stores the Bigrams in a List.
    private static void
    addVocabulary(List<Word> vocabulary, List<BasicWord> words)
    A helper method that generates VocabularyEntry objects from the ordered List of BasicWords and stores the entries in a List.
    private static void
    addWords(List<BasicWord> words, Scanner read)
    A helper method that reads words from a text file one at a time and stores the normalized words in a List of BasicWords.
    private static String
    getInput(Scanner in, String message)
    A helper method to get input from the user.
    static void
    main(String[] args)
     
    private static String
    normalize(String s)
    A helper method that removes all punctuation from a String and converts the resulting punctuation-less String to lowercase.
    private static void
    removeHeader(Scanner read)
    A helper method to remove the header information from a Project Gutenberg file.
    private static void
    report(List<Word> list, String type, int topHits)
    A helper method that generates a report of the most frequent entries from the given sorted List.
    private static void
    saveFile(List<Word> list, File output)
    A helper method to save a List of Words as a text file.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

    • Driver

      public Driver()
  • Method Details

    • main

      public static void main(String[] args)
    • getInput

      private static String getInput(Scanner in, String message)
      A helper method to get input from the user.
      Parameters:
      in - Scanner using System.in as input
      message - the message with which to prompt the user
      Returns:
      the user input
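      A minimal sketch of how such a prompt helper might look. It assumes the method prints the message and returns the next full line of input; the real method body is not shown on this page. The class name GetInputSketch is hypothetical.

```java
import java.util.Scanner;

class GetInputSketch {
    // Assumption: print the prompt, then return one full line of input.
    static String getInput(Scanner in, String message) {
        System.out.print(message);
        return in.nextLine();
    }

    public static void main(String[] args) {
        // A String source stands in for System.in so the example runs unattended.
        Scanner in = new Scanner("moby.txt\n");
        String name = getInput(in, "Enter the file to read: ");
        System.out.println("You entered: " + name);
    }
}
```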
    • removeHeader

      private static void removeHeader(Scanner read)
      A helper method to remove the header information from a Project Gutenberg file. The method consumes input from the Scanner until it has passed the end of the header text, then stops, leaving the Scanner positioned at the start of the body text.
      Parameters:
      read - Scanner using a Project Gutenberg text file as input
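      The header-skipping behavior could be sketched as follows. This page does not say how the end of the header is detected, so the sketch assumes the header ends with a line containing "*** START OF", which is the marker commonly found in Project Gutenberg files. The class name RemoveHeaderSketch is hypothetical.

```java
import java.util.Scanner;

class RemoveHeaderSketch {
    // Assumption: the header ends at the first line containing this marker.
    private static final String START_MARKER = "*** START OF";

    static void removeHeader(Scanner read) {
        // Consume lines until the marker line has been read, then stop.
        while (read.hasNextLine()) {
            String line = read.nextLine();
            if (line.contains(START_MARKER)) {
                return;
            }
        }
    }
}
```

      After the call returns, the next token read from the Scanner is the first word of the body text. If the marker never appears, the whole input is consumed.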
    • addWords

      private static void addWords(List<BasicWord> words, Scanner read)
      A helper method that reads words from a text file one at a time and stores the normalized words in a List of BasicWords. Any word that contains only whitespace should be ignored.
      Parameters:
      words - the List used to store the words
      read - Scanner using a Project Gutenberg text file as input
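      A possible shape for this loop is sketched below. BasicWord is not documented on this page, so the nested BasicWord class here is a hypothetical stand-in with a text and a location; treating the location as the index among the kept words is also an assumption. The normalize helper assumes ASCII punctuation only.

```java
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class AddWordsSketch {
    // Hypothetical stand-in for the real BasicWord class.
    static class BasicWord {
        final String text;
        final int location;

        BasicWord(String text, int location) {
            this.text = text;
            this.location = location;
        }
    }

    // Assumption: "punctuation" means ASCII punctuation characters.
    static String normalize(String s) {
        return s.replaceAll("\\p{Punct}", "").toLowerCase(Locale.ROOT);
    }

    static void addWords(List<BasicWord> words, Scanner read) {
        int location = 0;
        while (read.hasNext()) {
            String word = normalize(read.next());
            // Tokens such as "--" are empty after normalization and are skipped.
            if (!word.trim().isEmpty()) {
                words.add(new BasicWord(word, location));
                location++;
            }
        }
    }
}
```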
    • normalize

      private static String normalize(String s)
      A helper method that removes all punctuation from a String and converts the resulting punctuation-less String to lowercase.
      Parameters:
      s - the String to normalize
      Returns:
      the normalized String
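      One way to express this normalization is sketched below. It assumes "punctuation" means the ASCII punctuation characters matched by the regex class \p{Punct}; typographic quotes and dashes would need extra handling. The class name NormalizeSketch is hypothetical.

```java
import java.util.Locale;

class NormalizeSketch {
    // Remove ASCII punctuation, then lowercase the remaining characters.
    static String normalize(String s) {
        return s.replaceAll("\\p{Punct}", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("Hello, World!")); // prints "hello world"
        System.out.println(normalize("Don't"));         // prints "dont"
    }
}
```

      Locale.ROOT is used so that lowercasing does not depend on the default locale of the machine running the program.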
    • addBigrams

      private static void addBigrams(List<Word> bigrams, List<BasicWord> words)
      A helper method that generates Bigrams from the ordered List of BasicWords and stores the Bigrams in a List. There should only be one instance of each Bigram in the List. When a further copy of an existing Bigram is found, its location should be added to the existing Bigram and the occurrence count should be incremented.
      Parameters:
      bigrams - the List in which to store the resulting Bigrams
      words - the ordered List of BasicWord to use to generate the Bigrams
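      The de-duplication rule described above can be sketched with a simple linear search. Word, BasicWord, and Bigram are not documented on this page, so the nested types below are hypothetical stand-ins; in particular, recording the location of the first word of each pair is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class AddBigramsSketch {
    interface Word { }  // hypothetical stand-in for the real Word type

    static class BasicWord {  // hypothetical stand-in
        final String text;
        final int location;

        BasicWord(String text, int location) {
            this.text = text;
            this.location = location;
        }
    }

    static class Bigram implements Word {  // hypothetical stand-in
        final String first;
        final String second;
        final List<Integer> locations = new ArrayList<>();

        Bigram(String first, String second) {
            this.first = first;
            this.second = second;
        }

        boolean matches(String a, String b) {
            return first.equals(a) && second.equals(b);
        }

        int occurrences() {
            return locations.size();
        }
    }

    static void addBigrams(List<Word> bigrams, List<BasicWord> words) {
        // Each adjacent pair of words forms one bigram.
        for (int i = 0; i + 1 < words.size(); i++) {
            BasicWord a = words.get(i);
            BasicWord b = words.get(i + 1);
            Bigram existing = null;
            for (Word w : bigrams) {
                if (w instanceof Bigram && ((Bigram) w).matches(a.text, b.text)) {
                    existing = (Bigram) w;
                    break;
                }
            }
            if (existing == null) {
                existing = new Bigram(a.text, b.text);
                bigrams.add(existing);
            }
            // Recording the location also raises the occurrence count.
            existing.locations.add(a.location);
        }
    }
}
```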
    • addVocabulary

      private static void addVocabulary(List<Word> vocabulary, List<BasicWord> words)
      A helper method that generates VocabularyEntry objects from the ordered List of BasicWords and stores the entries in a List. There should only be one instance of each VocabularyEntry in the List. When a further copy of an existing entry is found, its location should be added to the existing entry and the occurrence count should be incremented.
      Parameters:
      vocabulary - the List in which to store the resulting VocabularyEntry objects
      words - the ordered List of BasicWord to use to generate the vocabulary
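      A sketch of the same one-entry-per-word rule, this time using a temporary HashMap as a lookup index so that each word is found in constant time rather than by scanning the List. The nested Word, BasicWord, and VocabularyEntry types are hypothetical stand-ins, and the sketch assumes the vocabulary List is empty when the method is called.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AddVocabularySketch {
    interface Word { }  // hypothetical stand-in for the real Word type

    static class BasicWord {  // hypothetical stand-in
        final String text;
        final int location;

        BasicWord(String text, int location) {
            this.text = text;
            this.location = location;
        }
    }

    static class VocabularyEntry implements Word {  // hypothetical stand-in
        final String text;
        final List<Integer> locations = new ArrayList<>();

        VocabularyEntry(String text) {
            this.text = text;
        }

        int occurrences() {
            return locations.size();
        }
    }

    // Assumption: vocabulary is empty on entry, so the local index is complete.
    static void addVocabulary(List<Word> vocabulary, List<BasicWord> words) {
        Map<String, VocabularyEntry> index = new HashMap<>();
        for (BasicWord w : words) {
            VocabularyEntry entry = index.get(w.text);
            if (entry == null) {
                entry = new VocabularyEntry(w.text);
                index.put(w.text, entry);
                vocabulary.add(entry);  // entries keep first-seen order
            }
            entry.locations.add(w.location);
        }
    }
}
```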
    • saveFile

      private static void saveFile(List<Word> list, File output) throws FileNotFoundException
      A helper method to save a List of Words as a text file.
      Parameters:
      list - the List of Word objects to save
      output - the File to save the data into
      Throws:
      FileNotFoundException - thrown if the File cannot be created or opened for writing
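      A minimal sketch of writing a List to a file with java.io.PrintWriter, whose File constructor declares FileNotFoundException. The output format is not specified on this page; the sketch assumes one entry per line, written with toString(). It accepts List<?> so that it runs without the undocumented Word type. The class name SaveFileSketch is hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.util.List;

class SaveFileSketch {
    // Assumption: each entry is written on its own line via toString().
    static void saveFile(List<?> list, File output) throws FileNotFoundException {
        // try-with-resources closes (and flushes) the writer on exit.
        try (PrintWriter out = new PrintWriter(output)) {
            for (Object item : list) {
                out.println(item);
            }
        }
    }
}
```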
    • report

      private static void report(List<Word> list, String type, int topHits)
      A helper method that generates a report of the most frequent entries from the given sorted List. If topHits is greater than the total number of entries in the list, the entire list is printed.
      Parameters:
      list - the List from which to generate the report
      type - a String describing the contents of the list (e.g. "Words", "Bigrams", etc.)
      topHits - the number of items to display in the report
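      The topHits cap can be sketched as below. The exact report layout is not given on this page, so the heading and numbering here are assumptions, and the sketch accepts List<?> so that it runs without the undocumented Word type. The list is assumed to be sorted already, most frequent first. The class name ReportSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class ReportSketch {
    static void report(List<?> list, String type, int topHits) {
        // Never print more entries than the list holds (or fewer than zero).
        int limit = Math.max(0, Math.min(topHits, list.size()));
        System.out.println("Top " + limit + " " + type + ":");
        for (int i = 0; i < limit; i++) {
            System.out.println((i + 1) + ". " + list.get(i));
        }
    }

    public static void main(String[] args) {
        List<String> sorted = Arrays.asList("the", "of", "and");
        report(sorted, "Words", 2);   // prints the first two entries
        report(sorted, "Words", 10);  // topHits exceeds the size: prints all three
    }
}
```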