Tools
Class SentenceToWordConverter
java.lang.Object
Tools.SentenceToWordConverter
public class SentenceToWordConverter
- extends java.lang.Object
After the XML has been converted to sentences, this class produces new files
representing the words. Each line contains a word, its count, and the original word.
The original word is kept so that the original form can be displayed, since the
stemmed versions of some words look unnatural. Displaying the original form is
not really needed now, but the feature was added when attempting to use
bisecting k-means (i.e., the clusters would be labelled by these original words
for the top 5 or so dimensions).
- Author:
- davidc
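The line format described above (word, count, original word) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from this class: the name `WordLineSketch`, the helper `toyStem` (a toy suffix stripper standing in for whatever stemmer is really used), the single-space separator, and the choice to keep the first original form seen for each stem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not the SentenceToWordConverter implementation.
class WordLineSketch {

    // Toy stemmer (assumption): strips a trailing "ing", "ed", or "s".
    static String toyStem(String word) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (word.length() > suffix.length() + 2 && word.endsWith(suffix)) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Builds one "stem count original" line per distinct stem, in first-seen order.
    static List<String> toWordLines(List<String> sentences) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Map<String, String> originals = new LinkedHashMap<>();
        for (String sentence : sentences) {
            // Lowercase, then split on anything that is not a letter a-z.
            for (String token : sentence.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
                if (token.isEmpty()) {
                    continue; // split can yield an empty leading token
                }
                String stem = toyStem(token);
                counts.merge(stem, 1, Integer::sum);
                // Assumption: remember the first original form seen for this stem.
                originals.putIfAbsent(stem, token);
            }
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            lines.add(e.getKey() + " " + e.getValue() + " " + originals.get(e.getKey()));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "Clustering documents",
                "The cluster labels clustered well");
        for (String line : toWordLines(sentences)) {
            System.out.println(line);
        }
    }
}
```

With these two sample sentences, the stem `cluster` is counted three times and keeps `clustering` as its display form, which is the kind of readable label the description has in mind for cluster dimensions.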
Method Summary
static void clearOutDir(java.lang.String src)
static void convertFile(java.lang.String fileID, java.io.File inFile, java.io.File outFile)
static void convertFromSrc(java.lang.String src)
- Notice that all files should be in the intermediate stage. Thus, no extra effort is needed here for files from backupFolder.
static void main(java.lang.String[] args)
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SentenceToWordConverter
public SentenceToWordConverter()
main
public static void main(java.lang.String[] args)
throws PreprocessorException
- Parameters:
args -
- Throws:
java.io.IOException
PreprocessorException
convertFromSrc
public static void convertFromSrc(java.lang.String src)
throws PreprocessorException
- Notice that all files should be in the intermediate stage.
Thus, no extra effort is needed here for files from backupFolder.
- Parameters:
src -
- Throws:
PreprocessorException
clearOutDir
public static void clearOutDir(java.lang.String src)
convertFile
public static void convertFile(java.lang.String fileID,
java.io.File inFile,
java.io.File outFile)
throws java.io.FileNotFoundException,
java.io.IOException
- Throws:
java.io.FileNotFoundException
java.io.IOException