SVM
Class SVMManager

java.lang.Object
  extended by SVM.SVMManager

public class SVMManager
extends java.lang.Object

Utility methods for training and applying the SVM classifier. The main entry point for the engine is classifyEverything, which takes care of most of the work.

Author:
davidc

Field Summary
 int BOOST_FACTOR
           
static double SVM_COST
           
static double SVM_GAMMA
           
static int SVM_KERNEL
           
 boolean USE_BOOST
           
static boolean useCustomCostGamma
           
 
Constructor Summary
SVMManager(java.util.List<? extends SVMTrainable> trainlist, java.util.List<? extends SVMTestable> testlist)
           
 
Method Summary
static void checkCatResults(java.util.List<KnownDoc> knowns)
           
static void checkCatResults(java.util.List<KnownDoc> knownsO, int cat)
           
static boolean checkUnique(java.util.List<KnownDoc> knowns)
          Call this after reading in the training file to make sure no file is present twice (a costly mistake that gives unpredictable results).
static void classifyEverything()
          Classify all papers available from the preprocessed intermediate results.
static void classifyEverything(java.util.List<? extends SVMTrainable> knowns)
          Classify all documents in a way that minimizes peak memory usage.
This is done by first computing the SVM for all 9 categories, and then iterating over the documents (applying all 9 category models to each document before moving on to the next one).
static void dumpProblem(int cat, java.util.List<KnownDoc> knowns, java.lang.String f)
           
static void main(java.lang.String[] args)
           
static void parameterGridSearch(java.util.List<KnownDoc> knowns, int cat)
           
static void runActiveLearner(java.util.List<? extends SVMTrainable> knowns)
           
 java.util.List<? extends SVMTestable> runSVM()
          Runs the SVM on the data passed in via the constructor.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SVM_KERNEL

public static int SVM_KERNEL

SVM_GAMMA

public static double SVM_GAMMA

SVM_COST

public static double SVM_COST

useCustomCostGamma

public static boolean useCustomCostGamma

USE_BOOST

public boolean USE_BOOST

BOOST_FACTOR

public int BOOST_FACTOR
Constructor Detail

SVMManager

public SVMManager(java.util.List<? extends SVMTrainable> trainlist,
                  java.util.List<? extends SVMTestable> testlist)
Parameters:
trainlist - the items used to train the SVM
testlist - the items to be classified
Method Detail

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception

dumpProblem

public static void dumpProblem(int cat,
                               java.util.List<KnownDoc> knowns,
                               java.lang.String f)
                        throws java.io.IOException,
                               org.jdom.JDOMException
Throws:
java.io.IOException
org.jdom.JDOMException

classifyEverything

public static void classifyEverything()
                               throws SVMException
Classify all papers available from the preprocessed intermediate results.

Throws:
SVMException

checkUnique

public static boolean checkUnique(java.util.List<KnownDoc> knowns)
Call this after reading in the training file to make sure no file is present twice (a costly mistake that gives unpredictable results).

Parameters:
knowns - the documents read from the training file
Returns:
true if a duplicate file exists in knowns
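
The duplicate check described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the class DuplicateCheckSketch and the method hasDuplicate are hypothetical names, and plain file-name strings stand in for KnownDoc, whose fields are not documented here. The return convention follows the documented one (true means a duplicate was found).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not part of the SVM package.
class DuplicateCheckSketch {

    // Returns true if any file name occurs more than once in the list.
    // File names are an assumed stand-in for whatever identifies a KnownDoc.
    static boolean hasDuplicate(List<String> fileNames) {
        Set<String> seen = new HashSet<>();
        for (String name : fileNames) {
            // Set.add returns false when the element was already present.
            if (!seen.add(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> clean = Arrays.asList("a.xml", "b.xml", "c.xml");
        List<String> repeated = Arrays.asList("a.xml", "b.xml", "a.xml");
        System.out.println(hasDuplicate(clean));    // prints "false"
        System.out.println(hasDuplicate(repeated)); // prints "true"
    }
}
```

A single pass with a hash set keeps the check linear in the number of training documents, so it is cheap to run every time the training file is loaded.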

classifyEverything

public static void classifyEverything(java.util.List<? extends SVMTrainable> knowns)
                               throws java.io.FileNotFoundException
Classify all documents in a way that minimizes peak memory usage.
This is done by first computing the SVM for all 9 categories, and then iterating over the documents (applying all 9 category models to each document before moving on to the next one).

This is less intuitive than iterating through each category, training its model, and classifying all documents with it.

Furthermore, this method ensures that an inductive SVM is used (the training set should be large enough that a transductive SVM would not help much).

Parameters:
knowns - the labeled documents used to train the SVM for each category
Throws:
java.io.FileNotFoundException
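
The loop order described above (train every category model first, then make one pass over the documents) can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: CategoryModel, trainAll, and classifyAll are hypothetical names, the "training" step is a toy rule rather than a real SVM, and strings stand in for documents. The presumed memory benefit is that only the 9 models plus one document need to be held at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not part of the SVM package.
class DocumentMajorSketch {

    // Assumed stand-in for a trained per-category SVM model.
    interface CategoryModel {
        boolean accepts(String document);
    }

    static final int NUM_CATEGORIES = 9; // category count stated in the text

    // Phase 1: build one model per category before any document is classified.
    // Toy rule instead of real training: category i accepts documents
    // whose text contains the digit i.
    static List<CategoryModel> trainAll() {
        List<CategoryModel> models = new ArrayList<>();
        for (int cat = 0; cat < NUM_CATEGORIES; cat++) {
            final String marker = Integer.toString(cat);
            models.add(doc -> doc.contains(marker));
        }
        return models;
    }

    // Phase 2: visit each document once and apply every category model to it
    // before moving on, so a document never has to be revisited.
    static List<List<Integer>> classifyAll(List<CategoryModel> models,
                                           List<String> documents) {
        List<List<Integer>> labels = new ArrayList<>();
        for (String doc : documents) {
            List<Integer> cats = new ArrayList<>();
            for (int cat = 0; cat < models.size(); cat++) {
                if (models.get(cat).accepts(doc)) {
                    cats.add(cat);
                }
            }
            labels.add(cats);
        }
        return labels;
    }

    public static void main(String[] args) {
        List<CategoryModel> models = trainAll();
        List<String> docs = Arrays.asList("doc-0", "doc-3-7", "doc");
        System.out.println(classifyAll(models, docs)); // prints "[[0], [3, 7], []]"
    }
}
```

Because the models are fixed before any test document is seen, the test documents cannot influence training, which matches the inductive setting mentioned above.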

checkCatResults

public static void checkCatResults(java.util.List<KnownDoc> knownsO,
                                   int cat)
                            throws org.jdom.JDOMException,
                                   java.io.IOException
Throws:
org.jdom.JDOMException
java.io.IOException

parameterGridSearch

public static void parameterGridSearch(java.util.List<KnownDoc> knowns,
                                       int cat)
                                throws org.jdom.JDOMException,
                                       java.io.IOException
Throws:
org.jdom.JDOMException
java.io.IOException

runSVM

public java.util.List<? extends SVMTestable> runSVM()
Runs the SVM on the data passed in via the constructor.


runActiveLearner

public static void runActiveLearner(java.util.List<? extends SVMTrainable> knowns)
                             throws java.io.FileNotFoundException
Throws:
java.io.FileNotFoundException

checkCatResults

public static void checkCatResults(java.util.List<KnownDoc> knowns)
                            throws org.jdom.JDOMException,
                                   java.io.IOException
Throws:
org.jdom.JDOMException
java.io.IOException