java.lang.Object
  Util.TestDoc
public class TestDoc
Represents documents that we wish to classify or cluster.
For classification:
| Constructor Summary | |
|---|---|
TestDoc(java.lang.String wbid)
|
|
| Method Summary | |
|---|---|
void |
addTermSetCount(Phrase termSet,
int n)
This document should record how frequently this termSet occurred |
void |
checkSourceExists()
Throw an exception if this file won't be cluster-able |
int |
compareTo(ClusterDoc arg0)
|
int |
compareTo(TestDoc arg0)
|
void |
destroyLocalDoc()
After finding how often all the phrases are in this doc, this method should allow the supporting document to be released to free up memory. |
java.lang.StringBuffer |
getAbstract()
|
java.lang.String |
getDocId()
|
double |
getExactMembership(int cat)
|
java.lang.String[][] |
getFixedWordSentences()
Each String should be fixed by VectorManager before being returned |
int[][] |
getIdxSentences(VectorManager vm)
Each entry represents the integer. |
int |
getNumInstancesOfTermSet(Phrase s)
Each document should get a unique id |
java.lang.String[][] |
getSentences()
|
double |
getTermSetsSupported()
|
java.lang.String |
getTitle()
|
java.lang.String |
getWbid()
|
java.util.List<java.lang.String>[] |
getWindows()
|
XMLDoc |
getXMLDoc()
|
boolean |
isJunkPhrase(java.lang.String phrase)
Added to allow differentiation between phrases of scientific articles and general search results |
boolean |
isMemberOf(int cat)
|
void |
loadWindowedDoc()
Initially, the idea was to support proximity windows (eg. |
void |
readyDoc(VectorManager titlevm,
VectorManager vm,
java.lang.String src)
|
void |
readyDocLocally(VectorManager titleVM,
VectorManager articleVm,
java.lang.String src)
|
void |
readyTitleOnly(VectorManager titleVM)
|
void |
registerMembership(int cat,
double val,
int source)
Source was added to allow classification by title and article simultaneously and be able to weight them at the end. |
void |
setWbid(java.lang.String wbid)
|
| Methods inherited from class java.lang.Object |
|---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public TestDoc(java.lang.String wbid)
| Method Detail |
|---|
public void registerMembership(int cat,
double val,
int source)
registerMembership in interface SVMTestable
cat -
val -
source - When guaranteeing that a paper belongs to at least one category, this determination is done for results from source 0 (i.e., this functionality can be determined by never setting source to 0)
public double getExactMembership(int cat)
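The `source` parameter of `registerMembership` is described as a way to classify by title and by article at the same time and weight the two at the end. The following is a minimal sketch of that idea, not the actual `TestDoc` implementation: the class `MembershipSketch`, the source codes `SOURCE_TITLE` and `SOURCE_ARTICLE`, the method `weightedMembership`, and the rule that a positive combined score means membership are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for TestDoc: keeps one score per (category, source)
// pair and combines the sources with caller-supplied weights.
class MembershipSketch {
    // Assumed source codes; the original documentation does not name them.
    static final int SOURCE_TITLE = 1;
    static final int SOURCE_ARTICLE = 2;

    // category -> (source -> score)
    private final Map<Integer, Map<Integer, Double>> scores = new HashMap<>();

    // Record the score that one source assigned to one category.
    void registerMembership(int cat, double val, int source) {
        scores.computeIfAbsent(cat, k -> new HashMap<>()).put(source, val);
    }

    // Weighted sum over all sources registered for the category.
    // A source without a weight contributes nothing.
    double weightedMembership(int cat, Map<Integer, Double> weights) {
        Map<Integer, Double> bySource = scores.get(cat);
        if (bySource == null) {
            return 0.0;
        }
        double total = 0.0;
        for (Map.Entry<Integer, Double> e : bySource.entrySet()) {
            total += weights.getOrDefault(e.getKey(), 0.0) * e.getValue();
        }
        return total;
    }

    // Assumed decision rule: a positive combined score means membership.
    boolean isMemberOf(int cat, Map<Integer, Double> weights) {
        return weightedMembership(cat, weights) > 0.0;
    }
}
```

Keeping the per-source scores separate until the final query is what allows the weights to be chosen after classification has run, rather than being fixed at registration time.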
public boolean isJunkPhrase(java.lang.String phrase)
ClusterDoc
isJunkPhrase in interface ClusterDoc
phrase - Space-separated words
public boolean isMemberOf(int cat)
public java.lang.String getWbid()
getWbid in interface SVMTestable
getWbid in interface DocIdentifiable
public void setWbid(java.lang.String wbid)
public XMLDoc getXMLDoc()
public java.lang.StringBuffer getAbstract()
public java.lang.String getTitle()
public void readyDoc(VectorManager titlevm,
VectorManager vm,
java.lang.String src)
titlevm -
vm -
src - art or abs
public void readyTitleOnly(VectorManager titleVM)
public void readyDocLocally(VectorManager titleVM,
VectorManager articleVm,
java.lang.String src)
readyDocLocally in interface SVMTestable
public void destroyLocalDoc()
ClusterDoc
destroyLocalDoc in interface ClusterDoc
public int compareTo(TestDoc arg0)
public void checkSourceExists()
throws java.io.FileNotFoundException
ClusterDoc
checkSourceExists in interface ClusterDoc
java.io.FileNotFoundException
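`checkSourceExists` is documented only as throwing an exception when the file cannot be clustered. One plausible shape for such a check is sketched below. The class `SourceCheckSketch`, its `source` field, and the choice of "readable regular file" as the criterion are assumptions; the real method may test something different.

```java
import java.io.FileNotFoundException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for the part of TestDoc that knows where its source lives.
class SourceCheckSketch {
    // Assumed: the document is backed by a single file on disk.
    private final Path source;

    SourceCheckSketch(Path source) {
        this.source = source;
    }

    // Fail fast, before any clustering work starts, if the source is unusable.
    void checkSourceExists() throws FileNotFoundException {
        if (!Files.isRegularFile(source) || !Files.isReadable(source)) {
            throw new FileNotFoundException("Source not available: " + source);
        }
    }
}
```

Calling a check like this once up front means later methods such as `getSentences()` are less likely to fail halfway through a clustering run.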
public java.lang.String[][] getFixedWordSentences()
throws java.io.FileNotFoundException,
java.io.IOException
ClusterDoc
getFixedWordSentences in interface ClusterDoc
java.io.FileNotFoundException
java.io.IOException
public int[][] getIdxSentences(VectorManager vm)
ClusterDoc
getIdxSentences in interface ClusterDoc
public java.lang.String[][] getSentences()
throws java.io.FileNotFoundException,
java.io.IOException
getSentences in interface ClusterDoc
java.io.FileNotFoundException
java.io.IOException
public java.util.List<java.lang.String>[] getWindows()
throws java.io.FileNotFoundException,
java.io.IOException
java.io.FileNotFoundException
java.io.IOException
public void loadWindowedDoc()
ClusterDoc
loadWindowedDoc in interface ClusterDoc
public int compareTo(ClusterDoc arg0)
compareTo in interface java.lang.Comparable<ClusterDoc>
public void addTermSetCount(Phrase termSet,
int n)
ClusterDoc
addTermSetCount in interface ClusterDoc
public int getNumInstancesOfTermSet(Phrase s)
ClusterDoc
getNumInstancesOfTermSet in interface ClusterDoc
public double getTermSetsSupported()
getTermSetsSupported in interface ClusterDoc
public java.lang.String getDocId()
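The summary says that a document records how often each term set occurs, and that `destroyLocalDoc` can release the supporting document once all phrases have been counted. A rough sketch of how those pieces could fit together follows. It is not the real class: `TermSetCountSketch` is hypothetical, a `String` stands in for `Phrase`, the supporting document is modeled as a `String[][]`, `hasLocalDoc` is an invented helper, and repeated calls to `addTermSetCount` are assumed to accumulate.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the term-set bookkeeping in TestDoc.
class TermSetCountSketch {
    // Term set -> number of occurrences; String stands in for Phrase.
    private final Map<String, Integer> counts = new HashMap<>();

    // Assumed representation of the supporting document: sentences of words.
    private String[][] sentences;

    TermSetCountSketch(String[][] sentences) {
        this.sentences = sentences;
    }

    // Assumption: repeated calls for the same term set add up.
    void addTermSetCount(String termSet, int n) {
        counts.merge(termSet, n, Integer::sum);
    }

    // Term sets that were never recorded count as zero occurrences.
    int getNumInstancesOfTermSet(String termSet) {
        return counts.getOrDefault(termSet, 0);
    }

    // Drop the reference to the bulky text so it can be garbage collected;
    // the recorded counts are kept.
    void destroyLocalDoc() {
        sentences = null;
    }

    // Invented helper to observe whether the text is still held.
    boolean hasLocalDoc() {
        return sentences != null;
    }
}
```

Because the counts live in their own map, they remain available after the text is released, which is what makes freeing the supporting document safe once counting is done.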