java.lang.Object
  websearch.SearchResultDoc
public class SearchResultDoc
A document that implements the methods needed for clustering.
It also includes some functionality for cleaning up the snippets
returned by search engines.
| Constructor Summary | |
|---|---|
| SearchResultDoc(java.lang.String title, java.lang.String snippet, java.lang.String url, VectorManager vm) | |
| Method Summary | | |
|---|---|---|
| void | addTermSetCount(Phrase termSet, int n) | This document should record how frequently this termSet occurred. |
| void | checkSourceExists() | Throws an exception if this file won't be clusterable. |
| static java.lang.String | clean(java.lang.String s) | |
| int | compareTo(ClusterDoc arg0) | Assumes that arg0 is a SearchResultDoc. |
| void | destroyLocalDoc() | After finding how often all the phrases are in this doc, this method should allow the supporting document to be released to free up memory. |
| java.lang.String | getDocId() | |
| java.lang.String[][] | getFixedWordSentences() | Each String should be fixed by VectorManager before being returned. |
| int[][] | getIdxSentences(VectorManager vm) | Each entry represents the integer. |
| int | getNumInstancesOfTermSet(Phrase s) | Each document should get a unique id |
| java.lang.String[][] | getSentences() | |
| java.lang.String | getSnippet() | |
| java.lang.String | getSrc() | |
| double | getTermSetsSupported() | |
| java.lang.String | getTitle() | |
| java.lang.String | getUrl() | |
| boolean | isJunkPhrase(java.lang.String phrase) | Added to allow differentiation between phrases of scientific articles and general search results. |
| void | loadWindowedDoc() | Initially, the idea was to support proximity windows (eg. |
| void | setUrl(java.lang.String url) | |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public SearchResultDoc(java.lang.String title,
java.lang.String snippet,
java.lang.String url,
VectorManager vm)
| Method Detail |
|---|
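public void addTermSetCount(Phrase termSet,
                            int n)
Description copied from interface: ClusterDoc
This document should record how frequently this termSet occurred.
Specified by: addTermSetCount in interface ClusterDoc

public int getNumInstancesOfTermSet(Phrase s)
Description copied from interface: ClusterDoc
Each document should get a unique id
Specified by: getNumInstancesOfTermSet in interface ClusterDoc

public double getTermSetsSupported()
Specified by: getTermSetsSupported in interface ClusterDoc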
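The contract between addTermSetCount and getNumInstancesOfTermSet can be pictured as a per-document map from term set to count. The sketch below is not the websearch implementation: TermSetCounter is a hypothetical stand-in, a plain String plays the role of Phrase (whose API is not shown on this page), and both the accumulation across repeated calls and the zero result for unseen term sets are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the counting part of a ClusterDoc; not the real class.
final class TermSetCounter {
    // Assumption: a String key stands in for the Phrase type.
    private final Map<String, Integer> counts = new HashMap<>();

    // Record that termSet occurred n more times (assumed to accumulate).
    void addTermSetCount(String termSet, int n) {
        counts.merge(termSet, n, Integer::sum);
    }

    // Assumed to return 0 for a term set that was never recorded.
    int getNumInstancesOfTermSet(String termSet) {
        return counts.getOrDefault(termSet, 0);
    }

    public static void main(String[] args) {
        TermSetCounter doc = new TermSetCounter();
        doc.addTermSetCount("search engine", 2);
        doc.addTermSetCount("search engine", 1);
        System.out.println(doc.getNumInstancesOfTermSet("search engine")); // prints 3
        System.out.println(doc.getNumInstancesOfTermSet("cluster"));       // prints 0
    }
}
```

Keeping only the counts, rather than the text they were derived from, is what would make it possible to release the underlying document later, as destroyLocalDoc describes.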
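
public void checkSourceExists()
                       throws java.io.FileNotFoundException
Description copied from interface: ClusterDoc
Throws an exception if this file won't be clusterable.
Specified by: checkSourceExists in interface ClusterDoc
Throws: java.io.FileNotFoundException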
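
public java.lang.String[][] getFixedWordSentences()
                                           throws java.io.FileNotFoundException,
                                                  java.io.IOException
Description copied from interface: ClusterDoc
Each String should be fixed by VectorManager before being returned.
Specified by: getFixedWordSentences in interface ClusterDoc
Throws: java.io.FileNotFoundException, java.io.IOException

public int[][] getIdxSentences(VectorManager vm)
Description copied from interface: ClusterDoc
Each entry represents the integer.
Specified by: getIdxSentences in interface ClusterDoc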
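The int[][] return type of getIdxSentences mirrors the String[][] shape of the sentence methods, which suggests one integer per word. The following is a minimal sketch under that assumption, not the documented behavior: IdxSentencesSketch is a hypothetical class whose internal map stands in for whatever word-to-index lookup VectorManager provides, and assigning indices in order of first appearance is an illustrative choice.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the word-to-index role assumed of VectorManager.
final class IdxSentencesSketch {
    private final Map<String, Integer> vocabulary = new HashMap<>();

    // Return the index of a word, assigning the next free index on first sight.
    int indexOf(String word) {
        Integer idx = vocabulary.get(word);
        if (idx == null) {
            idx = vocabulary.size();
            vocabulary.put(word, idx);
        }
        return idx;
    }

    // Convert sentences of words into sentences of indices, preserving shape.
    int[][] toIdxSentences(String[][] sentences) {
        int[][] result = new int[sentences.length][];
        for (int i = 0; i < sentences.length; i++) {
            result[i] = new int[sentences[i].length];
            for (int j = 0; j < sentences[i].length; j++) {
                result[i][j] = indexOf(sentences[i][j]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] sentences = { {"fast", "search"}, {"search", "results"} };
        int[][] idx = new IdxSentencesSketch().toIdxSentences(sentences);
        System.out.println(Arrays.deepToString(idx)); // prints [[0, 1], [1, 2]]
    }
}
```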
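
public java.lang.String[][] getSentences()
                                  throws java.io.FileNotFoundException,
                                         java.io.IOException
Specified by: getSentences in interface ClusterDoc
Throws: java.io.FileNotFoundException, java.io.IOException

public void loadWindowedDoc()
Description copied from interface: ClusterDoc
Initially, the idea was to support proximity windows (eg.
Specified by: loadWindowedDoc in interface ClusterDoc

public void destroyLocalDoc()
Description copied from interface: ClusterDoc
After finding how often all the phrases are in this doc, this method should allow the supporting document to be released to free up memory.
Specified by: destroyLocalDoc in interface ClusterDoc

public int compareTo(ClusterDoc arg0)
Assumes that arg0 is a SearchResultDoc.
Specified by: compareTo in interface java.lang.Comparable<ClusterDoc>
Parameters: arg0 -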
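The note that arg0 is assumed to be a SearchResultDoc has a practical consequence: comparing against any other ClusterDoc implementation would presumably fail at the cast. A minimal sketch of that pattern follows. SketchClusterDoc and SketchResultDoc are hypothetical stand-ins, and ordering by URL is an assumption, since the real ordering key is not documented here.

```java
// Hypothetical stand-ins; the real ClusterDoc and SearchResultDoc are not reproduced here.
interface SketchClusterDoc extends Comparable<SketchClusterDoc> {
}

final class SketchResultDoc implements SketchClusterDoc {
    private final String url;

    SketchResultDoc(String url) {
        this.url = url;
    }

    @Override
    public int compareTo(SketchClusterDoc other) {
        // The cast encodes the documented assumption; any other implementation
        // passed here fails with a ClassCastException.
        SketchResultDoc that = (SketchResultDoc) other;
        // Assumption: order by URL. The real ordering key is not documented.
        return this.url.compareTo(that.url);
    }

    public static void main(String[] args) {
        SketchResultDoc a = new SketchResultDoc("http://a.example");
        SketchResultDoc b = new SketchResultDoc("http://b.example");
        System.out.println(a.compareTo(b) < 0); // prints true
    }
}
```

In a collection that mixes document types, callers would need to sort each type separately or supply their own Comparator.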
public static java.lang.String clean(java.lang.String s)
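This page does not describe what clean does beyond the class-level remark about cleaning up search engine snippets. As an illustration only, one plausible form of such cleaning is stripping markup and collapsing whitespace. SnippetCleanSketch below is a hypothetical class, and its rules are assumptions, not the behavior of SearchResultDoc.clean.

```java
// Hypothetical sketch of snippet cleaning; not the behavior of SearchResultDoc.clean.
final class SnippetCleanSketch {
    static String clean(String s) {
        if (s == null) {
            return "";
        }
        // Replace markup such as <b>...</b> with a space.
        String noTags = s.replaceAll("<[^>]*>", " ");
        // Collapse runs of whitespace and trim the ends.
        return noTags.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(clean("<b>Fast</b>  search\n results")); // prints Fast search results
    }
}
```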
public java.lang.String getDocId()
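public boolean isJunkPhrase(java.lang.String phrase)
Description copied from interface: ClusterDoc
Added to allow differentiation between phrases of scientific articles and general search results.
Specified by: isJunkPhrase in interface ClusterDoc
Parameters: phrase - Space-separated words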
public java.lang.String getSrc()
public java.lang.String getUrl()
public void setUrl(java.lang.String url)
public java.lang.String getSnippet()
public java.lang.String getTitle()