cluster
Class Phrase

java.lang.Object
  extended by cluster.Phrase
All Implemented Interfaces:
java.lang.Iterable<Term>

public class Phrase
extends java.lang.Object
implements java.lang.Iterable<Term>

A key component of the unsupervised clustering. Each Phrase holds the primary phrase that labels the cluster, together with any alternative phrases. It also holds the documents that belong to the cluster. Finally, each Phrase acts as a tree node: it points to a parent node (null if there is none) and to all of its children.

Author:
davidc

Constructor Summary
Phrase()
           
Phrase(java.util.List<Term> combinedOverlap)
          Constructor based on just the terms of the primary phrase
Phrase(Phrase t)
          Copy constructor
Phrase(java.lang.String terms)
           
 
Method Summary
 boolean add(Term arg0)
           
 void addAlternativePhrase(java.util.List<Term> ts)
          Adds an alternative phrase; containing it is treated as equivalent to containing every phrase held by this Phrase
 void addChild(Phrase t)
           
 void addCoveredDoc(ClusterDoc d)
          The cover is a Set, so there is no need to check for duplicates before calling this method
 void addCoveredDocs(java.util.Set<ClusterDoc> cover2)
           
 void assignId(int i)
           
 void clear()
           
 void clearChildren()
           
 void clearCover()
           
static Phrase CombinedFactory(Phrase a, Phrase b)
           
 int coverSize()
           
 void deleteSelf()
          Has no effect if this Phrase has no parent
 Term get(int arg0)
           
 java.util.List<Term> getCombinedOverlap(Phrase set2)
          Assumes this Phrase is in front of set2
 java.lang.String getCondensedString()
          Gets all fixed forms of the primary phrase, concatenated together
 java.util.Set<ClusterDoc> getCover()
          Returns the documents that belong directly to this phrase
If a paper contains a child phrase but not this phrase, this method will not return the paper, although getNestedCover() will
 int getCoverSize()
           
 java.util.Set<ClusterDoc> getCoverUnion(Phrase s)
           
 int getId()
           
 java.lang.String getLabel()
           
 int getMaxIndex()
           
 java.util.Set<ClusterDoc> getNestedCover()
          Returns all documents that support or contain this phrase or any of its child phrases
 java.lang.Integer getNumChildren()
          Recursively computes the number of child nodes
 int getNumTermsShared(Phrase t)
           
 Phrase getParent()
           
 java.util.List<java.util.List<Term>> getPhrases()
          Gets all lists of terms that describe this cluster
Containing any one of the lists of terms is equivalent to containing this Phrase (i.e. containing one should be considered as containing all)
 Term getTerm(int j)
           
 java.util.List<Term> getTerms()
           
 java.util.List<Phrase> getTermSetChildren()
           
 java.util.Iterator<Term> iterator()
           
 int numAlternativePhrases()
           
 boolean overLapTerms(Phrase set)
           
 void removeChild(Phrase toMove)
          Sets toMove's parent to null, so assign a new parent after calling this method
 void removeCoveredDoc(TestDoc d)
           
 void setParent(Phrase parent)
           
 void setTerms(java.util.List<Term> terms)
           
 int size()
           
 java.lang.String toString()
          Returns the primary phrase (the label to describe this cluster)
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Phrase

public Phrase()

Phrase

public Phrase(Phrase t)
Copy constructor

Parameters:
t - the Phrase to copy

Phrase

public Phrase(java.util.List<Term> combinedOverlap)
Constructor based on just the terms of the primary phrase

Parameters:
combinedOverlap - the terms of the primary phrase

Phrase

public Phrase(java.lang.String terms)

Method Detail

getCondensedString

public java.lang.String getCondensedString()
Gets all fixed forms of the primary phrase, concatenated together

Returns:
the concatenated fixed forms of the primary phrase

getPhrases

public java.util.List<java.util.List<Term>> getPhrases()
Gets all lists of terms that describe this cluster
Containing any one of the lists of terms is equivalent to containing this Phrase (i.e. containing one should be considered as containing all)

Returns:
all lists of terms that describe this cluster
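
The equivalence rule can be illustrated with a small sketch. It does not use the real Phrase or Term classes: terms are modelled as plain strings, AlternativePhraseSketch and containsPhrase are hypothetical names, and reading "containing" as "appears as a contiguous run of terms" is an assumption that this page does not confirm.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class AlternativePhraseSketch {

    // Hypothetical helper: true if the document contains ANY of the
    // alternative term lists as a contiguous run of tokens (assumed semantics).
    static boolean containsPhrase(List<String> docTokens, List<List<String>> phrases) {
        for (List<String> phrase : phrases) {
            if (!phrase.isEmpty() && Collections.indexOfSubList(docTokens, phrase) >= 0) {
                return true; // one matching alternative counts as matching all
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<List<String>> phrases = Arrays.asList(
                Arrays.asList("support", "vector", "machine"), // primary phrase
                Arrays.asList("svm"));                         // alternative phrase
        List<String> doc = Arrays.asList("we", "train", "an", "svm", "classifier");
        System.out.println(containsPhrase(doc, phrases)); // prints true
    }
}
```

The document matches only the alternative phrase, yet it is treated as containing the Phrase as a whole.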

addAlternativePhrase

public void addAlternativePhrase(java.util.List<Term> ts)
Adds an alternative phrase; containing it is treated as equivalent to containing every phrase held by this Phrase

Parameters:
ts - the terms of the alternative phrase

addChild

public void addChild(Phrase t)

getTermSetChildren

public java.util.List<Phrase> getTermSetChildren()

add

public boolean add(Term arg0)

get

public Term get(int arg0)

size

public int size()

iterator

public java.util.Iterator<Term> iterator()
Specified by:
iterator in interface java.lang.Iterable<Term>

toString

public java.lang.String toString()
Returns the primary phrase (the label to describe this cluster)

Overrides:
toString in class java.lang.Object

getMaxIndex

public int getMaxIndex()

addCoveredDoc

public void addCoveredDoc(ClusterDoc d)
The cover is a Set, so there is no need to check for duplicates before calling this method

Parameters:
d - the document to add to the cover
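
Because the cover is a Set, repeated additions are harmless. The sketch below shows the idea with a plain java.util.HashSet of string IDs standing in for the real cover of ClusterDoc objects; the CoverSetSketch class and the document IDs are hypothetical. For the real class, de-duplication would depend on how ClusterDoc defines equals and hashCode, which this page does not show.

```java
import java.util.HashSet;
import java.util.Set;

public class CoverSetSketch {

    // Stand-in for the cover: a set of document IDs.
    private final Set<String> cover = new HashSet<>();

    // No contains() check is needed; Set.add ignores duplicates.
    void addCoveredDoc(String docId) {
        cover.add(docId);
    }

    int coverSize() {
        return cover.size();
    }

    public static void main(String[] args) {
        CoverSetSketch phrase = new CoverSetSketch();
        phrase.addCoveredDoc("doc-1");
        phrase.addCoveredDoc("doc-1"); // duplicate, silently ignored
        phrase.addCoveredDoc("doc-2");
        System.out.println(phrase.coverSize()); // prints 2
    }
}
```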

getNestedCover

public java.util.Set<ClusterDoc> getNestedCover()
Returns all documents that support or contain this phrase or any of its child phrases

Returns:
the documents covered by this phrase or by any of its child phrases

getTerms

public java.util.List<Term> getTerms()

CombinedFactory

public static Phrase CombinedFactory(Phrase a,
                                     Phrase b)

setTerms

public void setTerms(java.util.List<Term> terms)

coverSize

public int coverSize()

getNumTermsShared

public int getNumTermsShared(Phrase t)

getCoverUnion

public java.util.Set<ClusterDoc> getCoverUnion(Phrase s)

overLapTerms

public boolean overLapTerms(Phrase set)

getCombinedOverlap

public java.util.List<Term> getCombinedOverlap(Phrase set2)
Assumes this Phrase is in front of set2

Parameters:
set2 -
Returns:
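
This page does not spell out what "in front" means. One plausible reading, offered purely as an assumption, is that the tail of this phrase overlaps the head of set2 and the result is the merged term sequence. The sketch below illustrates that reading with plain strings; CombinedOverlapSketch and combineOverlap are hypothetical names, and returning an empty list when nothing overlaps is also an assumption rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CombinedOverlapSketch {

    // Assumed behaviour: find the longest suffix of 'front' that equals a
    // prefix of 'back', then append the non-overlapping rest of 'back'.
    static List<String> combineOverlap(List<String> front, List<String> back) {
        int max = Math.min(front.size(), back.size());
        for (int len = max; len > 0; len--) {
            List<String> suffix = front.subList(front.size() - len, front.size());
            List<String> prefix = back.subList(0, len);
            if (suffix.equals(prefix)) {
                List<String> merged = new ArrayList<>(front);
                merged.addAll(back.subList(len, back.size()));
                return merged;
            }
        }
        return new ArrayList<>(); // no overlap found (assumed convention)
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("support", "vector");
        List<String> b = Arrays.asList("vector", "machine");
        System.out.println(combineOverlap(a, b)); // prints [support, vector, machine]
    }
}
```

Under this reading the argument order matters: swapping a and b above finds no overlap, which may be why the original note stresses which phrase is in front.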

getTerm

public Term getTerm(int j)

addCoveredDocs

public void addCoveredDocs(java.util.Set<ClusterDoc> cover2)

clear

public void clear()

assignId

public void assignId(int i)

getId

public int getId()

getParent

public Phrase getParent()

setParent

public void setParent(Phrase parent)

clearChildren

public void clearChildren()

removeChild

public void removeChild(Phrase toMove)
Sets toMove's parent to null, so assign a new parent after calling this method

Parameters:
toMove - the child to detach from this Phrase
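
The detach-then-reattach pattern described here can be sketched with a hypothetical stand-in node. ReparentSketch, Node, and move are illustrative names, not part of the cluster package, and whether the real addChild also sets the child's parent is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class ReparentSketch {

    // Hypothetical stand-in for a Phrase tree node.
    static class Node {
        final String label;
        Node parent; // null for a root or a detached node
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        // Assumption: adding a child also records this node as its parent.
        void addChild(Node c) {
            children.add(c);
            c.parent = this;
        }

        // Mirrors the documented contract: the removed child's parent becomes null.
        void removeChild(Node c) {
            if (children.remove(c)) {
                c.parent = null;
            }
        }
    }

    // Moves 'child' from its current parent (if any) to 'newParent'.
    static void move(Node child, Node newParent) {
        if (child.parent != null) {
            child.parent.removeChild(child);
        }
        newParent.addChild(child); // assign the new parent right away
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        a.addChild(c);
        move(c, b);
        System.out.println(c.parent.label + " " + a.children.size()); // prints b 0
    }
}
```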

removeCoveredDoc

public void removeCoveredDoc(TestDoc d)

clearCover

public void clearCover()

getCover

public java.util.Set<ClusterDoc> getCover()
Returns the documents that belong directly to this phrase
If a paper contains a child phrase but not this phrase, this method will not return the paper, although getNestedCover() will

Returns:
the documents that belong directly to this phrase
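
The difference between the direct cover and the nested cover can be sketched as follows. This is a minimal illustration under stated assumptions, not the class's actual implementation: NestedCoverSketch and Node are hypothetical stand-ins, and documents are represented by string IDs instead of ClusterDoc objects.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NestedCoverSketch {

    // Hypothetical stand-in for a Phrase tree node.
    static class Node {
        final Set<String> cover = new HashSet<>(); // documents matched by this node itself
        final List<Node> children = new ArrayList<>();

        Set<String> getCover() {
            return cover;
        }

        // Union of this node's cover and the nested cover of every descendant.
        Set<String> getNestedCover() {
            Set<String> all = new HashSet<>(cover);
            for (Node child : children) {
                all.addAll(child.getNestedCover());
            }
            return all;
        }
    }

    public static void main(String[] args) {
        Node parent = new Node();
        Node child = new Node();
        parent.children.add(child);
        parent.cover.add("paper-1");
        child.cover.add("paper-2"); // contains only the child phrase
        System.out.println(parent.getCover().size());       // prints 1
        System.out.println(parent.getNestedCover().size()); // prints 2
    }
}
```

Here paper-2 is absent from the parent's direct cover but appears in its nested cover.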

deleteSelf

public void deleteSelf()
Has no effect if this Phrase has no parent

numAlternativePhrases

public int numAlternativePhrases()

getCoverSize

public int getCoverSize()

getLabel

public java.lang.String getLabel()

getNumChildren

public java.lang.Integer getNumChildren()
Recursively computes the number of child nodes

Returns:
the number of child nodes
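
A recursive count of this kind might look like the sketch below. It assumes that "recursively" means every descendant is counted, not only direct children; NumChildrenSketch, Node, and countDescendants are hypothetical names and do not come from the cluster package.

```java
import java.util.ArrayList;
import java.util.List;

public class NumChildrenSketch {

    // Hypothetical stand-in for a Phrase tree node.
    static class Node {
        final List<Node> children = new ArrayList<>();

        // Adds a child and returns this node so calls can be chained.
        Node add(Node c) {
            children.add(c);
            return this;
        }

        // Counts direct children plus everything below them.
        int countDescendants() {
            int n = children.size();
            for (Node c : children) {
                n += c.countDescendants();
            }
            return n;
        }
    }

    public static void main(String[] args) {
        Node leaf = new Node();
        Node mid = new Node().add(leaf);
        Node root = new Node().add(mid).add(new Node());
        System.out.println(root.countDescendants()); // prints 3
    }
}
```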