gate.corpora
Class RepositioningInfo
java.lang.Object
|
+--java.util.AbstractCollection
|
+--java.util.AbstractList
|
+--java.util.ArrayList
|
+--gate.corpora.RepositioningInfo
- All Implemented Interfaces:
- Cloneable, Collection, List, Serializable
- public class RepositioningInfo
- extends ArrayList
RepositioningInfo keeps information about the correspondence between positions
in the original and the extracted document content. With this information
the class can compute the correspondence either in the strict
way (returning -1 where there is no correspondence)
or in the "flow" way (returning the nearest computable position).
- See Also:
- Serialized Form
Field Summary |
(package private) static long |
serialVersionUID
Freeze the serialization UID. |
Method Summary |
void |
addPositionInfo(long origPos,
long origLength,
long currPos,
long currLength)
Create a new position information record. |
void |
correctInformation(long originalPos,
long origLen,
long newLen)
Correct the RepositioningInfo structure for shrink/expand changes. |
void |
correctInformationOriginalMove(long originalPos,
long moveLen)
Correct the original position information in the records. |
long |
getExtractedPos(long absPos)
Compute position in extracted content by position in the original content. |
long |
getExtractedPosFlow(long absPos)
Not finished yet |
int |
getIndexByOriginalPosition(long absPos)
Return the index of the position info record containing absPos.
If there is no such record, return -1. |
int |
getIndexByOriginalPositionFlow(long absPos)
Return the index of the position info record containing absPos,
or the index of the record before this position. |
long |
getOriginalPos(long relPos)
|
long |
getOriginalPos(long relPos,
boolean afterChar)
Compute position in original content by position in the extracted content. |
long |
getOriginalPosFlow(long relPos)
Not finished yet |
Methods inherited from class java.util.ArrayList |
add, add, addAll, addAll, clear, clone, contains, ensureCapacity, get, indexOf, isEmpty, lastIndexOf, RangeCheck, readObject, remove, removeRange, set, size, toArray, toArray, trimToSize, writeObject |
serialVersionUID
static final long serialVersionUID
- Freeze the serialization UID.
RepositioningInfo
public RepositioningInfo()
- Default constructor
addPositionInfo
public void addPositionInfo(long origPos,
long origLength,
long currPos,
long currLength)
- Create a new position information record.
getExtractedPos
public long getExtractedPos(long absPos)
- Compute position in extracted content by position in the original content.
If there is no correspondence return -1.
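The strict lookup described above can be illustrated with a minimal sketch. This is not GATE's implementation: `SpanRecord`, `StrictLookupSketch`, and `extractedPos` are hypothetical names, the span end is assumed to be exclusive, and every record is assumed to have the same length in the original and the extracted content.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one position record; not GATE's own class. */
final class SpanRecord {
    final long origPos, origLen, currPos, currLen;

    SpanRecord(long origPos, long origLen, long currPos, long currLen) {
        this.origPos = origPos;
        this.origLen = origLen;
        this.currPos = currPos;
        this.currLen = currLen;
    }
}

/** Illustrative strict mapping from original to extracted positions. */
final class StrictLookupSketch {
    private final List<SpanRecord> records = new ArrayList<>();

    void add(long origPos, long origLen, long currPos, long currLen) {
        records.add(new SpanRecord(origPos, origLen, currPos, currLen));
    }

    /** Returns the extracted position for absPos, or -1 if no record covers it. */
    long extractedPos(long absPos) {
        for (SpanRecord r : records) {
            // Assumption: a record covers [origPos, origPos + origLen).
            if (absPos >= r.origPos && absPos < r.origPos + r.origLen) {
                // Assumption: origLen == currLen, so the offset carries over unchanged.
                return r.currPos + (absPos - r.origPos);
            }
        }
        return -1; // strict mode: no correspondence
    }
}
```

With records (3, 5, 0, 5) and (16, 5, 6, 5), original position 7 maps to extracted position 4, while position 8 lies between the two records and yields -1.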
getOriginalPos
public long getOriginalPos(long relPos)
getOriginalPos
public long getOriginalPos(long relPos,
boolean afterChar)
- Compute position in original content by position in the extracted content.
If there is no correspondence return -1.
getExtractedPosFlow
public long getExtractedPosFlow(long absPos)
- Not finished yet
getOriginalPosFlow
public long getOriginalPosFlow(long relPos)
- Not finished yet
getIndexByOriginalPosition
public int getIndexByOriginalPosition(long absPos)
- Return the index of the position info record containing absPos.
If there is no such record, return -1.
getIndexByOriginalPositionFlow
public int getIndexByOriginalPositionFlow(long absPos)
- Return the index of the position info record containing absPos,
or the index of the record before this position.
The result is -1 if the position is before the first record.
The result is size() if the position is after the last record.
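The return values listed above can be sketched as follows. This is an illustration only, not GATE's code: `FlowIndexSketch` and its methods are hypothetical names, records are assumed to be sorted by original position and non-overlapping, and the end of each span is assumed to be exclusive.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative "flow" index lookup; hypothetical, not GATE's implementation. */
final class FlowIndexSketch {
    // Each entry is {origPos, origLen}; assumed sorted by origPos.
    private final List<long[]> records = new ArrayList<>();

    void add(long origPos, long origLen) {
        records.add(new long[] {origPos, origLen});
    }

    int size() {
        return records.size();
    }

    int indexByOriginalPositionFlow(long absPos) {
        for (int i = 0; i < records.size(); i++) {
            long start = records.get(i)[0];
            long end = start + records.get(i)[1]; // exclusive end (assumption)
            if (absPos < start) {
                // In a gap: the record before this one, or -1 before the first record.
                return i - 1;
            }
            if (absPos < end) {
                return i; // inside record i
            }
        }
        return records.size(); // after the last record
    }
}
```

With records covering original positions 3..7 and 16..20, position 0 gives -1, position 10 (in the gap) gives 0, position 16 gives 1, and position 21 gives 2, which equals size().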
correctInformation
public void correctInformation(long originalPos,
long origLen,
long newLen)
- Correct the RepositioningInfo structure for shrink/expand changes.
Normally the text pieces have the same size in both the original text and
the extracted text, but in some cases there are nonlinear substitutions.
For example, the sequence "&lt;" is converted to "<".
The correction splits the corresponding PositionInfo structure into
3 new records: before the correction, the correction record, and after the correction.
The front and end records keep the same form as the original record,
m_origLength == m_currLength, while the middle record has different
values because of the shrink/expand change. All records after this middle
record should be corrected by the difference between these values:
every m_currPos after the current information record should be reduced
by (origLen - newLen), i.e.
m_currPos -= origLen - newLen;
- Parameters:
originalPos - Position of the changed text in the original content.
origLen - Length of the changed piece of text in the original content.
newLen - Length of the new piece of text substituting the original piece.
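The split-and-shift step described for this method can be sketched as below. It is a simplified illustration, not GATE's implementation: `MutableSpan`, `CorrectionSketch`, and `correct` are hypothetical names, the changed text is assumed to fall entirely inside one record whose original and current lengths are equal, and empty front or end records are simply omitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mutable position record; not GATE's PositionInfo. */
final class MutableSpan {
    long origPos, origLen, currPos, currLen;

    MutableSpan(long origPos, long origLen, long currPos, long currLen) {
        this.origPos = origPos;
        this.origLen = origLen;
        this.currPos = currPos;
        this.currLen = currLen;
    }
}

/** Illustrative shrink/expand correction. */
final class CorrectionSketch {
    final List<MutableSpan> records = new ArrayList<>();

    void correct(long originalPos, long origLen, long newLen) {
        for (int i = 0; i < records.size(); i++) {
            MutableSpan r = records.get(i);
            boolean covers = originalPos >= r.origPos
                    && originalPos + origLen <= r.origPos + r.origLen;
            if (!covers) {
                continue;
            }
            long frontLen = originalPos - r.origPos;       // unchanged text before the change
            long tailLen = r.origLen - frontLen - origLen; // unchanged text after the change
            long currStart = r.currPos + frontLen;         // change start in extracted content

            List<MutableSpan> parts = new ArrayList<>();
            if (frontLen > 0) {
                parts.add(new MutableSpan(r.origPos, frontLen, r.currPos, frontLen));
            }
            // Middle record: the only one where origLen and currLen differ.
            parts.add(new MutableSpan(originalPos, origLen, currStart, newLen));
            if (tailLen > 0) {
                parts.add(new MutableSpan(originalPos + origLen, tailLen,
                        currStart + newLen, tailLen));
            }
            records.remove(i);
            records.addAll(i, parts);

            // Shift every later record: m_currPos -= origLen - newLen.
            for (int j = i + parts.size(); j < records.size(); j++) {
                records.get(j).currPos -= origLen - newLen;
            }
            return;
        }
    }
}
```

Starting from records (0, 20, 0, 20) and (25, 10, 20, 10), calling `correct(5, 4, 1)` for a four-character entity replaced by one character leaves (0, 5, 0, 5), (5, 4, 5, 1), (9, 11, 6, 11), and (25, 10, 17, 10).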
correctInformationOriginalMove
public void correctInformationOriginalMove(long originalPos,
long moveLen)
- Correct the original position information in the records when some text
is shrunk/expanded by the parser. This method is used to correct for the
substitution of "\r\n" with "\n".
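One possible reading of this correction is sketched below. The exact rule is not stated in this documentation, so the sketch rests on explicit assumptions: records whose original position lies after originalPos have moveLen added to that position, and nothing else changes. `OriginalMoveSketch` and `moveOriginal` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative shift of original positions; assumptions are noted inline. */
final class OriginalMoveSketch {
    // Each entry is {origPos, origLen, currPos, currLen}.
    final List<long[]> records = new ArrayList<>();

    void moveOriginal(long originalPos, long moveLen) {
        for (long[] r : records) {
            // Assumption: only records starting after originalPos are shifted.
            if (r[0] > originalPos) {
                r[0] += moveLen;
            }
        }
    }
}
```

For example, if the parser saw a single "\n" at position 9 where the original file had "\r\n", calling `moveOriginal(9, 1)` on records starting at original positions 0 and 10 leaves the first untouched and moves the second to 11.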