ParsedText (openpdf 1.2.7 API)

java.lang.Object
- com.lowagie.text.pdf.parser.ParsedTextImpl
- - com.lowagie.text.pdf.parser.ParsedText

All Implemented Interfaces:

TextAssemblyBuffer
```
public class ParsedText
extends ParsedTextImpl
```
Author:

dgd

Field Summary

Fields
Modifier and Type	Field and Description
`protected GraphicsState`	`gs`
`protected PdfString`	`pdfText` Retains the original PdfString, because we need to distinguish between the code points it contains and the standard Java (Unicode) strings that actually represent the content of this text.
`protected Matrix`	`textToUserSpaceTransformMatrix`

Method Summary

Modifier and Type	Method and Description
`void`	`accumulate(TextAssembler p, String contextName)` We pass ourselves to the assembler, which is a visitor, so that it can accumulate information on this text depending on its type.
`void`	`assemble(TextAssembler p)`
`boolean`	`breakBefore()`
`protected String`	`decode(PdfString in)` Decodes a PdfString (which will contain glyph ids encoded in the font's encoding) based on the active font, and determines the Unicode equivalent.
`protected String`	`decode(String in)` Decodes a Java String containing glyph ids encoded in the font's encoding, and determines the Unicode equivalent.
`List<Word>`	`getAsPartialWords()` Break this string if there are spaces within it.
`FinalText`	`getFinalText(PdfReader reader, int page, TextAssembler assembler, boolean useMarkup)`
`String`	`getFontCodes()`
`String`	`getText()` When returning the text from this item, we need to decode the code points we have.
`float`	`getUnscaledTextWidth(GraphicsState gs)`
`boolean`	`shouldNotSplit()`
`String`	`toString()`

Methods inherited from class com.lowagie.text.pdf.parser.ParsedTextImpl
getAscent, getBaseline, getDescent, getEndPoint, getSingleSpaceWidth, getStartPoint, getWidth

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Field Detail
  - textToUserSpaceTransformMatrix
```
protected final Matrix textToUserSpaceTransformMatrix
```
  - gs
```
protected final GraphicsState gs
```
  - pdfText
```
protected PdfString pdfText
```
    Retains the original PdfString, because we need to distinguish between the code points it contains and the standard Java (Unicode) strings that actually represent the content of this text.
- Method Detail
  - decode
```
protected String decode(String in)
```
    Decodes a Java String containing glyph ids encoded in the font's encoding, and determines the Unicode equivalent.
    
    Parameters:
    
    in - the String that needs to be decoded
    
    Returns:
    
    the decoded String
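    
    The idea of mapping font-specific codes to Unicode can be illustrated with a minimal sketch. It does not use the OpenPDF classes: the class `DecodeSketch`, the `toUnicode` map, and the fallback to the raw character are all assumptions made for illustration, not the library's implementation.
```java
import java.util.HashMap;
import java.util.Map;

class DecodeSketch {
    // Hypothetical helper: each char of `in` is treated as a code in the
    // font's encoding and looked up in a code-to-Unicode table.
    static String decode(String in, Map<Integer, String> toUnicode) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length(); i++) {
            int code = in.charAt(i);
            // Assumption: codes without a mapping are passed through unchanged.
            out.append(toUnicode.getOrDefault(code, String.valueOf((char) code)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> toUnicode = new HashMap<>();
        toUnicode.put(1, "H");
        toUnicode.put(2, "i");
        // Codes 1 and 2 are font-specific; the result is ordinary Unicode text.
        System.out.println(decode("\u0001\u0002", toUnicode)); // prints "Hi"
    }
}
```
    A single code may map to more than one Unicode character (for example a ligature glyph), which is why the table in this sketch maps to String rather than to char.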
  - decode
```
protected String decode(PdfString in)
```
    Decodes a PdfString (which will contain glyph ids encoded in the font's encoding) based on the active font, and determines the Unicode equivalent.
    
    Parameters:
    
    in - the PdfString that needs to be decoded
    
    Returns:
    
    the decoded String
    
    Since:
    
    2.1.7
  - getAsPartialWords
```
public List<Word> getAsPartialWords()
```
    Break this string if there are spaces within it. If so, we mark the new Words appropriately for later assembly. We are guaranteed that every space (internal word break) in this parsed text object will create a new word in the result of this method. We are not guaranteed that these Word objects are actually words until they have been assembled. The word following any space preserves that space in its string value, so that the assembler will not erroneously merge words that should be separate, regardless of the spacing.
    
    Returns:
    
    list of Word objects.
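    
    The splitting rule described above (every space starts a new word, and that word keeps the space) can be sketched on plain strings. This is an illustration only: the class `PartialWordsSketch` and its method are hypothetical, and the real method returns Word objects carrying position information rather than bare strings.
```java
import java.util.ArrayList;
import java.util.List;

class PartialWordsSketch {
    // Hypothetical helper: splits at every space and keeps the space at the
    // start of the piece that follows it.
    static List<String> splitKeepingLeadingSpace(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == ' ' && current.length() > 0) {
                // A space closes the piece collected so far.
                words.add(current.toString());
                current.setLength(0);
            }
            current.append(c);
        }
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [foo,  bar,  baz] (the second and third pieces start with a space)
        System.out.println(splitKeepingLeadingSpace("foo bar baz"));
    }
}
```
    Because the leading space is preserved, a later assembly step can tell "foo" followed by " bar" (two words) apart from two fragments of a single word, whatever the glyph spacing.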
  - getUnscaledTextWidth
```
public float getUnscaledTextWidth(GraphicsState gs)
```
    Parameters:
    
    gs - the graphics state, including the current transformation from text measurement to page coordinates
    
    Returns:
    
    the unscaled (i.e. in text space) width of our text
  - accumulate
```
public void accumulate(TextAssembler p,
                       String contextName)
```
    Description copied from interface: TextAssemblyBuffer
    
    We pass ourselves to the assembler, which is a visitor, so that it can accumulate information on this text depending on its type. The result is calculated by a final "assembly" phase, after accumulation is done. This is because we may have non-contiguous items in a PDF text stream.
    
    Parameters:
    
    p - the assembler that is visiting us.
    
    contextName - Name of the surrounding markup element/"context" if we're generating tagged output.
    
    See Also:
    
    TextAssemblyBuffer.accumulate(com.lowagie.text.pdf.parser.TextAssembler, String)
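    
    The two-phase visitor arrangement described for accumulate can be sketched as follows. All types here (`Assembler`, `TextPiece`, `OrderingAssembler`) are hypothetical stand-ins rather than the OpenPDF classes, and ordering by x position in the final phase is only an assumed example of why assembly is deferred until accumulation is done.
```java
import java.util.ArrayList;
import java.util.List;

class VisitorSketch {
    // Hypothetical stand-in for the assembler, which plays the visitor role.
    interface Assembler {
        void process(TextPiece piece, String contextName);
        String result();
    }

    // Hypothetical stand-in for a piece of parsed text.
    static final class TextPiece {
        final String text;
        final float x;

        TextPiece(String text, float x) {
            this.text = text;
            this.x = x;
        }

        // The piece passes itself to the visitor.
        void accumulate(Assembler p, String contextName) {
            p.process(this, contextName);
        }
    }

    static final class OrderingAssembler implements Assembler {
        private final List<TextPiece> pieces = new ArrayList<>();

        // Phase 1: only collect; contextName is ignored in this sketch.
        public void process(TextPiece piece, String contextName) {
            pieces.add(piece);
        }

        // Phase 2: build the result once everything has been seen.
        public String result() {
            List<TextPiece> sorted = new ArrayList<>(pieces);
            sorted.sort((a, b) -> Float.compare(a.x, b.x));
            StringBuilder sb = new StringBuilder();
            for (TextPiece piece : sorted) {
                sb.append(piece.text);
            }
            return sb.toString();
        }
    }

    static String demo() {
        Assembler assembler = new OrderingAssembler();
        // Pieces arrive out of reading order, as they may in a content stream.
        new TextPiece("world", 50f).accumulate(assembler, "p");
        new TextPiece("Hello ", 0f).accumulate(assembler, "p");
        return assembler.result();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "Hello world"
    }
}
```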
  - assemble
```
public void assemble(TextAssembler p)
```
    Parameters:
    
    p - we may pass ourselves to this assembler again during the final assembly process.
    
    See Also:
    
    TextAssemblyBuffer.assemble(com.lowagie.text.pdf.parser.TextAssembler)
  - getText
```
public String getText()
```
    When returning the text from this item, we need to decode the code points we have.
    
    Specified by:
    
    getText in interface TextAssemblyBuffer
    
    Overrides:
    
    getText in class ParsedTextImpl
    
    Returns:
    
    the text to render
    
    See Also:
    
    ParsedTextImpl.getText()
  - getFontCodes
```
public String getFontCodes()
```
    Returns:
    
    a string whose characters represent code points in a possibly two-byte font
  - getFinalText
```
public FinalText getFinalText(PdfReader reader,
                              int page,
                              TextAssembler assembler,
                              boolean useMarkup)
```
    Parameters:
    
    reader - the PdfReader that knows about our document (size, etc. are available here).
    
    page - the page we are extracting text from.
    
    assembler - builds the result by accepting content from text components of various sorts.
    
    useMarkup - whether to generate tagged text rather than just plain text.
    
    Returns:
    
    the final text ready to concatenate into result string.
    
    See Also:
    
    TextAssemblyBuffer.getFinalText(com.lowagie.text.pdf.PdfReader, int, com.lowagie.text.pdf.parser.TextAssembler, boolean)
  - toString
```
public String toString()
```
    Overrides:
    
    toString in class Object
    
    See Also:
    
    Object.toString()
  - shouldNotSplit
```
public boolean shouldNotSplit()
```
    Specified by:
    
    shouldNotSplit in class ParsedTextImpl
    
    Returns:
    
    true if this was extracted from a string containing spaces, in which case, we assume further splitting is not needed.
    
    See Also:
    
    ParsedTextImpl.shouldNotSplit()
  - breakBefore
```
public boolean breakBefore()
```
    Specified by:
    
    breakBefore in class ParsedTextImpl
    
    Returns:
    
    See Also:
    
    ParsedTextImpl.breakBefore()
