public class ParsedText extends ParsedTextImpl
Modifier and Type | Method and Description |
---|---|
void | accumulate(TextAssembler textAssembler, String contextName): We pass ourselves to the assembler, which is a visitor, so that it can accumulate information on this text depending on its type. |
void | assemble(TextAssembler textAssembler) |
boolean | breakBefore() |
protected String | decode(PdfString pdfString): Decodes a PdfString (which contains glyph ids encoded in the font's encoding) based on the active font, and determines the Unicode equivalent. |
protected String | decode(String in): Decodes a Java String containing glyph ids encoded in the font's encoding, and determines the Unicode equivalent. |
List<Word> | getAsPartialWords(): Break this string if there are spaces within it. |
FinalText | getFinalText(PdfReader reader, int page, TextAssembler assembler, boolean useMarkup) |
String | getFontCodes() |
String | getText(): When returning the text from this item, we need to decode the code points we have. |
float | getUnscaledTextWidth(GraphicsState gs) |
boolean | shouldNotSplit() |
String | toString() |
Methods inherited from class ParsedTextImpl: getAscent, getBaseline, getDescent, getEndPoint, getSingleSpaceWidth, getStartPoint, getWidth
protected String decode(String in)
Parameters:
in - the String that needs to be decoded
protected String decode(PdfString pdfString)
Decodes a PdfString (which contains glyph ids encoded in the font's encoding) based on the active font, and determines the Unicode equivalent.
Parameters:
pdfString - the String that needs to be decoded
public List<Word> getAsPartialWords()
We are guaranteed that every space (internal word break) in this parsed text object will create a new word in the result of this method. We are not guaranteed that these Word objects are actually words until they have been assembled.
The word following any space preserves that space in its string value, so that the assembler will not erroneously merge words that should be separate, regardless of the spacing.
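The splitting rule described above can be illustrated with a minimal, self-contained sketch. It works on plain strings rather than on the library's Word objects, and the class and method names (PartialWordSketch, split) are hypothetical. It shows only the stated guarantee, that every space starts a new fragment and stays attached to the fragment that follows it. It is not the library's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class PartialWordSketch {

    // Splits text so that every space starts a new fragment and remains
    // at the front of the fragment that follows it.
    public static List<String> split(String text) {
        List<String> fragments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            // A space closes the fragment collected so far (if any).
            if (c == ' ' && current.length() > 0) {
                fragments.add(current.toString());
                current.setLength(0);
            }
            // The space itself is kept as the first character of the next fragment.
            current.append(c);
        }
        if (current.length() > 0) {
            fragments.add(current.toString());
        }
        return fragments;
    }

    public static void main(String[] args) {
        // Fragments are "Hello", " PDF" and " world"; the leading spaces are preserved.
        System.out.println(split("Hello PDF world"));
    }
}
```

Keeping the space on the following fragment means a later assembly step can still tell that two fragments were separated in the source, whatever their positions on the page suggest.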
public float getUnscaledTextWidth(GraphicsState gs)
Parameters:
gs - graphics state including current transformation to page coordinates from text measurement
public void accumulate(TextAssembler textAssembler, String contextName)
Parameters:
textAssembler - the assembler that is visiting us.
contextName - name of the surrounding markup element/"context" if we're generating tagged output.
See Also:
TextAssemblyBuffer.accumulate(com.lowagie.text.pdf.parser.TextAssembler, String)
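The visitor relationship described for accumulate can be sketched as follows. All types here (Assembler, Buffer, Parsed, Finished, Recorder) are hypothetical stand-ins written for this illustration, not the library's TextAssembler or TextAssemblyBuffer. The sketch assumes the assembler offers one overload per text type, so that each text object selects the right one by passing itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VisitorSketch {

    // Hypothetical stand-in for the assembler: one overload per text type.
    interface Assembler {
        void process(Parsed text, String contextName);
        void process(Finished text, String contextName);
    }

    // Hypothetical stand-in for the buffer interface that text objects implement.
    interface Buffer {
        void accumulate(Assembler assembler, String contextName);
    }

    static final class Parsed implements Buffer {
        final String text;
        Parsed(String text) { this.text = text; }

        @Override
        public void accumulate(Assembler assembler, String contextName) {
            // "this" has static type Parsed, so the Parsed overload is chosen.
            assembler.process(this, contextName);
        }
    }

    static final class Finished implements Buffer {
        final String text;
        Finished(String text) { this.text = text; }

        @Override
        public void accumulate(Assembler assembler, String contextName) {
            // Same call shape, but it resolves to the Finished overload.
            assembler.process(this, contextName);
        }
    }

    // An assembler that records which overload was reached.
    static final class Recorder implements Assembler {
        final List<String> log = new ArrayList<>();

        @Override
        public void process(Parsed text, String contextName) {
            log.add("parsed:" + contextName + ":" + text.text);
        }

        @Override
        public void process(Finished text, String contextName) {
            log.add("finished:" + contextName + ":" + text.text);
        }
    }

    public static List<String> run() {
        Recorder recorder = new Recorder();
        List<Buffer> buffers = Arrays.asList(new Parsed("raw"), new Finished("done"));
        for (Buffer buffer : buffers) {
            // The caller does not need to know the concrete type of each buffer.
            buffer.accumulate(recorder, "p");
        }
        return recorder.log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The caller loops over buffers without type checks; the dispatch on text type happens inside each accumulate call.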
public void assemble(TextAssembler textAssembler)
Parameters:
textAssembler - we may pass ourselves to this assembler again during the final assembly process.
See Also:
TextAssemblyBuffer.assemble(com.lowagie.text.pdf.parser.TextAssembler)
@Nullable public String getText()
Specified by:
getText in interface TextAssemblyBuffer
Overrides:
getText in class ParsedTextImpl
See Also:
ParsedTextImpl.getText()
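The method summary notes that getText() must decode the stored code points before returning text. A minimal sketch of that idea follows, assuming a hypothetical code-to-Unicode lookup table in place of the active font. The names DecodeSketch and toUnicode are illustrative only, and passing unmapped codes through unchanged is an assumption of this sketch, not documented library behavior.

```java
import java.util.HashMap;
import java.util.Map;

public class DecodeSketch {

    // Translates each font code in fontCodes to its Unicode text using a
    // hypothetical lookup table; a real font would supply this mapping.
    public static String decode(String fontCodes, Map<Integer, String> toUnicode) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fontCodes.length(); i++) {
            int code = fontCodes.charAt(i);
            String mapped = toUnicode.get(code);
            // Assumption for this sketch: unmapped codes are passed through unchanged.
            out.append(mapped != null ? mapped : String.valueOf((char) code));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> toUnicode = new HashMap<>();
        toUnicode.put(1, "H");
        toUnicode.put(2, "i");
        // Codes 1 and 2 are mapped; '!' has no entry and is passed through.
        System.out.println(decode("\u0001\u0002!", toUnicode));
    }
}
```

Under this reading, getFontCodes() would correspond to the raw input string and getText() to the decoded result.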
@Nonnull public String getFontCodes()
public FinalText getFinalText(PdfReader reader, int page, TextAssembler assembler, boolean useMarkup)
Parameters:
reader - PdfReader that knows about our document (size, etc. available here).
page - which page we are extracting text from.
assembler - builds the result by accepting content from text components of various sorts.
useMarkup - whether to generate tagged text or just plain text.
See Also:
TextAssemblyBuffer.getFinalText(com.lowagie.text.pdf.PdfReader, int, com.lowagie.text.pdf.parser.TextAssembler, boolean)
public String toString()
Overrides:
toString in class Object
See Also:
Object.toString()
public boolean shouldNotSplit()
Overrides:
shouldNotSplit in class ParsedTextImpl
See Also:
ParsedTextImpl.shouldNotSplit()
public boolean breakBefore()
Overrides:
breakBefore in class ParsedTextImpl
See Also:
ParsedTextImpl.breakBefore()
Copyright © 2020. All rights reserved.