TextExtractor

Overview

TextExtractor is a class in Aspose.Pdf FOSS for Java.

Extracts text from PDF page content streams by processing text operators (ISO 32000-1:2008, §9.4).
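To make "processing text operators" concrete, here is a minimal, self-contained sketch of the idea. It is not the library's implementation and uses none of its classes: the class name TextOperatorSketch and the method shownStrings are hypothetical. It scans a content stream for literal strings and returns the ones consumed by the text-showing operators Tj, ', " and TJ. It assumes each byte maps directly to a character (no font encoding or CMap lookup) and skips hex strings, comments, inline images, and octal escapes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Aspose.Pdf FOSS for Java.
final class TextOperatorSketch {

    // True for characters that terminate a name or operator token.
    private static boolean endsToken(char ch) {
        return Character.isWhitespace(ch) || "()[]<>/%".indexOf(ch) >= 0;
    }

    /** Returns the strings shown by Tj, ', " and TJ, in stream order. */
    static List<String> shownStrings(String content) {
        List<String> shown = new ArrayList<>();
        List<String> pending = new ArrayList<>(); // string operands seen since the last operator
        int i = 0;
        int n = content.length();
        while (i < n) {
            char c = content.charAt(i);
            if (c == '(') {
                // Literal string: parentheses may nest, backslash escapes the next char.
                StringBuilder sb = new StringBuilder();
                int depth = 1;
                i++;
                while (i < n && depth > 0) {
                    char d = content.charAt(i++);
                    if (d == '\\' && i < n) {
                        char e = content.charAt(i++);
                        switch (e) {
                            case 'n': sb.append('\n'); break;
                            case 'r': sb.append('\r'); break;
                            case 't': sb.append('\t'); break;
                            default: sb.append(e); // covers \( \) \\ ; octal escapes are not decoded
                        }
                    } else if (d == '(') {
                        depth++;
                        sb.append(d);
                    } else if (d == ')') {
                        depth--;
                        if (depth > 0) sb.append(d);
                    } else {
                        sb.append(d);
                    }
                }
                pending.add(sb.toString());
            } else if (c == '<') {
                // Hex strings and dictionary openers are skipped in this sketch.
                int close = content.indexOf('>', i);
                if (close < 0) break;
                i = close + 1;
            } else if (c == '/') {
                // Name operand such as /F1: skip it so it is not read as an operator.
                i++;
                while (i < n && !endsToken(content.charAt(i))) i++;
            } else if (Character.isLetter(c) || c == '\'' || c == '"') {
                int start = i;
                while (i < n && !endsToken(content.charAt(i))) i++;
                String op = content.substring(start, i);
                if (op.equals("Tj") || op.equals("'") || op.equals("\"")) {
                    // These operators show a single string, their last operand.
                    if (!pending.isEmpty()) shown.add(pending.get(pending.size() - 1));
                } else if (op.equals("TJ")) {
                    // TJ takes an array of strings and spacing numbers; join the strings.
                    if (!pending.isEmpty()) shown.add(String.join("", pending));
                }
                pending.clear(); // operands belong to exactly one operator
            } else {
                i++; // numbers, brackets, whitespace
            }
        }
        return shown;
    }

    public static void main(String[] args) {
        String stream = "BT /F1 12 Tf 72 720 Td (Hello) Tj [(Wor) -15 (ld)] TJ ET";
        System.out.println(shownStrings(stream)); // prints [Hello, World]
    }
}
```

A real extractor also has to track the text state (font, text matrix, spacing) to decode character codes and position each fragment; the sketch deliberately leaves that out.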

Methods

Signature                                  Description
TextExtractor(parser: PDFParser)           Creates a TextExtractor.
extract(page: Page): List<TextFragment>    Extracts all text fragments from a page.
extractText(page: Page): String            Extracts all text from a page as a plain string.