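ParseContext Class in Java
The ParseContext class is part of the Java package org.apache.tika.parser. It holds the parsing context and passes it on to Tika parsers. (The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types.) org.apache.tika.parser.ParseContext implements the Serializable interface.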
public class ParseContext extends Object implements Serializable
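
Implementing Serializable marks a class as eligible for standard Java object serialization. The sketch below shows that general mechanism as an in-memory round trip using only the standard library. `Settings` and `SerializationSketch` are hypothetical stand-ins, not part of Tika, and whether a particular ParseContext round-trips this way also depends on the objects stored in it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for any Serializable class; not part of Tika.
class Settings implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;

    Settings(String name) {
        this.name = name;
    }
}

class SerializationSketch {
    // Writes the object to an in-memory byte array.
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads an object back from bytes produced by toBytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in
                 = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `SerializationSketch.fromBytes(SerializationSketch.toBytes(new Settings("demo")))` yields a new `Settings` object whose `name` field equals the original's.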
Constructor:
1. ParseContext(): initializes a new instance of the ParseContext class.
ParseContext p = new ParseContext();
Note: p is the new instance of the ParseContext class.
Methods of ParseContext:
S.No. | Method | Description | Return Type |
---|---|---|---|
1. | getDocumentBuilder() | getDocumentBuilder() returns the DOM builder specified in this parsing context. | DocumentBuilder |
2. | getSAXParser() | getSAXParser() returns the SAX parser specified in this parsing context. | SAXParser |
3. | getSAXParserFactory() | getSAXParserFactory() returns the SAX parser factory specified in this parsing context. | SAXParserFactory |
4. | getTransformer() | getTransformer() returns the transformer specified in this parsing context. | Transformer |
5. | getXMLInputFactory() | getXMLInputFactory() returns the StAX input factory specified in this parsing context. | XMLInputFactory |
6. | getXMLReader() | getXMLReader() returns the XMLReader specified in this parsing context. | XMLReader |
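7. | get(Class<T> key) | get(Class<T> key) returns the object in this context that implements the given interface. | T |
8. | get(Class<T> key, T defaultValue) | get(Class<T> key, T defaultValue) returns the object in this context that implements the given interface, or the given default value if no such object is found. | T |
9. | set(Class<T> key, T value) | set(Class<T> key, T value) adds the given value to the context as an implementation of the given interface. | void |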
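
The `get` and `set` methods in the table treat the context as a lookup keyed by a class or interface. The sketch below is a simplified, standard-library-only model of that idea, not Tika's source code. The class name `MiniContext` is hypothetical, and the model makes no claim about how the real class stores its entries.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of class-keyed lookup; not Tika's ParseContext.
class MiniContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    // Registers value as the object to use for the given type.
    public <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    // Returns the object registered for the type, or null if none was set.
    public <T> T get(Class<T> key) {
        return key.cast(entries.get(key));
    }

    // Returns the object registered for the type, or defaultValue if none was set.
    public <T> T get(Class<T> key, T defaultValue) {
        T value = get(key);
        return value != null ? value : defaultValue;
    }
}
```

In this model, after `context.set(CharSequence.class, "hello")`, a call to `context.get(CharSequence.class)` returns the stored string, while `context.get(Integer.class, 42)` falls back to the default because nothing was registered for `Integer`. Keying on the class object lets the compiler check that the value matches the requested type, so callers need no casts.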
Example:
Java
// Java program to get the content of a
// document using the Tika toolkit and
// ParseContext
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
// import the necessary Tika packages
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.txt.TXTParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

class GFG {
    public static void main(String[] args)
        throws IOException, SAXException, TikaException
    {
        // new instance of File is created
        File fileName = new File("tmp.txt");
        // new instance of ParseContext is created
        ParseContext parseContext = new ParseContext();
        // new instance of Metadata is created
        Metadata metaData = new Metadata();
        // new instance of TXTParser is created for plain
        // text parsing purpose
        TXTParser textParser = new TXTParser();
        // new instance of BodyContentHandler is created
        BodyContentHandler bodyContentHandler
            = new BodyContentHandler();
        // new instance of FileInputStream is created for
        // reading purpose; try-with-resources closes it
        try (FileInputStream fileInputStream
                 = new FileInputStream(fileName)) {
            // TXTParser parse method is called for parsing a
            // document stream into a sequence of XHTML SAX events
            textParser.parse(fileInputStream,
                             bodyContentHandler, metaData,
                             parseContext);
        }
        System.out.println("Contents of the document:"
                           + bodyContentHandler.toString());
    }
}
Output:
Contents of the document:GFG is the best website for programmer
Note: The tmp.txt file contains the following data.