📜  TIKA - Referenced API

📅  Last modified: 2020-11-10 04:26:25             🧑  Author: Mango


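Users can embed Tika in their applications through the Tika facade class, which provides methods for accessing all of Tika's functionality. Because it is a facade, it hides the complexity behind those functions. In addition, users can use Tika's other classes directly in their applications.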

[Figure: User Application]

Tika (facade)

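This is the most prominent class in the Tika library, and it follows the facade design pattern. It therefore abstracts all the internal implementations and provides simple methods for accessing Tika's functionality. The following table lists the constructors of this class along with their descriptions.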

Package - org.apache.tika

Class - Tika

Sr.No.  Constructor & Description

1       Tika ()
        Constructs the Tika facade using the default configuration.

2       Tika (Detector detector)
        Creates a Tika facade by accepting the detector instance as a parameter.

3       Tika (Detector detector, Parser parser)
        Creates a Tika facade by accepting the detector and parser instances as parameters.

4       Tika (Detector detector, Parser parser, Translator translator)
        Creates a Tika facade by accepting the detector, parser, and translator instances as parameters.

5       Tika (TikaConfig config)
        Creates a Tika facade by accepting a TikaConfig object as a parameter.
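To make the facade idea behind these constructors concrete, here is a minimal sketch in plain Java. It does not use Tika: SimpleDetector, SimpleParser, and DocumentFacade are hypothetical names invented for this illustration, and the detection and parsing rules are toys. It only shows how a single class can hide a detector and a parser behind simple methods, with one constructor that wires in defaults and another that accepts custom parts.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-ins for illustration only; these are not Tika classes.
interface SimpleDetector {
    String detect(byte[] data);
}

interface SimpleParser {
    String parse(byte[] data);
}

// The facade: callers use two simple methods and never touch the parts directly.
class DocumentFacade {
    private final SimpleDetector detector;
    private final SimpleParser parser;

    // Default wiring, comparable in spirit to a no-argument constructor.
    DocumentFacade() {
        this(
            // Toy rule: anything that starts with '<' is treated as HTML.
            data -> new String(data, StandardCharsets.UTF_8).trim().startsWith("<")
                    ? "text/html" : "text/plain",
            // Toy rule: strip anything that looks like a tag.
            data -> new String(data, StandardCharsets.UTF_8)
                    .replaceAll("<[^>]*>", "").trim());
    }

    // Custom wiring: the caller supplies its own detector and parser.
    DocumentFacade(SimpleDetector detector, SimpleParser parser) {
        this.detector = detector;
        this.parser = parser;
    }

    String detect(byte[] data) {
        return detector.detect(data);
    }

    String parseToString(byte[] data) {
        return parser.parse(data);
    }
}

class FacadeDemo {
    public static void main(String[] args) {
        byte[] doc = "<p>Hello</p>".getBytes(StandardCharsets.UTF_8);
        DocumentFacade facade = new DocumentFacade();
        System.out.println(facade.detect(doc));
        System.out.println(facade.parseToString(doc));
    }
}
```

Passing the parts through a constructor keeps the facade easy to reconfigure: a caller that needs different detection logic replaces one argument and leaves the rest of its code unchanged.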

Methods and Description

The following are the important methods of the Tika facade class:

Sr.No.  Methods & Description

1       parseToString (File file)
        This method and all its variants parse the file passed as a parameter and return the extracted text content as a String. By default, the length of the returned string is limited.

2       int getMaxStringLength ()
        Returns the maximum length of strings returned by the parseToString methods.

3       void setMaxStringLength (int maxStringLength)
        Sets the maximum length of strings returned by the parseToString methods.

4       Reader parse (File file)
        This method and all its variants parse the file passed as a parameter and return the extracted text content as a java.io.Reader object.

5       String detect (InputStream stream, Metadata metadata)
        This method accepts an InputStream object and a Metadata object as parameters, detects the type of the given document, and returns the document type name as a String. Its variants accept other kinds of input. The method abstracts the detection mechanisms used by Tika.

6       String translate (InputStream text, String targetLanguage)
        This method and all its variants accept the text as an InputStream and a String naming the target language, and translate the given text into that language, attempting to auto-detect the source language.
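The difference between a length-limited String result and a streaming Reader result can be illustrated without Tika. The sketch below is an assumption-laden illustration in plain Java: readUpTo is a hypothetical helper, not a Tika method, and it only mimics the idea of capping how much extracted text is collected into a String while the underlying source is consumed as a stream.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class BoundedTextSketch {
    // Hypothetical helper: collects at most maxLength characters from the reader.
    static String readUpTo(Reader reader, int maxLength) {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[1024];
        try {
            while (out.length() < maxLength) {
                // Never request more characters than the remaining budget.
                int wanted = Math.min(buffer.length, maxLength - out.length());
                int read = reader.read(buffer, 0, wanted);
                if (read == -1) {
                    break; // end of the text
                }
                out.append(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The StringReader stands in for the Reader a parse call would return.
        Reader reader = new StringReader("extracted document text");
        System.out.println(readUpTo(reader, 9));
    }
}
```

A cap like this bounds memory use for very large documents; when the whole text is needed, consuming the Reader incrementally avoids holding it all in one String.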

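Parser interface

This is the interface implemented by all the parser classes in the Tika package.

Package - org.apache.tika.parser

Interface - Parser

Methods and Description

The following is the important method of the Tika Parser interface: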

Sr.No.  Methods & Description

1       parse (InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context)
        This method parses the given document into a sequence of XHTML SAX events. The extracted document content is delivered to the given ContentHandler object, and the metadata is placed in the given Metadata object.
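The way a ContentHandler receives content as SAX events can be sketched with the JDK alone. In the example below, the JDK's own SAX parser stands in for a Tika parser, which is an assumption made so that the code runs without Tika on the classpath; CollectingHandler and handle are hypothetical names. The handler simply records each element name and the character data it is given.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventsSketch {
    // Collects the element names and text reported through SAX callbacks.
    static class CollectingHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        final List<String> elements = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) {
            elements.add(qName); // one event per opening tag
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // one or more events per text run
        }
    }

    // Feeds a small XHTML string through the JDK SAX parser into the handler.
    static CollectingHandler handle(String xhtml) {
        CollectingHandler handler = new CollectingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)),
                handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XHTML", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        CollectingHandler h = handle("<html><body><p>Hello Tika</p></body></html>");
        System.out.println(h.elements);
        System.out.println(h.text);
    }
}
```

Because the content arrives as a stream of callbacks, the handler decides what to keep: plain text only, as here, or the full structure.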

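Metadata class

This class implements various interfaces, such as CreativeCommons, Geographic, HttpHeaders, Message, MSOffice, ClimateForcast, TIFF, TikaMetadataKeys, TikaMimeKeys, and Serializable, to support various data models. The following tables list the constructors and methods of this class along with their descriptions.

Package - org.apache.tika.metadata

Class - Metadata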

Sr.No.  Constructor & Description

1       Metadata ()
        Constructs a new, empty metadata.

Sr.No.  Methods & Description

1       add (Property property, String value)
        Adds a metadata property/value mapping to a given document. Using this method, we can add a value to a property.

2       add (String name, String value)
        Adds a metadata name/value mapping to a given document. Using this method, we can add a new value under a name in the existing metadata of a document.

3       String get (Property property)
        Returns the value (if any) of the given metadata property.

4       String get (String name)
        Returns the value (if any) of the given metadata name.

5       Date getDate (Property property)
        Returns the value of a Date metadata property.

6       String[] getValues (Property property)
        Returns all the values of a metadata property.

7       String[] getValues (String name)
        Returns all the values of a given metadata name.

8       String[] names ()
        Returns all the names of metadata elements in a metadata object.

9       set (Property property, Date date)
        Sets the date value of the given metadata property.

10      set (Property property, String[] values)
        Sets multiple values to a metadata property.
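The methods above describe a multi-valued name/value store: one name can carry several values, add appends to them, and set replaces them. The following plain-Java sketch illustrates that idea only. SimpleMetadata is a hypothetical class, not the Tika Metadata class, it is keyed by String names rather than Property objects, and its behavior for missing names (null from get, an empty array from getValues) is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the Tika Metadata class.
class SimpleMetadata {
    // Each name maps to an ordered list of values.
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    // Appends a value to whatever is already stored under the name.
    void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Replaces all values stored under the name.
    void set(String name, String[] newValues) {
        values.put(name, new ArrayList<>(Arrays.asList(newValues)));
    }

    // Returns the first value, or null if the name is unknown (sketch assumption).
    String get(String name) {
        List<String> list = values.get(name);
        return (list == null || list.isEmpty()) ? null : list.get(0);
    }

    // Returns every value, or an empty array if the name is unknown (sketch assumption).
    String[] getValues(String name) {
        List<String> list = values.get(name);
        return list == null ? new String[0] : list.toArray(new String[0]);
    }

    // Returns all names in insertion order.
    String[] names() {
        return values.keySet().toArray(new String[0]);
    }
}

class MetadataDemo {
    public static void main(String[] args) {
        SimpleMetadata metadata = new SimpleMetadata();
        metadata.add("author", "Mango");
        metadata.add("author", "Kiwi");
        System.out.println(Arrays.toString(metadata.getValues("author")));
        metadata.set("author", new String[] {"Plum"});
        System.out.println(metadata.get("author"));
    }
}
```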

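LanguageIdentifier class

This class identifies the language of the given content. The following tables list the constructors and methods of this class along with their descriptions.

Package - org.apache.tika.language

Class - LanguageIdentifier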

Sr.No.  Constructor & Description

1       LanguageIdentifier (LanguageProfile profile)
        Instantiates the language identifier. A LanguageProfile object must be passed as the parameter.

2       LanguageIdentifier (String content)
        Instantiates the language identifier from a String of text content.

Sr.No.  Methods & Description

1       String getLanguage ()
        Returns the language identified by the current LanguageIdentifier object.