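Data Mining:
Data mining is the process of finding patterns in large data sets and extracting useful information from them. It turns raw data into information that can be acted on. Data mining is useful for improving a company's marketing strategy: working with structured data, analysts can study records drawn from different databases and derive new ideas for raising the organization's productivity. Text mining is one part of data mining.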
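
As a rough illustration of finding patterns in structured data, the sketch below counts which pairs of items appear together in a set of shopping baskets. The `transactions` data, the `frequent_pairs` helper, and the `min_support` threshold are all hypothetical and chosen only for this example; real data mining tools use far more scalable algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical structured data: each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "eggs"},
]


def frequent_pairs(rows, min_support):
    """Return item pairs that occur together in at least min_support rows."""
    counts = Counter()
    for row in rows:
        # Sort the items so each pair has a single canonical ordering.
        for pair in combinations(sorted(row), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


pairs = frequent_pairs(transactions, min_support=3)
print(pairs)
```

A pair that clears the threshold is a candidate pattern, for instance two products that might be promoted together.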
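Text Mining:
Text mining is an artificial intelligence technique that processes data drawn from text documents. Many deep learning algorithms are used to evaluate text effectively. In text mining the data is stored in an unstructured format, and the text in a document is evaluated mainly with linguistic principles.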
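
To make the contrast with structured data concrete, here is a minimal sketch of a first text mining step: breaking free text into word tokens, dropping common stop words, and counting the remaining terms. The `documents`, the tiny `STOP_WORDS` set, and the helper names are assumptions made for this example; it uses simple counting rather than the deep learning methods mentioned above.

```python
import re
from collections import Counter

# Hypothetical unstructured input: free-text customer reviews.
documents = [
    "The battery life is great, and the screen is great too.",
    "Battery drains fast. The screen is fine.",
]

# A tiny illustrative stop-word list; real projects use much larger ones.
STOP_WORDS = {"the", "is", "and", "too", "a", "an"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(docs):
    """Count every non-stop-word token across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(t for t in tokenize(doc) if t not in STOP_WORDS)
    return counts


tf = term_frequencies(documents)
print(tf.most_common(3))
```

The text has to be given structure (tokens and counts) before any pattern can be measured, which is the linguistic processing step that sets text mining apart.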
The following table lists the differences between data mining and text mining:
S.No. | Data Mining | Text Mining |
---|---|---|
1. | Data mining is the statistical technique of processing raw data in a structured form. | Text mining is the part of data mining which involves processing of text from documents. |
2. | Pre-existing databases and spreadsheets are used to gather information. | Text documents are used to gather high-quality information. |
3. | Processing of data is done directly. | Processing of data is done linguistically. |
4. | Statistical techniques are used to evaluate data. | Computational linguistic principles are used to evaluate text. |
5. | In data mining, data is stored in a structured format. | In text mining, data is stored in an unstructured format. |
6. | Data is homogeneous and is easy to retrieve. | Data is heterogeneous and is not so easy to retrieve. |
7. | It supports mining of mixed data. | It mines text only. |
8. | It combines artificial intelligence, machine learning, and statistics and applies them to data. | It applies pattern recognition and natural language processing to unstructured data. |
9. | It is used in fields like marketing, medicine, healthcare. | It is used in fields like bioscience and customer profile analysis. |