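1. Local File System (LFS):
The basic file system of the Linux operating system is called the local file system. It stores each data file as a single copy.
It organizes data files in a tree format, and a user can access the data files directly. LFS does not replicate data blocks, so it is generally used to store and process personal data (small data).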
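2. Distributed File System (DFS):
When we need to store and process large data files (roughly 1 TB or more), the operating system's local file system is not suitable. In that case we use a distributed file system. It can be created on any Linux operating system that has Hadoop installed. DFS stores a data file by dividing it into several blocks.
This file system works in a master-slave arrangement, in which the NameNode is the master node and the DataNodes are the slave nodes. The blocks of a data file are stored on different DataNodes, and only the NameNode knows their locations. Each data block is replicated to different DataNodes so that no data is lost if a DataNode fails. In DFS, users cannot access a data file directly, because only the NameNode knows where the file's data blocks are stored.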
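
The block splitting and replication described above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not how Hadoop is implemented: the function names (`split_into_blocks`, `place_blocks`, `read_file`), the tiny block size, and the round-robin placement are all hypothetical choices made for illustration. A real DFS uses much larger blocks and its own placement policy.

```python
# Toy simulation of DFS block splitting and replication (illustrative only).
# All names and the round-robin placement policy are assumptions for this sketch.

BLOCK_SIZE = 4    # bytes; deliberately tiny so the example is easy to follow
REPLICATION = 3   # assumed number of copies kept for each block


def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Divide a file's contents into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks, datanodes, replication):
    """Store each block on `replication` different DataNodes.

    Returns (block_map, storage):
      block_map plays the role of the NameNode's metadata: block id -> node names.
      storage plays the role of the DataNodes' disks: node name -> {block id: bytes}.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    block_map = {}
    storage = {node: {} for node in datanodes}
    for block_id, block in enumerate(blocks):
        # Simple round-robin choice of distinct nodes for this block.
        targets = [datanodes[(block_id + r) % len(datanodes)]
                   for r in range(replication)]
        block_map[block_id] = targets
        for node in targets:
            storage[node][block_id] = block
    return block_map, storage


def read_file(block_map, storage, failed=frozenset()):
    """Reassemble the file by looking up each block's location in block_map,
    skipping DataNodes listed in `failed`."""
    parts = []
    for block_id in sorted(block_map):
        live = [n for n in block_map[block_id] if n not in failed]
        if not live:
            raise OSError(f"block {block_id} is lost: all replicas are on failed nodes")
        parts.append(storage[live[0]][block_id])
    return b"".join(parts)


data = b"hello distributed world"
nodes = ["dn1", "dn2", "dn3", "dn4"]
blocks = split_into_blocks(data, BLOCK_SIZE)
block_map, storage = place_blocks(blocks, nodes, REPLICATION)

# The file can still be read in full after one DataNode fails.
recovered = read_file(block_map, storage, failed={"dn1"})
```

Note that `read_file` cannot locate any block without `block_map`, which mirrors the point that only the NameNode knows where a file's blocks are stored.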
Difference between the Local File System (LFS) and the Distributed File System (DFS):
Local File System | Distributed File System |
---|---|
LFS stores each data file as a single copy. | DFS divides a data file into multiple blocks and stores them on different DataNodes. |
LFS uses a tree format to store data. | DFS uses a master-slave architecture for data storage. |
Retrieval of large data files is slow in LFS. | Retrieval of large data files is fast in DFS. |
LFS is not reliable because it does not replicate data files. | DFS is reliable because data blocks are replicated across different DataNodes. |
LFS is cheaper because it does not need extra storage for copies of a data file. | DFS is more expensive because it needs extra storage to replicate the same data blocks. |
Files can be accessed directly in LFS. | Files cannot be accessed directly in DFS because the actual locations of the data blocks are known only to the NameNode. |
LFS is not appropriate for analyzing very large data files because processing takes a long time. | DFS is appropriate for analyzing large data files because processing takes less time than on a local file system. |
LFS is less complex than DFS. | DFS is more complex than LFS. |
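
The cost row in the table comes down to simple arithmetic: the raw storage consumed is the file size multiplied by the number of copies kept. The sketch below assumes a hypothetical helper `raw_storage_needed` and a replication factor of 3 for the DFS case; the actual factor depends on how the cluster is configured.

```python
def raw_storage_needed(file_size_gb: float, replication_factor: int = 1) -> float:
    """Return the raw storage (in GB) used to hold one file.

    replication_factor=1 models LFS, which keeps a single copy.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return file_size_gb * replication_factor


lfs_storage = raw_storage_needed(1024)      # single copy of a 1 TB file
dfs_storage = raw_storage_needed(1024, 3)   # same file, assuming 3 replicas
```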