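1. Local File System (LFS):
The basic file system of the Linux operating system is called the local file system. It stores each data file as is, in a single copy.
It organizes data files in a tree structure, and a user can access a data file directly by its path. LFS does not replicate data blocks. It is typically used for storing and processing personal data (small data).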
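
The tree layout and direct access described above can be illustrated with a small sketch. It uses only Python's standard library; the directory names, the file name, and the helper `list_tree` are hypothetical and chosen just for this example.

```python
import tempfile
from pathlib import Path


def list_tree(root: Path) -> list[str]:
    """Return every regular file under root, as a sorted list of relative paths."""
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# Build a tiny directory tree in a temporary location (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "home" / "alice").mkdir(parents=True)
    (root / "home" / "alice" / "notes.txt").write_text("hello")

    # The file exists exactly once, at one path in the tree.
    files = list_tree(root)

    # Direct access: the path alone is enough to read the file.
    content = (root / "home" / "alice" / "notes.txt").read_text()
```

There is no second copy anywhere: if the disk holding `notes.txt` fails, the data is gone.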
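2. Distributed File System (DFS):
When we need to store and process a large data file (roughly 1 TB or more), the operating system's local file system is not suitable. In that case we use a distributed file system. It can be set up on any Linux system with Hadoop installed. DFS stores a data file by splitting it into several blocks.
This file system uses a Master-Slave architecture, in which the NameNode is the master and the DataNodes are the slaves. The blocks of a data file are spread across different DataNodes, and only the NameNode knows their locations. Each data block is replicated to several DataNodes so that no data is lost if a DataNode fails. In DFS, a user cannot access a data file directly, because only the NameNode knows where the file's blocks are stored.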
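
The block splitting, replication, and NameNode lookup described above can be sketched as a toy in-memory model. This is not Hadoop's API: the class `ToyNameNode`, the functions `split_into_blocks`, `write_file`, and `read_file`, the 4-byte block size, and the round-robin placement are all assumptions made for illustration (real HDFS uses much larger blocks and a rack-aware placement policy).

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut data into consecutive chunks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class ToyNameNode:
    """Holds only metadata: which DataNodes store each block of each file."""

    def __init__(self, datanode_ids, replication: int):
        self.datanode_ids = list(datanode_ids)
        if replication > len(self.datanode_ids):
            raise ValueError("replication cannot exceed the number of DataNodes")
        self.replication = replication
        self.block_map = {}  # filename -> one list of DataNode ids per block

    def place(self, filename: str, num_blocks: int):
        # Assumption: simple round-robin placement on distinct DataNodes.
        n = len(self.datanode_ids)
        placement = [
            [self.datanode_ids[(b + r) % n] for r in range(self.replication)]
            for b in range(num_blocks)
        ]
        self.block_map[filename] = placement
        return placement


def write_file(namenode, datanodes, filename, data, block_size):
    """Split data into blocks and store each replica on its assigned DataNode."""
    blocks = split_into_blocks(data, block_size)
    placement = namenode.place(filename, len(blocks))
    for idx, (block, nodes) in enumerate(zip(blocks, placement)):
        for node in nodes:
            datanodes[node][(filename, idx)] = block


def read_file(namenode, datanodes, filename, failed=frozenset()):
    """Ask the NameNode for block locations, then read from any live replica."""
    parts = []
    for idx, nodes in enumerate(namenode.block_map[filename]):
        alive = [n for n in nodes if n not in failed]
        if not alive:
            raise IOError(f"all replicas of block {idx} are unavailable")
        parts.append(datanodes[alive[0]][(filename, idx)])
    return b"".join(parts)


# Three DataNodes modeled as plain dicts; every block is stored twice.
datanodes = {"dn1": {}, "dn2": {}, "dn3": {}}
namenode = ToyNameNode(datanodes.keys(), replication=2)
write_file(namenode, datanodes, "big.log", b"abcdefghij", block_size=4)

# The file is still fully readable even though dn1 is marked as failed.
recovered = read_file(namenode, datanodes, "big.log", failed={"dn1"})
```

Note that `read_file` cannot locate a single block without `namenode.block_map`, which mirrors why a user cannot reach the data directly in DFS.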
Difference between Local File System (LFS) and Distributed File System (DFS):
Local File System | Distributed File System |
---|---|
LFS stores data as a single block. | DFS divides data into multiple blocks and stores them on different DataNodes. |
LFS uses a tree format to store data. | DFS uses a Master-Slave architecture for data storage. |
Data retrieval in LFS is slow for large files. | Data retrieval in DFS is fast for large files, because blocks can be read from several DataNodes in parallel. |
LFS is not reliable because it does not replicate data files. | DFS is reliable because data blocks are replicated across different DataNodes. |
LFS is cheaper because it does not need extra storage for any data file. | DFS is more expensive because it needs extra storage to replicate the same data blocks. |
Files can be accessed directly in LFS. | Files cannot be accessed directly in DFS because the actual locations of data blocks are known only to the NameNode. |
LFS is not appropriate for analyzing very large data files because processing takes a long time. | DFS is appropriate for analyzing large data files because processing takes less time than on a local file system. |
LFS is less complex than DFS. | DFS is more complex than LFS. |
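
The storage-cost row in the table can be made concrete with a little arithmetic. The sketch below assumes a 128 MB block size and a replication factor of 3, which are the usual HDFS defaults but are configurable per cluster; the function name `dfs_storage` is hypothetical.

```python
def dfs_storage(file_size: int, block_size: int, replication: int):
    """Return (number of blocks, raw bytes stored across the cluster)."""
    num_blocks = -(-file_size // block_size)  # integer ceiling division
    raw_bytes = file_size * replication       # every block is stored `replication` times
    return num_blocks, raw_bytes


MB = 1024 ** 2
TB = 1024 ** 4

# Assumed settings: 1 TB file, 128 MB blocks, 3 replicas per block.
num_blocks, raw_bytes = dfs_storage(1 * TB, 128 * MB, 3)
print(num_blocks)       # 8192
print(raw_bytes // TB)  # 3
```

Under these assumptions a 1 TB file occupies 3 TB of raw disk in DFS, while LFS would store it once in 1 TB.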