Difference between HDFS and NFS - 芒果文档

This article discusses the differences between HDFS (Hadoop Distributed File System) and NFS (Network File System).

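HDFS (Hadoop Distributed File System):
It is a distributed file system for handling large volumes of data on commodity hardware, with the data spread across many data nodes (networked computers).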

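It is mainly used to scale a single Apache Hadoop cluster to hundreds or even thousands of nodes, and it is one of the main components of Apache Hadoop. It should not be confused with Apache HBase, a column-oriented, non-relational DBMS that sits on top of HDFS and is better suited to real-time, random read/write access to data.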

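It is mainly used to store big data and is designed for high-throughput access to that data.
The file system stores multiple replicas of each file's blocks, which is why it is described as fault tolerant. The default replication level is 3.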
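The storage cost of replication can be worked out with simple arithmetic. The sketch below is only illustrative: the 128 MB block size is an assumption (block size is configurable per cluster), and `hdfs_footprint` is a hypothetical helper, not part of any Hadoop API.

```python
import math

BLOCK_SIZE_MB = 128  # assumed block size; configurable in a real cluster
REPLICATION = 3      # default replication level mentioned above


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (blocks, block replicas, raw MB consumed) for one file."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    replicas = blocks * replication
    # The last block only occupies its real size, so raw usage is
    # the file size times the replication level.
    raw_mb = file_size_mb * replication
    return blocks, replicas, raw_mb


print(hdfs_footprint(1000))
```

Under these assumptions a 1000 MB file is split into 8 blocks, stored as 24 block replicas, and consumes about 3000 MB of raw disk across the cluster.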

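NFS (Network File System):
NFS is a distributed file system that lets clients access files over a network. It is an open standard, which makes it easy to implement. The first version was created for experimental purposes; after it proved successful, the second version was released for public use.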

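All data is kept on one main system, and every other system on the network can access that data as if it were stored locally. This creates a problem: if the main system fails, the data becomes unavailable and may well be lost. Storage capacity is also limited to the space available on that one system.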
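The single point of failure can be illustrated with a toy model. This is a sketch under stated assumptions: the node names are made up, the round-robin placement is a simplification rather than the placement policy of a real cluster, and an NFS server is modelled as a single node holding one copy of everything.

```python
def place_blocks(num_blocks, nodes, replication):
    """Toy placement: block i is copied to `replication` consecutive nodes."""
    return {
        b: [nodes[(b + r) % len(nodes)] for r in range(replication)]
        for b in range(num_blocks)
    }


def readable_blocks(placement, failed):
    """Blocks that still have at least one copy on a healthy node."""
    return [
        b for b, holders in placement.items()
        if any(node not in failed for node in holders)
    ]


# Hypothetical setups: one central server vs. a four-node cluster.
single_server = place_blocks(4, ["nfs-server"], replication=1)
cluster = place_blocks(4, ["dn1", "dn2", "dn3", "dn4"], replication=3)

print(readable_blocks(single_server, failed={"nfs-server"}))
print(readable_blocks(cluster, failed={"dn1", "dn2"}))
```

In this model nothing is readable once the single server is down, while the replicated layout still serves every block with two of its four nodes failed.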

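Clients use the mount command to access the exported data. Once the export is mounted, a client can work with the file system within the limits of the options and permissions specified.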
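As a rough illustration of what such a mount looks like, the sketch below builds the argument list for a typical Linux `mount -t nfs` invocation. The server name, export path, and mount point are hypothetical, and `nfs_mount_command` is a helper written for this example; it only constructs the command and does not run it, since mounting normally requires root privileges.

```python
import shlex


def nfs_mount_command(server, export_path, mount_point, options=None):
    """Build argv for: mount -t nfs [-o opts] server:/export /mount/point"""
    argv = ["mount", "-t", "nfs"]
    if options:
        argv += ["-o", ",".join(options)]
    argv += [f"{server}:{export_path}", mount_point]
    return argv


# Hypothetical server and paths, mounted read-only.
cmd = nfs_mount_command("fileserver.example.com", "/srv/share", "/mnt/share", ["ro"])
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run` by a privileged process. After the mount succeeds, files under the mount point are read with ordinary file operations.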

Difference between HDFS and NFS:

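Advantages of HDFS over NFS?
Besides being fault tolerant, HDFS keeps multiple replicas of each file, which avoids the common bottleneck of many clients reading a single file from one server. Because the replicas sit on different physical disks, reads can be spread across them, so read performance under concurrent access is better than with NFS.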
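The effect of replicas on read load can be sketched with a small model. The assumptions are loud ones: node names are invented, and reads are assigned round-robin purely for illustration, whereas a real HDFS client chooses a replica based on proximity.

```python
from collections import Counter
from itertools import cycle


def reads_per_server(num_clients, replica_holders):
    """Assign each client read to a replica holder in round-robin order."""
    holders = cycle(replica_holders)
    return Counter(next(holders) for _ in range(num_clients))


# 300 clients reading the same file.
print(reads_per_server(300, ["nfs-server"]))          # one copy on one server
print(reads_per_server(300, ["dn1", "dn2", "dn3"]))   # three replicas
```

With one copy, the single server handles all 300 reads. With three replicas, each holder handles 100 in this model, which is the bottleneck relief described above.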

Difference between HDFS and NFS in tabular form:

Criteria	HDFS	NFS
Definition	A file system in which data is distributed among many data nodes or networked computers.	A file system or protocol that allows clients to access files over the network.
Supported data size	Mainly used to store and process big data.	Suited to storing and processing comparatively small amounts of data.
Data storage	Data blocks are distributed across the local drives of the cluster's machines.	Data is stored on a single dedicated machine.
Reliability	Data is stored reliably and remains available after a machine failure.	No built-in replication; data is unavailable if the server fails.
Data redundancy	Runs on a cluster of machines; block replication keeps redundant copies of the data.	Runs on a single machine and keeps no redundant copies.
Domain	It is for multi-domain.	It is for a single domain.
Client-server trust	Client identity is trusted by the OS.	Client identity is trusted by default.
Compatibility with OS	Uses its own API calls rather than the OS's system calls; mainly used by non-interactive programs.	Uses the same system calls as the OS.