Converting Epsilon-NFA to DFA using Python and Graphviz

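A finite automaton (FA) is a simple machine used to match patterns in an input string. A finite automaton is a quintuple, that is, it has five elements. In this article, we will see how to convert an epsilon-NFA to a DFA using Python and Graphviz.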

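The quintuple of an FA is written as:

$\langle Q, \Sigma, q_0, F, \delta \rangle$

The five elements are:

A finite set of states (Q)
A finite set of input symbols, the alphabet ( $\Sigma$ )
A start state ( $q_0$ )
A finite set of final states (F)
A transition function ( $\delta$ )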

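There are two types of finite automata (FA):

Deterministic Finite Automata (DFA)
Non-Deterministic Finite Automata (NFA)

Types of finite automata:

Deterministic Finite Automata (DFA)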

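A Deterministic Finite Automaton (DFA) is an FA in which, for each input symbol, the transition function sends the machine to exactly one fixed state.

A DFA does not allow the $\epsilon$ (null) symbol, which means the machine does not change state unless it reads an input symbol.

For a DFA,

$\delta: Q \times \Sigma \rightarrow Q$

For example, consider a DFA that accepts all strings over f and g that contain "gfg" as a substring.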

Figure: DFA accepting strings that contain "gfg"
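A minimal sketch of how such a DFA could be simulated in Python. The state numbering 0 to 3 and the names DFA_DELTA and dfa_accepts are assumptions made for illustration; they are not taken from the original figure.

```python
# Hypothetical encoding of a DFA for "contains gfg" over {f, g}.
# States 0, 1, 2 record how many characters of "gfg" are currently matched;
# state 3 means "gfg" has already been seen.
DFA_DELTA = {
    (0, 'f'): 0, (0, 'g'): 1,
    (1, 'f'): 2, (1, 'g'): 1,
    (2, 'f'): 0, (2, 'g'): 3,
    (3, 'f'): 3, (3, 'g'): 3,  # once "gfg" is seen, stay accepting
}
DFA_START = 0
DFA_FINALS = {3}


def dfa_accepts(word):
    """Follow exactly one transition per symbol, as delta: Q x Sigma -> Q requires."""
    state = DFA_START
    for symbol in word:
        state = DFA_DELTA[(state, symbol)]
    return state in DFA_FINALS


print(dfa_accepts("ffgfgf"))  # True
print(dfa_accepts("ggff"))    # False
```

Because every (state, symbol) pair has exactly one entry, the machine is always in a single state.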

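Non-Deterministic Finite Automata (NFA)

A Non-Deterministic Finite Automaton (NFA) is an FA in which the machine can move to more than one state for a given input symbol.

For an NFA,

$\delta: Q \times \Sigma \rightarrow 2^Q$

An NFA that allows $\epsilon$ moves is called an E-NFA or Epsilon-NFA.

Allowing the $\epsilon$ (null) symbol means the machine can change state even when no input symbol is read.

For an E-NFA,

$\delta: Q \times (\Sigma \cup \{\epsilon\}) \rightarrow 2^Q$

For example, consider an NFA that accepts all strings over f and g that contain "gfg" as a substring.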

Figure: NFA accepting strings that contain "gfg"
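As a rough sketch, an NFA for the same language can be simulated by tracking the set of states the machine might be in. The state numbering and the names NFA_DELTA and nfa_accepts are hypothetical and chosen only for this illustration.

```python
# Hypothetical NFA for "contains gfg" over {f, g}; each entry maps to a SET of states.
NFA_DELTA = {
    (0, 'f'): {0}, (0, 'g'): {0, 1},  # on g, guess that "gfg" may start here
    (1, 'f'): {2},
    (2, 'g'): {3},
    (3, 'f'): {3}, (3, 'g'): {3},
}
NFA_START = 0
NFA_FINALS = {3}


def nfa_accepts(word):
    """Track every state the NFA could be in after each symbol."""
    current = {NFA_START}
    for symbol in word:
        nxt = set()
        for state in current:
            # A missing entry means the NFA has no move: that branch dies
            nxt |= NFA_DELTA.get((state, symbol), set())
        current = nxt
    return bool(current & NFA_FINALS)


print(nfa_accepts("gfg"))   # True
print(nfa_accepts("ggff"))  # False
```

The sets of states that appear in `current` are exactly what the conversion to a DFA turns into single DFA states.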

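Here $2^Q$ denotes the power set of Q, that is, the set of all subsets of Q. For a given transition, each state of Q is either part of the target set or not, which gives 2 possibilities per state. So $2^Q$ contains every possible target set of a transition, $2^{|Q|}$ subsets in total.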
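To make the size of $2^Q$ concrete, the following sketch enumerates every subset of a four-state set. The helper name power_set is an assumption for illustration.

```python
from itertools import combinations


def power_set(states):
    """Return all subsets of `states` as frozensets."""
    items = sorted(states)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


subsets = power_set({'A', 'B', 'C', 'D'})
print(len(subsets))  # 16, which is 2 ** 4
```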

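Why convert?

A computer works with the basic form of an FA, the DFA. Because of their non-determinism, NFAs, and E-NFAs even more so, are easier for humans to design and understand. So we need to convert an E-NFA to a DFA.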

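Steps to convert an E-NFA to a DFA:

$\epsilon$-closure: the set of states that can be reached from a state without reading any input, that is, using only $\epsilon$ moves. It always includes the state itself.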
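The definition of the $\epsilon$-closure can be sketched as a small graph search. The function name epsilon_closure, the eps_moves mapping and the states P, Q, R are hypothetical and used only for this illustration.

```python
def epsilon_closure(state, eps_moves):
    """Return every state reachable from `state` using only epsilon moves.

    eps_moves (assumed format): dict mapping a state to the states
    reachable from it by a single epsilon move.
    """
    closure = {state}          # a state is always in its own closure
    stack = [state]
    while stack:
        cur = stack.pop()
        for nxt in eps_moves.get(cur, ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


# Made-up chain of epsilon moves with a cycle: P -> Q -> R -> P
eps_moves = {'P': ['Q'], 'Q': ['R'], 'R': ['P']}
print(sorted(epsilon_closure('P', eps_moves)))  # ['P', 'Q', 'R']
```

Checking membership in `closure` before pushing keeps the loop from running forever on cycles of $\epsilon$ moves.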

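Step 1: Find the $\epsilon$-closure of the start state of the NFA. This will be the start state of the DFA.

Step 2: Starting from this set, for each alphabet, collect the states reached by transitions on that alphabet and take the $\epsilon$-closure of that set.

Step 3: For every new closure set we encounter, repeat Step 2 until no new set appears.

Step 4: Every set in the DFA that contains a final state of the NFA becomes a final state of the DFA.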
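The four steps can be sketched compactly with frozensets as DFA states. This is an illustrative sketch, not the implementation given later in the article: the function names, the use of '' as the epsilon key and the tiny three-state NFA are all assumptions.

```python
def closure_of(states, delta):
    """Epsilon-closure of a set of states; '' is used as the epsilon symbol."""
    result, stack = set(states), list(states)
    while stack:
        cur = stack.pop()
        for nxt in delta.get((cur, ''), ()):
            if nxt not in result:
                result.add(nxt)
                stack.append(nxt)
    return frozenset(result)


def subset_construction(alphabet, delta, start, finals):
    dfa_start = closure_of({start}, delta)              # Step 1
    dfa_delta, pending, seen = {}, [dfa_start], {dfa_start}
    while pending:                                      # Step 3: repeat for new sets
        cur = pending.pop()
        for symbol in alphabet:                         # Step 2
            moved = set()
            for s in cur:
                moved |= set(delta.get((s, symbol), ()))
            target = closure_of(moved, delta)           # empty set acts as dead state
            dfa_delta[(cur, symbol)] = target
            if target not in seen:
                seen.add(target)
                pending.append(target)
    dfa_finals = {s for s in seen if s & set(finals)}   # Step 4
    return dfa_start, dfa_delta, dfa_finals, seen


# Made-up NFA: 0 --epsilon--> 1, 1 --x--> 2, final state 2
delta = {(0, ''): {1}, (1, 'x'): {2}}
dfa_start, dfa_delta, dfa_finals, dfa_states = subset_construction(['x'], delta, 0, {2})
print(sorted(dfa_start))                    # [0, 1]
print(sorted(dfa_delta[(dfa_start, 'x')]))  # [2]
print(len(dfa_states))                      # 3
```

In this sketch the empty frozenset plays the role of the dead state, so no special case is needed for it.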

For example,

Let the given E-NFA be:

Figure: example E-NFA

NFA: $\langle Q, \Sigma, q_0, F, \delta \rangle$

Q : {A, B, C, D}

$\Sigma$ : {a, b, c}

$q_0$ : A

F : {D}

$\delta$ :

NFA	a	b	c	$\mathbf{\epsilon}$
A	A	–	–	BC
B	–	BD	–	–
C	–	–	CD	–
D	–	–	–	–

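It is an NFA for the language of strings of the form { $a^lb^m$ or $a^lc^n$, where $l\ge0, m\ge1, n\ge1$ }. After the leading a's, the machine commits to either the b branch (state B) or the c branch (state C), so a string that mixes b and c, such as "abc", is not accepted.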

Conversion steps:

States	$\epsilon$-closure
A	ABC
B	B
C	C
D	D

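Step 1: Find the $\epsilon$-closure of the start state of the NFA. This will be the start state of the DFA.

$\epsilon$-closure of the start state of the NFA:

$\epsilon$-closure (A) : {A, B, C}

Steps 2 and 3: Starting from this set, for each alphabet, take the $\epsilon$-closure of the set of states reached on that alphabet. For every new closure set we encounter, repeat Step 2 until no new set is left.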

Current set of DFA states: {ABC}

{ABC} -> a = A, $\epsilon$-closure (A) : ABC
{ABC} -> b = BD, $\epsilon$-closure (BD) : BD
{ABC} -> c = CD, $\epsilon$-closure (CD) : CD

Current set of DFA states: {ABC, BD, CD}

{BD} -> a = $\phi$
{BD} -> b = BD
{BD} -> c = $\phi$
{CD} -> a = $\phi$
{CD} -> b = $\phi$
{CD} -> c = CD

States of the DFA, Q : {ABC, BD, CD, $\phi$ }

Transition function $\delta$ :

DFA	a	b	c
ABC	ABC	BD	CD
BD	$\phi$	BD	$\phi$
CD	$\phi$	$\phi$	CD
$\phi$	$\phi$	$\phi$	$\phi$

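Step 4: Every set in the DFA that contains a final state of the NFA becomes a final state of the DFA.

D is the final state of the NFA. So every DFA state whose set contains D is a final state of the DFA.

So the final states of the DFA, F : {BD, CD}

The DFA obtained:

Figure: resulting DFA

$\phi$ is also called the dead state, because its only outgoing edges lead back to itself. Once the machine reaches the dead state, it can never reach a final state.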
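As a quick check, the DFA transition table derived above can be written as a dictionary and run on a few strings. The names DFA_TABLE and accepts are hypothetical; the table entries are copied from the worked example.

```python
# Transition table of the DFA obtained in the worked example
DFA_TABLE = {
    'ABC': {'a': 'ABC', 'b': 'BD', 'c': 'CD'},
    'BD':  {'a': 'ϕ',   'b': 'BD', 'c': 'ϕ'},
    'CD':  {'a': 'ϕ',   'b': 'ϕ',  'c': 'CD'},
    'ϕ':   {'a': 'ϕ',   'b': 'ϕ',  'c': 'ϕ'},   # dead state loops on itself
}
FINALS = {'BD', 'CD'}


def accepts(word, start='ABC'):
    state = start
    for symbol in word:
        state = DFA_TABLE[state][symbol]
    return state in FINALS


for w in ['aab', 'cc', 'abc', 'a', '']:
    print(repr(w), accepts(w))
# 'aab' True
# 'cc' True
# 'abc' False
# 'a' False
# '' False
```

The string "abc" ends in the dead state: after reading b the machine is in BD, which has no transition on c other than to $\phi$.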

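Tools used for the implementation:

Graphviz: the graphviz Python package is used here to build and render graph diagrams. It is an interface to the Graphviz software, whose executables must also be installed on the system for rendering to work.

To install graphviz for Python, run the following command in a terminal: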

pip install graphviz

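Prerequisites:

Input:

Number of states: no_state
Array of states: states
Number of alphabets: no_alphabet
Array of alphabets: alphabets
Start state: start
Number of final states: no_final
Array of final states: finals
Number of transitions: no_transition
Array of transitions: transitions
Type of each transition: [From State, Alphabet, To State]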

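Utility:

Dictionary/map to get the index of a state: states_dict
Dictionary/map to get the index of an alphabet: alphabets_dict
Transition table dictionary that gives the array of "To" states for a "From State" and "Alphabet" pair: transition_table
Digraph object that stores the NFA diagram: graph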
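A small sketch of how these lookup structures might be built for the example E-NFA from earlier. One deliberate difference is an assumption of this sketch: the table is keyed by (state index, alphabet index) tuples instead of concatenated strings.

```python
states = ['A', 'B', 'C', 'D']
alphabets = ['a', 'b', 'c', 'e']   # 'e' stands for epsilon
transitions = [['A', 'a', 'A'], ['A', 'e', 'B'], ['B', 'b', 'B'],
               ['A', 'e', 'C'], ['C', 'c', 'C'], ['B', 'b', 'D'],
               ['C', 'c', 'D']]

# Name -> index lookups
states_dict = {name: i for i, name in enumerate(states)}
alphabets_dict = {name: i for i, name in enumerate(alphabets)}

# (from state index, alphabet index) -> list of to-state indices
transition_table = {(i, j): []
                    for i in range(len(states))
                    for j in range(len(alphabets))}
for src, symbol, dst in transitions:
    key = (states_dict[src], alphabets_dict[symbol])
    transition_table[key].append(states_dict[dst])

print(transition_table[(states_dict['A'], alphabets_dict['e'])])  # [1, 2]
print(transition_table[(states_dict['B'], alphabets_dict['b'])])  # [1, 3]
```

Tuple keys stay unambiguous with ten or more states, whereas concatenated digit strings would collide (for example "1" + "10" and "11" + "0" both give "110").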

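Methods/functions:

Constructor of the NFA class: __init__()
It initializes all the input variables and computes the utility variables from the input.
Getting input from the user: fromUser()
A class method that reads the input from the user.
Representation of the NFA quintuple: __repr__()
Finding the epsilon closure of a state: getEpsilonClosure(state)
To find the epsilon closure of a state, we maintain a stack of the states to evaluate next and a dictionary that tracks which states have already been evaluated.
We start a while loop from the given state and find its epsilon transition states.
We push all of these states onto the stack (stack.push(state)) and mark them in the dictionary (dict[state]=0).
We also mark the current state as complete in the dictionary (dict[state]=1).
In each following iteration, we pop the top of the stack and evaluate it, using the dictionary to skip states that have already been added.
Finding a proper state name for the converted DFA diagram from an array of states: getStateName(state_list)
Since the evaluation gives us a list of states, we need a proper name to show in the DFA diagram, which is the concatenation of the names of all the states in the set.
For example: if the state set is {A,B,D}, this function returns the string "ABD".
Checking whether an array contains a final state of the NFA, to decide whether it is a final state of the DFA: isFinalDFA(state_list)
This function checks whether the list of states contains a final state of the NFA, which in turn tells whether the set is a final state.
For example: the set being checked is {A,B,D} and D is a final state of the NFA. So the function returns True for this input, and the set is a final state of the DFA.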
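The two helpers described last can be sketched as standalone functions. This sketch assumes states are passed by name rather than by index, and the function names get_state_name and is_final_dfa are hypothetical counterparts of the class methods.

```python
def get_state_name(state_set):
    """Concatenate the state names in sorted order, e.g. {A, B, D} -> 'ABD'."""
    return ''.join(sorted(state_set))


def is_final_dfa(state_set, nfa_finals):
    """A DFA state is final if it shares at least one state with the NFA finals."""
    return not set(state_set).isdisjoint(nfa_finals)


print(get_state_name({'D', 'A', 'B'}))       # ABD
print(is_final_dfa({'A', 'B', 'D'}, {'D'}))  # True
print(is_final_dfa({'A', 'B'}, {'D'}))       # False
```

Sorting before joining gives the same name for the same set regardless of the order in which its states were discovered.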

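Approach:

Create an object of the NFA class, nfa, and initialize it with predefined values using the constructor or with user input.
Initialize nfa.graph with nodes and edges according to the input values.
Display/render the NFA diagram.
Make another Digraph object, dfa, to store the resulting DFA and render its diagram.
Evaluate the epsilon closure of every state of the NFA up front so it is not recomputed each time, and store it in a dictionary with key-value pairs of the form [state -> list of states of closure]: epsilon_closure{}
Create a stack to keep track of the DFA states to evaluate next: dfa_stack[]
Add the epsilon closure of the start state of the NFA as the start state of the DFA.
Make a list to maintain all the states present in the DFA: dfa_states[]
Start a while loop that runs until there are no new states in dfa_stack.
We pop the top of dfa_stack to evaluate the current set of states: cur_state.
We iterate over all the alphabets for the current set of states.
We create a set that collects the states reached on this alphabet from every state in the current set: from_closure{}
If this set is not empty,
- We make another set, to_state, holding the union of the epsilon closures of those states.
- If this set does not exist in dfa_states, we append it to dfa_stack and dfa_states and add it as a node in the dfa diagram.
- Then we add an edge between cur_state and to_state.
Otherwise the set is empty, and
- This case is the dead state.
- If the dead state does not exist in the dfa yet, we add a new state $\phi$ . We add transitions on all alphabets to itself, so that the machine never leaves the dead state.
- We add a transition from cur_state to this dead state.
Finally, all the states have been evaluated, so we render/view the dfa diagram.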
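The dead-state handling in the approach above can be sketched in isolation. The function add_dead_state and its dictionary format are assumptions made for this illustration; the partial table is taken from the worked example.

```python
def add_dead_state(partial_delta, dfa_states, alphabet, dead='ϕ'):
    """Return a total transition dict: every missing (state, symbol) goes to `dead`."""
    total = dict(partial_delta)
    needs_dead = False
    for state in dfa_states:
        for symbol in alphabet:
            if (state, symbol) not in total:
                total[(state, symbol)] = dead
                needs_dead = True
    if needs_dead:
        # Self-loops on every symbol: the machine never leaves the dead state
        for symbol in alphabet:
            total[(dead, symbol)] = dead
    return total


# Transitions of the example DFA that lead to non-empty sets
partial = {('ABC', 'a'): 'ABC', ('ABC', 'b'): 'BD', ('ABC', 'c'): 'CD',
           ('BD', 'b'): 'BD', ('CD', 'c'): 'CD'}
total = add_dead_state(partial, ['ABC', 'BD', 'CD'], ['a', 'b', 'c'])
print(total[('BD', 'a')])  # ϕ
print(len(total))          # 12
```

The dead state is created only once and only when some transition actually needs it, which mirrors the check for an existing dead state in the approach.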

Below is the complete implementation:

Python3

# Conversion of epsilon-NFA to DFA and visualization using Graphviz
 
from graphviz import Digraph
 
class NFA:
    def __init__(self, no_state, states, no_alphabet, alphabets, start,
                 no_final, finals, no_transition, transitions):
        self.no_state = no_state
        self.states = states
        self.no_alphabet = no_alphabet
        # Copy the list so that appending epsilon below
        # does not modify the caller's list
        self.alphabets = list(alphabets)
         
        # Adding epsilon alphabet to the list
        # and incrementing the alphabet count
        self.alphabets.append('e')
        self.no_alphabet += 1
        self.start = start
        self.no_final = no_final
        self.finals = finals
        self.no_transition = no_transition
        self.transitions = transitions
        self.graph = Digraph()
 
        # Dictionaries to get index of states or alphabets
        self.states_dict = dict()
        for i in range(self.no_state):
            self.states_dict[self.states[i]] = i
        self.alphabets_dict = dict()
        for i in range(self.no_alphabet):
            self.alphabets_dict[self.alphabets[i]] = i
             
        # transition table is of the form
        # [From State + Alphabet pair] -> [Set of To States]
        self.transition_table = dict()
        for i in range(self.no_state):
            for j in range(self.no_alphabet):
                self.transition_table[str(i)+str(j)] = []
        for i in range(self.no_transition):
            self.transition_table[str(self.states_dict[self.transitions[i][0]])
                                  + str(self.alphabets_dict[
                                      self.transitions[i][1]])].append(
                                          self.states_dict[self.transitions[i][2]])
 
    # Method to get input from User
    @classmethod
    def fromUser(cls):
        no_state = int(input("Number of States : "))
        states = list(input("States : ").split())
        no_alphabet = int(input("Number of Alphabets : "))
        alphabets = list(input("Alphabets : ").split())
        start = input("Start State : ")
        no_final = int(input("Number of Final States : "))
        finals = list(input("Final States : ").split())
        no_transition = int(input("Number of Transitions : "))
        transitions = list()
        print("Enter Transitions (from alphabet to) (e for epsilon): ")
        for i in range(no_transition):
            transitions.append(input("-> ").split())
        return cls(no_state, states, no_alphabet, alphabets, start,
                   no_final, finals, no_transition, transitions)
 
    # Method to represent quintuple
    def __repr__(self):
        # Parentheses keep the whole concatenation in one return expression
        return ("Q : " + str(self.states) + "\nΣ : "
                + str(self.alphabets) + "\nq0 : "
                + str(self.start) + "\nF : " + str(self.finals)
                + "\nδ : \n" + str(self.transition_table))
 
    def getEpsilonClosure(self, state):
       
        # Method to get Epsilon Closure of a state of NFA
        # Make a dictionary to track if the state has been visited before
        # And a array that will act as a stack to get the state to visit next
        closure = dict()
        closure[self.states_dict[state]] = 0
        closure_stack = [self.states_dict[state]]
 
        # While stack is not empty the loop will run
        while (len(closure_stack) > 0):
           
            # Pop the top of the stack; this state is evaluated now
            cur = closure_stack.pop()
             
            # For the epsilon transition of that state,
            # if not present in closure array then add to dict and push to stack
            for x in self.transition_table[
                    str(cur)+str(self.alphabets_dict['e'])]:
                if x not in closure.keys():
                    closure[x] = 0
                    closure_stack.append(x)
            closure[cur] = 1
        return closure.keys()
 
    def getStateName(self, state_list):
       
        # Get name from set of states to display in the final DFA diagram
        name = ''
        for x in state_list:
            name += self.states[x]
        return name
 
    def isFinalDFA(self, state_list):
       
        # Method to check if the set of state is final state in DFA
        # by checking if any of the set is a final state in NFA
        for x in state_list:
            for y in self.finals:
                if (x == self.states_dict[y]):
                    return True
        return False
 
 
print("E-NFA to DFA")
 
# INPUT
# Number of States : no_state
# Array of States : states
# Number of Alphabets : no_alphabet
# Array of Alphabets : alphabets
# Start State : start
# Number of Final States : no_final
# Array of Final States : finals
# Number of Transitions : no_transition
# Array of Transitions : transitions
 
nfa = NFA(
    4,  # number of states
    ['A', 'B', 'C', 'D'],  # array of states
    3,  # number of alphabets
    ['a', 'b', 'c'],  # array of alphabets
    'A',  # start state
    1,  # number of final states
    ['D'],  # array of final states
    7,  # number of transitions
    [['A', 'a', 'A'], ['A', 'e', 'B'], ['B', 'b', 'B'],
     ['A', 'e', 'C'], ['C', 'c', 'C'], ['B', 'b', 'D'],
     ['C', 'c', 'D']]
   
    # array of transitions with its element of type :
    # [from state, alphabet, to state]
)
 
# nfa = NFA.fromUser() # To get input from user
# print(repr(nfa)) # To print the quintuple in console
 
# Making an object of Digraph to visualize NFA diagram
nfa.graph = Digraph()
 
# Adding states/nodes in NFA diagram
for x in nfa.states:
    # If state is not a final state, then border shape is single circle
    # Else it is double circle
    if (x not in nfa.finals):
        nfa.graph.attr('node', shape='circle')
        nfa.graph.node(x)
    else:
        nfa.graph.attr('node', shape='doublecircle')
        nfa.graph.node(x)
 
# Adding start state arrow in NFA diagram
nfa.graph.attr('node', shape='none')
nfa.graph.node('')
nfa.graph.edge('', nfa.start)
 
# Adding edge between states in NFA from the transitions array
for x in nfa.transitions:
    nfa.graph.edge(x[0], x[2], label='ε' if x[1] == 'e' else x[1])
 
# Renders the graph to nfa.pdf and opens the pdf
nfa.graph.render('nfa', view=True)
 
# Making an object of Digraph to visualize DFA diagram
dfa = Digraph()
 
# Finding epsilon closure beforehand so to not recalculate each time
# Sorted so that a set of states always maps to the same list
epsilon_closure = dict()
for x in nfa.states:
    epsilon_closure[x] = sorted(nfa.getEpsilonClosure(x))
 
 
# First state of DFA will be epsilon closure of start state of NFA
# This list will act as stack to maintain till when to evaluate the states
dfa_stack = list()
dfa_stack.append(epsilon_closure[nfa.start])
 
# Check if start state is the final state in DFA
if (nfa.isFinalDFA(dfa_stack[0])):
    dfa.attr('node', shape='doublecircle')
else:
    dfa.attr('node', shape='circle')
dfa.node(nfa.getStateName(dfa_stack[0]))
 
# Adding start state arrow to start state in DFA
dfa.attr('node', shape='none')
dfa.node('')
dfa.edge('', nfa.getStateName(dfa_stack[0]))
 
# List to store the states of DFA
dfa_states = list()
dfa_states.append(epsilon_closure[nfa.start])
 
# Loop will run till this stack is not empty
while (len(dfa_stack) > 0):
    # Taking the oldest pending state from the front of the list
    cur_state = dfa_stack.pop(0)
 
    # Traversing through all the alphabets for evaluating transitions in DFA
    for al in range((nfa.no_alphabet) - 1):
        # Set of NFA states reached from cur_state on this alphabet
        from_closure = set()
        for x in cur_state:
            # Performing Union update and adding all the new states in set
            from_closure.update(
                set(nfa.transition_table[str(x)+str(al)]))
 
        # Check if the set of states reached on this alphabet is not empty
        if (len(from_closure) > 0):
            # The To state in DFA is the union of the epsilon closures
            # of every state reached on this alphabet
            to_state = set()
            for x in from_closure:
                to_state.update(epsilon_closure[nfa.states[x]])

            # Sort so that the same set always gives the same list and name
            to_state = sorted(to_state)

            # Check if the to state already exists in DFA and if not then add it
            if to_state not in dfa_states:
                dfa_stack.append(to_state)
                dfa_states.append(to_state)

                # Check if this set contains final state of NFA
                # to get if this set will be final state in DFA
                if (nfa.isFinalDFA(to_state)):
                    dfa.attr('node', shape='doublecircle')
                else:
                    dfa.attr('node', shape='circle')
                dfa.node(nfa.getStateName(to_state))

            # Adding edge between from state and to state
            dfa.edge(nfa.getStateName(cur_state),
                     nfa.getStateName(to_state),
                     label=nfa.alphabets[al])
             
        # Else case for empty epsilon closure
        # This is a dead state(ϕ) in DFA
        else:
           
            # Check if any dead state was present before this
            # if not then make a new dead state ϕ
            if (-1) not in dfa_states:
                dfa.attr('node', shape='circle')
                dfa.node('ϕ')
 
                # For new dead state, add all transitions to itself,
                # so that machine cannot leave the dead state
                for alpha in range(nfa.no_alphabet - 1):
                    dfa.edge('ϕ', 'ϕ', nfa.alphabets[alpha])
 
                # Adding -1 to list to mark that dead state is present
                dfa_states.append(-1)
 
            # Adding transition to dead state
            dfa.edge(nfa.getStateName(cur_state),
                     'ϕ', label=nfa.alphabets[al])
 
# Renders the graph to dfa.pdf and opens the pdf
dfa.render('dfa', view=True)

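Output:

Figure: DFA diagram rendered by the program

Complexity of the algorithm converting an NFA to a DFA:

Time complexity: $O(2^Q\times N)$

Space complexity: $O(2^Q\times N)$

where Q = number of states of the NFA and N = number of alphabets of the NFA.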
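The exponential factor is not just a loose upper bound. A textbook worst case is the NFA over {a, b} that accepts strings whose k-th symbol from the end is 'a': it has k + 1 states, while the DFA built from it has $2^k$ reachable states. The sketch below counts them; the function name count_dfa_states and the state numbering 0 to k are assumptions of this example, which has no $\epsilon$ moves.

```python
def count_dfa_states(k):
    """Count the DFA states reachable from the start set for the NFA
    'k-th symbol from the end is a' (NFA states 0..k, final state k)."""
    def step(subset, symbol):
        nxt = {0}                    # state 0 loops on every symbol
        for s in subset:
            if s == 0 and symbol == 'a':
                nxt.add(1)           # guess: this 'a' is k-th from the end
            elif 1 <= s < k:
                nxt.add(s + 1)       # count the symbols that follow
        return frozenset(nxt)

    start = frozenset({0})
    seen, pending = {start}, [start]
    while pending:
        cur = pending.pop()
        for symbol in 'ab':
            target = step(cur, symbol)
            if target not in seen:
                seen.add(target)
                pending.append(target)
    return len(seen)


for k in range(1, 6):
    print(k + 1, count_dfa_states(k))
# 2 2
# 3 4
# 4 8
# 5 16
# 6 32
```

Each line shows the number of NFA states followed by the number of DFA states, which doubles every time one NFA state is added.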