使用Python检测突变

先决条件： Python中的随机数

以下文章描述了如何使用Python检测突变的 DNA 链。

使用的功能

generateDNASequence()：此方法使用 DNA 碱基 A、C、G、T 列表生成长度为 40 个字符的随机 DNA 链。此方法返回生成的 DNA 链。
applyGammaRadiation()：该方法将上述方法生成的DNA链作为输入参数，然后仅在随机生成的突变概率大于50％时才在随机位置改变链。所选位置的 DNA 碱基必须与替换它的 DNA 碱基不同。此方法返回更改后的 DNA 链。
detectMutation()：该方法将原始和改变的 DNA 链作为输入，并检查两个字符串是否被改变。如果字符串被改变，它会返回改变的 DNA 碱基的位置

入门

遵循以下步骤以实现我们所需的功能：

导入随机库
generateDNASequence()函数用于生成 DNA 链。 DNA 链是通过从可用选项列表中随机选择字符来生成的。当字符串长度变为 40 时，循环完成并返回 DNA 链。
applyGammaRadiation()函数改变 DNA 链。突变的可能性是随机产生的。如果随机产生突变的可能性大于 50，则发生突变，否则不发生突变。如果发生突变，则随机选择突变的位置。
接下来，将 DNA 链中的字符转换为列表。
获取突变位置的字符。由于提取的字符应该与替换它的字符不同，因此我们从可用选项列表中删除提取的字符，以便在其位置选择另一个字符。从列表中选择新的字符或 DNA 碱基。
原始 DNA 链字符再次附加到新列表中。新的 base/字符设置在 mutated 位置。
使用 join() 将 cl 列表中的字符再次转换为字符串。这是新的变异DNA字符串。
如果没有发生突变，原始 dna 与突变的 DNA 相同。
最后返回突变/未突变的 DNA。
然后用detectMutation()函数检测突变。在这个函数中，x 和 y 取 dna 和 cdna字符的每个字符进行逐字符比较。如果相同索引处的字符匹配，则增加计数。如果不匹配，则循环被破坏。
计数值指向突变位置之前的索引。如果 count=40 表示 2 条链的所有字符都匹配，因此没有突变如果 count 小于 40，则表示发生了突变。

下面是实现。

Python3

# import random library
import random
  
# function to generate dna strands
def generateDNASequence():
    
    # list of available DNA bases
    l = ['C', 'A', 'G', 'T']
    res = ""
    for i in range(0, 40):
        # creating the DNA strand by appending 
        # random characters from the list
        res = res + random.choice(l)
    return res
  
# function to alter dna strands
def applyGammaRadiation(dna):
    
    # possibility of mutation is generated randomly
    pos = random.randint(1, 100)
    cdna = ''
      
    # list of available DNA bases
    l = ['C', 'A', 'G', 'T']
      
    # if the possibility of mutation generated randomly
    # is >50 then mutation happens
    if(pos > 50):
        
        # the position where mutation will take place
        # is chosen randomly
        changepos = random.randint(0, 39)
        dl = []
          
        # the characters in DNA strand is converted to list
        dl[:0] = dna
          
        # the character at the determined mutation position 
        # is fetched.
        ch = "" + dl[changepos]
          
        # since the fetched character should be different from 
        # the one replacing it we remove the fetched character
        # from the list of available choices for choosing another
        # character in its place
        l.remove(ch)
          
        # the new character or DNA base is chosen from the list
        ms = random.choice(l)
        cl = []
          
        # DNA strand characters are again appended to a new list
        cl[:0] = dna
          
        # the new base in the mutated position is set
        cl[changepos] = ms
          
        # the characters in the cl list is converted to string again
        # this is the new mutated DNA string
        cdna = ''.join([str(e) for e in cl])
      
    # if possibility of mutation is less than 50% then no 
    # mutation happens
    else:
          
        # if no mutation occurs original dna is same as mutated dna
        cdna = dna
    return cdna
  
# function to detect mutation
def detectMutation(dna, cdna):
    count = 0
      
    # x and y take each character in dna and cdna
    # for character by character comparison
    for x, y in zip(dna, cdna):
         
        # if the character at the same index match
        # then the count is increased
        if x == y:
            count = count + 1
          
        # incase of mismatch the loop is broken
        else:
            break
  
    # the count value points to the index before the 
    # position of mutation
    return count
  
  
dna = generateDNASequence()
print(dna+" (Original DNA)")
cdna = applyGammaRadiation(dna)
print(cdna+" (DNA after radiation)")
count = detectMutation(dna, cdna)
  
# if count=40 it means all the characters of the 2 strands match
# hence no mutation
if count == 40:
    print("No Mutation detected")
  
# if count is less than 40
# it means mutation has occurred
else:
      
    # ^ denotes the position of mutation
    pos = "^"
    print(pos.rjust(count+1))
    print("Mutation detected at pos = ", (count+1))

输出