📜  bash how to trim every nth line - Shell-Bash code example

📅  Last modified: 2022-03-11 14:49:32.082000             🧑  Author: Mango

Code example 1
# Basic syntax:
awk '(FNR%4==0){print substr($column_number, start_character, number_characters);next} 1' your_file.txt > trimmed_file.txt
# Where:
#    - FNR is the current file record number (typically the line number)
#    - (FNR%4==0) checks whether the current line number is evenly divisible by
#        4, to effectively trim every 4th row
#    - For line numbers that are evenly divisible by 4, trim them with substr:
#        - column_number is a placeholder for the field number you want a
#            substring of (e.g. $1), or use $0 for the whole line
#        - start_character is the position of the first character from the
#            left (1-based, inclusive) that you want to include in the substring
#        - number_characters is the number of characters to print, counting
#            from the start character (inclusive)
#        - next skips all remaining statements in the awk program, causing
#            awk to move on to the next line of the file
#    - For line numbers that aren't evenly divisible by 4:
#        - 1 is an always-true pattern with no action, so awk falls back to
#            its default action and prints the line as is (same as print $0)
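
The same pattern can be parameterized so that the interval and the trimmed length are passed in with -v instead of being hardcoded. This is only a minimal sketch: the variable names n and len and the demo file demo.txt are assumptions made for illustration, not part of the original answer.

```shell
# Hypothetical demo input: eight short lines written to demo.txt
printf '%s\n' line-one line-two line-three line-four \
              line-five line-six line-seven line-eight > demo.txt

# n   = trim every nth line (assumed variable name)
# len = number of characters to keep on trimmed lines (assumed variable name)
awk -v n=4 -v len=4 'FNR % n == 0 { print substr($0, 1, len); next } 1' \
    demo.txt > demo_trimmed.txt
```

With n=4 and len=4, lines 4 and 8 of demo_trimmed.txt are cut to their first four characters and every other line passes through unchanged.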

# Example usage:
# Say you have a file with the following contents (four-line records) and
# want to trim the second and fourth line of each record, i.e. every
# even-numbered line, to a length of 30
@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=72
GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACCAAGTTACCCTTAACAACTTAAGGGTTTTCAAATAGA
+SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=72
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9ICIIIIIIIIIIIIIIIIIIIIDIIIIIII>IIIIII/
@SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=72
GTTCAGGGATACGACGTTTGTATTTTAAGAATCTGAAGCAGAAGTCGATGATAATACGCGTCGTTTTATCAT
+SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=72
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII6IBIIIIIIIIIIIIIIIIIIIIIIIGII>IIIII-I)8I
# Running:
# (FNR%2==0 already matches every 4th line, so no separate FNR%4==0 test is needed)
awk 'FNR%2==0{print substr($0,1,30);next} 1' your_file.txt > output.txt
# Would produce:
@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=72
GGGTGATGGCCGCTGCCGATGGCGTCAAAT
+SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=72
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII
@SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=72
GTTCAGGGATACGACGTTTGTATTTTAAGA
+SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=72
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII
# Note: the length=72 field in the header lines is no longer accurate after trimming
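
If the header lines should stay consistent with the trimmed reads, one possible approach is to hold each header back until the sequence has been trimmed and then rewrite its length= field. The sketch below is hedged: it assumes strict four-line records, assumes that length=<digits> sits at the very end of both header lines, and uses hypothetical file names reads.fastq and trimmed.fastq plus an assumed variable name len.

```shell
# Sample input: the first record from the example above
cat > reads.fastq <<'EOF'
@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=72
GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACCAAGTTACCCTTAACAACTTAAGGGTTTTCAAATAGA
+SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=72
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9ICIIIIIIIIIIIIIIIIIIIIDIIIIIII>IIIIII/
EOF

awk -v len=30 '
# Line 1 of each record: hold the header until the trimmed length is known
FNR % 4 == 1 { header = $0; next }
# Line 2: trim the sequence, fix the held header, then print both
FNR % 4 == 2 {
    seq = substr($0, 1, len)
    sub(/length=[0-9]+$/, "length=" length(seq), header)
    print header
    print seq
    next
}
# Line 3: the "+" header gets the same corrected length
FNR % 4 == 3 { sub(/length=[0-9]+$/, "length=" length(seq)); print; next }
# Line 4: trim the quality string to the same number of characters
{ print substr($0, 1, len) }
' reads.fastq > trimmed.fastq
```

Using length(seq) rather than len directly means a read that is already shorter than len keeps its true length in the header.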