I need your help. I have a FASTA file that stores a lot of genes of different lengths.
Here is an example:
'>ENSMUSG00000031109|X|49009707|49288259 ATGACGCTGCCTGTGTCTGATCCAGCTGCATGGGCCACAGCAATGAATAATCTTGGAATG GCTCCACTGGGAATTGCTGGACAACCAATTTTACCTGACTTCGATCCTGCCCTTGGGATG ATGACTGGAATACCACCAATAACTCCCATGATGCCGGGTTTGGGCATAGTCCCGCCACCG ATTCCTCCAGATATGCCGGTAGCAAAGGAGATCATACACTGCAAAAGCTGCACGCTCTTC CCTCCCAACCCAAATCTTCCACCACCTGCAACACGAGAAAGGCCACCAGGCTGTAAGACA GTGTTTGTGGGTGGCCTGCCTGAAAATGGGACAGAGCAGATCATTGTGGAAGTGTTTGAA CAGTGTGGAGAGATTATTGCTATCCGGAAGAGCAAAAAGAACTTCTGTCACATTCGCTTT AACTTCACAAAAGCACAACGTAAAAACATCAGTGTTTGGTGCAAACAAGCTGAGGAAATT'
I want to store these genes in a dictionary, with the title as the key and the sequence as the value. I was thinking of using a regex for the key, since all of the titles begin with >. Does anybody have an idea or tip on how best to do this?
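To make the goal concrete, here is a rough sketch of the kind of thing I have in mind, without regex: it just checks whether a line starts with > and joins the following lines into one sequence. It assumes the file has each title on its own line with the sequence on the lines after it; the function name read_fasta, the shortened sequences, and the second record (geneB) are made up for illustration only.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict mapping title -> sequence."""
    genes = {}
    header = None   # title of the record currently being read
    chunks = []     # sequence lines collected for that record

    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new title begins: store the previous record first.
            if header is not None:
                genes[header] = "".join(chunks)
            header = line[1:]  # drop the leading ">"
            chunks = []
        elif header is not None:
            chunks.append(line)

    # Store the last record, which has no following title to trigger it.
    if header is not None:
        genes[header] = "".join(chunks)
    return genes


# Small in-memory example; the second record is hypothetical.
example = """>ENSMUSG00000031109|X|49009707|49288259
ATGACGCTGCCTGTG
GCTCCACTGGGAATT
>geneB|1|100|200
ACGT
"""

genes = read_fasta(example.splitlines())
for title, sequence in genes.items():
    print(title, len(sequence))
```

For a real file, the same function should work on an open file handle, e.g. read_fasta(open_file) inside a with open(...) block, since it only iterates over lines.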