I have a data frame that I'm working with in which I'd like to compare a data point, `Genotype`, with two references, `S288C` and `SK1`. This comparison will be done across many rows (100+) of the data frame. Here are the first few lines of my data frame:

         Assay Genotype S288C SK1
    1 CCT6-002        G     A   G
    2 CCT6-007        G     A   G
    3 CCT6-013        C     T   C
    4 CCT6-015        G     A   G
    5 CCT6-016        G     G   T

As a final product, I'd like a character string of 1s (`S288C`) and 0s (`SK1`), depending on which of the references each data point matches. Thus, for the example above, I'd like an output of `00001`, since every row except the last matches `SK1`.
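
To make the goal concrete, here is a minimal sketch in R of the rule described above, run on the five sample rows. The names `df`, `calls`, and `result` are placeholders chosen for illustration, and treating a genotype that matches neither reference as `NA` is an assumption, since the sample contains no such row. It is one possible approach, not necessarily the best one.

```r
# Sample rows from the question; `df` is a hypothetical name for the data frame.
df <- data.frame(
  Assay    = c("CCT6-002", "CCT6-007", "CCT6-013", "CCT6-015", "CCT6-016"),
  Genotype = c("G", "G", "C", "G", "G"),
  S288C    = c("A", "A", "T", "A", "G"),
  SK1      = c("G", "G", "C", "G", "T"),
  stringsAsFactors = FALSE
)

# 1 where Genotype matches S288C, 0 where it matches SK1.
# Assumption: a genotype matching neither reference becomes NA,
# which paste() would render as the text "NA" in the final string.
calls <- ifelse(df$Genotype == df$S288C, 1L,
                ifelse(df$Genotype == df$SK1, 0L, NA_integer_))

# Collapse the per-row calls into a single character string.
result <- paste(calls, collapse = "")
print(result)
```

For the sample rows this gives `00001`. Because the comparisons are vectorised, the same lines should apply unchanged to the full 100+ row data frame.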