At this point, technical difficulties aside, we’ve got a bit more knowledge regarding how to navigate the SCC and how to gather data of various kinds from Ensembl, so let’s put that knowledge to use with some practice!
This homework assignment is meant both to stretch the skills you built in the past two labs and to prepare you for what's coming in the next lab. If you can't remember how to do something, check your Pre-Lab slides and the Lab 1 module.
To make things easier, I’ve also created an online interface where you can answer the questions.
Go to the Ensembl web page for the gene UCP1, and look at the variant table.
Go back to the variant table, and filter and sort it to find the Stop Gained variant with the highest minor allele frequency (MAF).
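If you'd like to double-check what you see on the web page, the same filter-then-sort logic can be sketched in a few lines of Python. This is only an illustration: the column names (`variant_id`, `consequence`, `global_maf`) and every row in the toy table are made up, so an exported Ensembl table will need its own headers swapped in.

```python
import csv
import io

# Toy stand-in for an exported variant table. The column names and all rows
# are invented for illustration; they are NOT real UCP1 variants.
TOY_TABLE = """variant_id,consequence,global_maf
var_A,missense variant,0.120
var_B,stop gained,0.002
var_C,stop gained,0.015
var_D,stop gained,
var_E,synonymous variant,0.300
"""


def rank_stop_gained(rows):
    """Return the stop gained rows that report a MAF, highest MAF first."""
    hits = []
    for row in rows:
        # Filter step: keep only the consequence type we care about.
        if row["consequence"].strip().lower() != "stop gained":
            continue
        maf_text = row["global_maf"].strip()
        # Skip variants with no reported frequency (blank or "-").
        if not maf_text or maf_text == "-":
            continue
        hits.append({**row, "global_maf": float(maf_text)})
    # Sort step: largest MAF at the top.
    return sorted(hits, key=lambda r: r["global_maf"], reverse=True)


rows = list(csv.DictReader(io.StringIO(TOY_TABLE)))
ranked = rank_stop_gained(rows)
print(ranked[0]["variant_id"], ranked[0]["global_maf"])
```

Note that variants with no MAF listed are dropped before sorting, so a blank cell never ends up at the top of the ranking.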
We should probably get a little more practice using tabix and vcftools to download data into our SCC space.
Please download the UCP1 data (for the whole gene) for a second sub-population you find interesting in the 1000 Genomes dataset. Of all the files generated, you should keep ONLY the final VCF file.
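As a memory aid, here is a rough Python sketch that only assembles and prints the two command lines; it does not run them. It assumes the workflow is the usual pattern of `tabix -h` on a remote VCF followed by `vcftools --keep ... --recode`, which may differ in detail from what we did in lab. The URL, region, sample-list file name, and output prefix are all placeholders for you to fill in, and the helper `build_commands` is hypothetical.

```python
import shlex

# Every value below is a placeholder (an assumption), not the answer.
VCF_URL = "<URL of the tabix-indexed 1000 Genomes VCF for the right chromosome>"
REGION = "<chrom>:<start>-<end>"        # UCP1 coordinates, in the VCF's assembly
SAMPLES = "my_population_samples.txt"   # hypothetical file: one sample ID per line
PREFIX = "UCP1_mypop"                   # hypothetical output prefix


def build_commands(url, region, samples, prefix):
    """Assemble the two-step download/subset commands as argument lists."""
    region_vcf = f"{prefix}_region.vcf"
    # Step 1: tabix -h prints the header plus only the records in the region.
    fetch = ["tabix", "-h", url, region]
    # Step 2: vcftools --keep restricts the file to the listed samples;
    # --recode writes the result to <prefix>.recode.vcf.
    subset = ["vcftools", "--vcf", region_vcf, "--keep", samples,
              "--recode", "--out", prefix]
    final_vcf = f"{prefix}.recode.vcf"
    return region_vcf, fetch, subset, final_vcf


region_vcf, fetch, subset, final_vcf = build_commands(VCF_URL, REGION, SAMPLES, PREFIX)
print(shlex.join(fetch), ">", region_vcf)
print(shlex.join(subset))
print("keep only:", final_vcf)
```

With this pattern, the leftovers to delete would typically be the intermediate region VCF, the index file tabix downloads into the working directory, the sample list, and the vcftools log, leaving just the `.recode.vcf` file.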
The file MUST be in your SCC working directory by the time we meet in class on Friday, October 5th!