Understanding File Formats in Bioinformatics: VCF and gVCF

11,454 views

Bioinformagician

A day ago

This is a quick video on a file format that is very commonly used in variant calling analysis: the VCF file. In this video, I go over the various fields in a VCF file using an example VCF, looking at how the data is organized and what information each field stores. In addition, I explain what genotypes are, the difference between phased and unphased genotypes, how to calculate alternate allele frequency, and how DNA variations are recorded. Lastly, I discuss what a gVCF file is and how it differs from a VCF file.
I hope you find this video helpful! Leave your thoughts in the comment section below!
FASTA/FASTQ format:
• Understanding Bioinfor...
SAM/BAM file format:
• Understanding Bioinfor...
Chapters:
0:00 Intro
0:40 What is a VCF file and how is it generated?
2:38 Main sections of a VCF file
3:27 Metadata section
5:51 Header line
6:51 Data lines - description of fields
13:13 Genes and alleles
14:30 Understanding genotype
15:33 What does genotype 2/0 or 1/2 mean?
17:02 Difference between GT:0/1 and GT:0|1 - phased vs unphased genotype
10:05 How are variants recorded in a VCF file?
22:01 Interpreting a record in VCF
24:45 Genomic VCF (gVCF)
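The data-line fields and the phased/unphased distinction listed in the chapters can be sketched in a few lines of Python. This is a minimal illustration rather than a full parser: the example record, the sample names, and the helper names `parse_record` and `is_phased` are made up for this sketch, and real files are better handled with a dedicated library.

```python
# The eight fixed columns of a VCF data line, in order.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_record(line, sample_names):
    """Split one tab-separated VCF data line into a dictionary (hypothetical helper)."""
    parts = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, parts[:8]))
    record["POS"] = int(record["POS"])        # position on the chromosome
    record["ALT"] = record["ALT"].split(",")  # one or more alternate alleles
    # INFO is a semicolon-separated list of key=value pairs or bare flags.
    info = {}
    for item in record["INFO"].split(";"):
        key, _, value = item.partition("=")
        info[key] = value if value else True
    record["INFO"] = info
    # FORMAT names the colon-separated subfields of every sample column.
    format_keys = parts[8].split(":")
    record["samples"] = {
        name: dict(zip(format_keys, column.split(":")))
        for name, column in zip(sample_names, parts[9:])
    }
    return record


def is_phased(gt):
    """A '|' separator marks a phased genotype, '/' an unphased one."""
    return "|" in gt


# Made-up record with two made-up samples, for illustration only.
line = "chr20\t14370\trs0000001\tG\tA\t29\tPASS\tDP=14;AF=0.5\tGT:DP\t0|1:8\t0/1:6"
record = parse_record(line, ["sampleA", "sampleB"])
for name, fields in record["samples"].items():
    print(name, fields["GT"], "phased" if is_phased(fields["GT"]) else "unphased")
```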
Like the videos I create? Show your support and encouragement by buying me a coffee:
www.buymeacoffee.com/bioinfor...
To get in touch:
Website: bioinformagician.org/
Github: github.com/kpatel427
Email: khushbu_p@hotmail.com
#bioinformagician #bioinformatics #vcf #gvcf #gatk #haplotype #alleles #variantcalling #geneticvariants #mutations #gff3 #gff #gtf #sam #bam #phred #fasta #fastq #singlecell #10X #ensembl #biomart #annotationdbi #annotables #affymetrix #microarray #affy #ncbi #genomics #beginners #tutorial #howto #omics #research #biology #GEO #rnaseq #ngs

Comments: 43
@mosesbaraza3369 3 months ago
A quite explicit and detailed explanation, very chronologically arranged. Looking forward to learning more in subsequent lessons.
@magdalineakinyi5928 7 months ago
I am a bioinformatics student, just began my studies, and I have really learnt a lot from your content 😊
@isadoramachadoghilardi3168 A year ago
Excellent video! I'm in love with your channel!! Congratulations!! I'm starting in this world of bioinformatics, and your videos have helped me a lot! Thank you!
@hubijohn7451 6 months ago
Am I glad I found this channel. Great stuff!
@Tekofilic A year ago
Had always been looking for such a video. Thank you so much :D
@josephinecudjoe3207 2 months ago
I have been blessed by your videos. Thank you.
@yuxiang4218 9 months ago
Very helpful! Thanks for sharing.
@seetarajpara7626 A year ago
I love your channel!! Your content is so well organized, thank you so much!
@alexandrakassis3525 A year ago
Thank you so much for sharing this information and your knowledge! Very much appreciated. Could you please make a video on joint variant calling? And also, what would you do for joint calling on RNA-seq data?
@user-ur6nm1fn6w 8 months ago
Thanks - great teaching.
@user-zv7cg1mn2i 8 months ago
Thanks a lot. It was very useful.
@giovannapg7532 A year ago
OMG such a good video!!! You explain everything so amazingly ❤ Could you please one day make a tutorial about dataset integration in Seurat, such as integrating 10X Genomics and Smart-seq2 data? Thank you!!
@Bioinformagician A year ago
Definitely have plans to make a video covering this. Thanks for the suggestion!
@jattpigeonscorner9368 A year ago
Thank you!
@abebemisganaw7377 A month ago
Exciting video. Could you upload another video about how to analyze data using VCF tools in a Linux environment?
@tapanbaral8939 A year ago
Really informative tutorial. Could you please make a video on TMB and MSI?
@minxie2210 A year ago
Thank you for the great video. One quick question regarding the "What does genotype 2/0 or 1/2 mean?" section. In the 4 examples you have given, should the second one be C/T instead of C/A, going by the genotype numbers? Thanks again, really appreciate your effort in making all the great videos!!
@biomagician 4 months ago
Absolutely fantastic video! Thank you! Does a gVCF always respect the VCF format or is there a distinct gVCF format? Can you tell us more about the multi-sample VCF formats jVCF and MSVCF? Thanks!
@faezedarbaniyan1787 29 days ago
Thank you so much for elaborating on this. I can't relate the definition of allele frequency that you mentioned here to rows 2 and 3 in your sample (at 23:44). Can you please explain it for those?
@humarafique3093 5 months ago
Really, really amazing and informative video for beginners. At 16:40, at position 491520 where the GT is 1/2, shouldn't it be C/CAC instead of CAC/C?
@alexandrakassis3525 A year ago
Where can I find the PowerPoint slides you use in your videos?
@user-up1sm2uh2r A year ago
Such a great lecture! I am just wondering if there is a typo at 17:00, the second row of the table at 332470 position. It has to be C/T not C/A or is there anything I missed?
@Bioinformagician A year ago
Yes, that is a typo. It should have been T instead of A.
@AshishKumar-el8sb A year ago
If I have inserted part of the same genome into a genome, how can I find it?
@nabildhifallah361 8 months ago
Yes, I found this video helpful, because I can use all the information about the chromosome and the position of the single nucleotide on that chromosome (ALT) compared with the reference DNA sequence. With that I can clearly see whether I have an insertion, conversion, or deletion in the DNA sample. Thank you for your excellent explanation of the metadata lines, the header, and the format. Thank you.
@AshishKumar-el8sb A year ago
How do I extract all the genes from the genome files?
@stemcell1167 A year ago
Is there a way to get the allele frequency for each sample in a multi-sample VCF file, or is there a way to get AO and RO?
@sauravroy3420 A year ago
You can split the samples using bcftools and then use them accordingly.
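As a rough alternative for the per-sample question in this thread, the fraction of reads supporting each alternate allele can be computed directly from a sample column. This is only a sketch under an assumption: it presumes the FORMAT column includes an AD subfield (comma-separated read depths for REF and each ALT allele, as GATK writes). Callers that report RO and AO instead would need those two values in place of AD. The function name and example values are made up.

```python
def alt_allele_fractions(format_field, sample_field):
    """Per-sample alternate allele fractions from an AD subfield (hypothetical helper).

    Returns one fraction per ALT allele, or None when AD is absent,
    missing ('.'), or the total depth is zero.
    """
    values = dict(zip(format_field.split(":"), sample_field.split(":")))
    ad = values.get("AD", ".")
    counts = ad.split(",")
    if "." in counts:
        return None
    depths = [int(count) for count in counts]
    total = sum(depths)
    if total == 0:
        return None
    # depths[0] is the REF depth; the rest are the ALT depths, in ALT order.
    return [depth / total for depth in depths[1:]]


# Made-up sample columns: 12 reference reads and 8 alternate reads.
print(alt_allele_fractions("GT:AD:DP", "0/1:12,8:20"))
# A sample with no AD value reported.
print(alt_allele_fractions("GT:AD:DP", "./.:.:."))
```

Note that this is a read-level fraction within one sample, which is not the same quantity as the population-level AF value stored in the INFO column.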
@kajalpanchal8239 A year ago
Everything is so good, but am I the only one facing a sound issue? Please note that your sound level is really low. Otherwise you are a saviour.
@Bioinformagician A year ago
Thank you for pointing it out. I will try to maintain optimal sound levels for my future videos :)
@mostafaismail4253 A year ago
Can you make a tutorial on BS-seq and copy number variations (CNV)? It would be great if you did 💛 Thanks so much.
@mostafaismail4253 A year ago
You really are a life saver for my tasks.
@Bioinformagician A year ago
Thanks for the suggestion, I will surely consider covering these topics in future videos :)
@sonalvishwakarma30 A year ago
I want to make a request. Could you please make videos on RepeatMasker? It would be really helpful.
@anmolpardeshi3138 A year ago
16:59 - 332470 - shouldn't that be CT or TC - since, for that position, T is reference allele (0) and C is 1st alternate allele (1) - how did you get C/A?
@Bioinformagician A year ago
It’s a typo. It should be T
@anmolpardeshi3138 A year ago
@@Bioinformagician thanks for the clarification and wonderful videos. I'm trying to make such an effort too. One suggestion would be to pin such clarifications so that they are not lost in a myriad of comments.
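The rule behind this thread (0 is the REF allele, 1 the first ALT allele, 2 the second, and so on) can be checked with a short Python sketch. The function name is hypothetical, and the alleles are taken from the comment above (T as reference, C as first alternate), with a second alternate allele added purely for illustration.

```python
def decode_genotype(gt, ref, alts):
    """Translate GT allele indices into bases (hypothetical helper).

    Index 0 is the REF allele; index n is the n-th entry of ALT.
    The separator ('/' unphased, '|' phased) is preserved.
    """
    alleles = [ref] + alts
    separator = "|" if "|" in gt else "/"
    called = [
        "." if index == "." else alleles[int(index)]
        for index in gt.split(separator)
    ]
    return separator.join(called)


# REF = T and first ALT = C, as described in the comment above.
print(decode_genotype("1/0", "T", ["C"]))       # C/T
print(decode_genotype("0|1", "T", ["C"]))       # T|C
# A made-up second ALT allele (A) shows how index 2 is resolved.
print(decode_genotype("1/2", "T", ["C", "A"]))  # C/A
```

For unphased genotypes the order of the two alleles carries no meaning, so C/T and T/C describe the same call.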
@AshishKumar-el8sb A year ago
What does chrM denote?
@vinaydeep26 A year ago
Is the position of the variant given with respect to the chromosome or the whole reference? If a record says chr20, position 1000, is that counted from the start of the reference or of the chromosome?
@Bioinformagician A year ago
Position on the chromosome
@njagimwaniki4321 2 months ago
How can a VCF record exist where the genotype is 0|0 ? Doesn’t that mean that both the chromosomes match the reference?
@MuhammadFaizan-mi9yo A year ago
I have a very serious query: I got stuck at a point, and all my projects are halted because of it. I know you can answer my query. If you are willing to help, please reply and I will post my query, madam. I would be obliged to you; please take this as a request.
@jeetnanshi4357 5 months ago
I'm sorry, but the tone is very monotonous. Use a marker or please take a break :(