The adventure continues; here’s what I’ve been working on lately.
Yesterday, I gave a talk at the Aberystwyth Bioinformatics Workshop on the metahaplome: a graph-inspired structure for encoding the variation of single nucleotide polymorphisms (SNPs) observed across aligned sequenced reads.
I was bemused to find a Linux live disk unable to identify the storage volume on my new Dell XPS 13 laptop. It seemed I needed to change the SATA controller mode from `RAID` to `AHCI`, but Windows had other ideas. Unable to find a solution online that didn’t cause a boot BSOD, I found my own.
I wanted a BAM that contained reads aligned to just one of the many contigs the file contained. As usual, I made this much more difficult than it really ought to have been.
This afternoon I wanted to quickly check whether some reads in a BAM would be filtered out by the GATK `MalformedReadFilter`. Turns out that GATK is pretty unforgiving if you forget that the filter is automatically applied by `PrintReads`.
Recently I’ve been following the GATK DNASeq Best Practice Pipeline for my limpet sequence data. Here are some of the mistakes I made and how I made them go away.
I’m still alive, here’s some things what I’ve done.
My datasets are so big, they trigger an NFS bug in the kernel.
I purchased a Fitbit Charge HR this morning. Although primarily motivated by my partner’s latest pursuit of improved fitness, I guess any excuse for self-quantification is a good one for a statistician like me.