Goldilocks: A tool for identifying genomic regions that are “just right”
I’m published! I’m a real scientist now! Check out the application note in Bioinformatics.
Status Report: February 2016
The adventure continues. Here’s what I’ve been working on lately.
Meet the Metahaplome
Yesterday, I gave a talk at the Aberystwyth Bioinformatics Workshop on the metahaplome: a graph-inspired structure for encoding the variation of single nucleotide polymorphisms (SNPs) observed across aligned sequenced reads.
How to switch SATA controller driver from RAID to AHCI on Windows 10 without a reinstall
I was bemused to find a Linux live disk unable to identify the storage volume on my new Dell XPS 13 laptop. It seemed I needed to change the SATA controller mode from `RAID` to `AHCI`, but Windows had other ideas. Unable to find a solution online that didn’t cause a boot BSOD, I found my own.
How (not) to subset a BAM for GATK
I wanted a BAM that contained reads aligned to just one of the many contigs the file contained. As usual, I made this much more difficult than it really ought to have been.
Duplicate definition error with GATK PrintReads and MalformedReadFilter
This afternoon I wanted to quickly check whether some reads in a BAM would be filtered out by the GATK `MalformedReadFilter`. Turns out that GATK is pretty unforgiving if you forget that the filter is automatically applied by `PrintReads`.
Deeper
Grokking GATK: Common Pitfalls with the Genome Analysis Toolkit (and Picard)
Recently I’ve been following the GATK DNASeq Best Practice Pipeline for my limpet sequence data. Here are some of the mistakes I made and how I made them go away.
Status Report: October 2015
I’m still alive, here’s some things what I’ve done.
Size Matters
My datasets are so big, they trigger an NFS bug in the kernel