Duplicati: not a valid Win32 FileTime
You might have landed here searching for the meaning of “not a valid Win32 FileTime” while trying to run a Duplicati backup. The solution is to find files with an invalid access time (atime). Here’s how I diagnosed and fixed the problem with my Samba backup.
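As a rough sketch of what “finding files with an invalid atime” could look like, the snippet below walks a directory and reports files whose access time falls outside a plausible window. It assumes the offending timestamps are either earlier than the Win32 FILETIME epoch (1 January 1601) or implausibly far in the future; the function name `find_suspect_atimes` and the one-day clock-skew allowance are illustrative choices, not anything Duplicati itself provides.

```python
import os
import time

# Seconds between the Win32 FILETIME epoch (1601-01-01) and the Unix epoch.
# Anything earlier than this cannot be expressed as a FILETIME at all.
WIN32_EPOCH_AS_UNIX = -11644473600


def find_suspect_atimes(root, earliest=WIN32_EPOCH_AS_UNIX, latest=None):
    """Yield (path, atime) for files whose atime lies outside [earliest, latest]."""
    if latest is None:
        # Assumption: allow a day of clock skew before calling an atime bogus.
        latest = time.time() + 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                atime = os.lstat(path).st_atime  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if atime < earliest or atime > latest:
                yield path, atime
```

Printing each result of `find_suspect_atimes("/path/to/share")` gives a list of candidates to inspect; once confirmed, running `touch -a` on each one should reset its access time to now.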
Quick fix for CrashPlan Linux segfault
If, like me, you are still tolerating CrashPlan as your backup solution, it’s likely because it is one of the few providers that makes backing up from Linux straightforward. However, part of the Linux CrashPlan experience appears to be encountering segfaults. Here’s the fix.
Grokking GATK: Common Pitfalls with the Genome Analysis Toolkit (and Picard)
Recently I’ve been following the GATK DNASeq Best Practice Pipeline for my limpet sequence data. Here are some of the mistakes I made and how I made them go away.
Secure your Six
Like many financially constrained students, I use Apache’s support for Server Name Indication (SNI) to serve multiple SSL domains from one IP. I’m somewhat competent and the setup seems to work for all of my domains. Yet, some time ago I tried to access one of my VirtualHosts from work over SSL […]
When `True` is not `True`
Today, whilst continuing development on Goldilocks, I discovered a minor oddity that left me a little confused and bemused before lunch: True did not appear to be True… Part of Goldilocks’ functionality allows for the filtering of results; users may specify a dictionary of criteria whose keys map to functions to be applied to result […]
Playing Phylogenetic Hide and Seek with Protozoa
Amanda suggested that alongside archaeal, bacterial and fungal associated hydrolases, we should also look at protozoans. No problem, I’ll just get the taxonomy ID for protozoa and extract another database from UniProtKB as before. Simple! Or so I thought… The rabbit hole is pretty deep on this one. Feel free to skip my multi-day exploration […]
TrEMBLing
Something appears amiss with TrEMBL: millions of sequences are “missing”. Where did they go? At the end of last month, to build a database of bacterial sequences with known hydrolase activity, I extracted around 2.9 million sequences from UniProtKB/TrEMBL, a popular database which contains sequences that have been automatically annotated and are awaiting manual curation […]