Ghostbusting

   Sam Nicholls    No Comments yet    Meta

Shortly after setting up this blog, I embedded Google Analytics tracking, primarily because I like numbers but also in the hope of discovering that at least one other person who isn’t me or one of my supervisors is interested in my adventures. It’s also great writing practice and gives me the chance to properly think through the […]

Pipelines

   Sam Nicholls
[text: 'pipelines', photograph of a standard bioinformatics pipeline transforming data from one mess to another]

What am I doing?

   Sam Nicholls    No Comments yet    AU-PhD

A week ago I had a progress meeting with Amanda and Wayne, who make up the supervisory team for the computational face of my project. I talked about how computers are terrible and where the project is heading. As Wayne had been away from meetings for a few weeks, I began with a roundup of […]

`memblame`

   Sam Nicholls    No Comments yet    System Administration, Tools

As a curious and nosy individual who likes to know everything, I wrote a script dubbed `memblame`, which names and shames the authors of “inefficient” jobs on our cluster here in IBERS. It takes patience, often days and sometimes longer, to see large-input jobs executed on a node on the compute cluster […]

Scratch

   Sam Nicholls
[text: '~/scratch/', photograph of you desperately trying to pack your data for a programmatic excursion only to find that the airline charges by the bit for hold luggage]

TrEMBLing

   Sam Nicholls    No Comments yet    Bioinformatics, Mysteries

Something appears amiss with TrEMBL: millions of sequences are “missing”. Where did they go? At the end of last month, to build a database of bacterial sequences with known hydrolase activity, I extracted around 2.9 million sequences from UniProtKB/TrEMBL, a popular database which contains sequences that have been automatically annotated and are awaiting manual curation […]

The Story so Far: Part I, A Toy Dataset

   Sam Nicholls    No Comments yet    AU-PhD

In this somewhat long and long-overdue post, I’ll attempt to explain the work done so far, give an overview of the many issues encountered along the way, and offer some insight into why doing science is much harder than it ought to be. This post got a little longer than anticipated, so I’ve sharded […]

Exit codes, core dumps, `set -e` and `expr`

   Sam Nicholls    One Comment    AU-PhD

The kernels on our cluster clients have recently been updated after I inadvertently stumbled across an old kernel bug that caused erratic behaviour when NFS tried to open a directory containing many files that were being written to simultaneously (more on which is another post in itself really, as usual). The update seems to have […]