Automatically keeping a GitHub fork up to date

We recently set up a departmental GitHub account for Hutton ICS, and one of the things we'll use it for is to showcase projects that ICS staff contribute to - such as Biopython in my case.

To start with we have forked https://github.com/biopython/biopython as https://github.com/huttonics/biopython which we'll use as a read-only mirror - but now we want to keep it up to date with commits pushed to the upstream repository.

How can we automatically mirror the upstream repository? Enter GitHub Deploy Keys, which grant read/write access on a per-repository basis, so a cron job can use one to push changes to our mirrored git repository.
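
As a rough illustration of what such a cron job might run, here is a minimal Python sketch. It is not the script from the full post. It assumes a local clone whose remotes are named "upstream" (the original project) and "origin" (the fork, reached over SSH with the deploy key), and a branch called "master". All three names are assumptions, as are the helper names mirror_commands and sync.

```python
import subprocess
from typing import List


def mirror_commands(branch: str = "master",
                    upstream: str = "upstream",
                    mirror: str = "origin") -> List[List[str]]:
    """Build the git commands for one sync pass.

    The remote and branch names are assumptions for this sketch;
    adjust them to match the local clone.
    """
    return [
        # Download new commits from the upstream repository.
        ["git", "fetch", "--prune", upstream],
        # Push the fetched branch tip to the same branch on the mirror.
        ["git", "push", mirror,
         f"refs/remotes/{upstream}/{branch}:refs/heads/{branch}"],
    ]


def sync(repo_dir: str, branch: str = "master") -> None:
    """Run one sync pass inside repo_dir, stopping on the first failure."""
    for cmd in mirror_commands(branch):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

A crontab entry could then call a small script that invokes sync() on the clone's directory at whatever interval suits the project.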


What BLAST's max-target-sequences doesn't do

This is a short post to highlight a scary BLAST+ -max_target_seqs bug found and reported by Sujai Kumar, which he discovered in the course of working on some puzzling Blobtools output while analysing the tardigrade genome.


BLAST XML 2 - does the sequel live up to my hopes?

Last year I wrote a blog post "BLAST XML output needs more love from NCBI", and in numerous updates to it tracked the NCBI outreach and then the release of BLAST XML 2.

The new output format was included in BLAST+ 2.2.31 as output format 15, without any kind of beta release for user feedback. Later than planned, I was able to give this a try during the Galaxy Community Conference 2015 Hackathon. Sadly the worries voiced on the OBF Bio* mailing lists were well founded.

In part because XML is so verbose, it is nice to be able to parse it as a stream - meaning capturing the output via stdout and Unix pipes. That appears to be "broken". In fact, producing a bundle of XML files using XInclude seems a recipe for trouble.
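
To show what parsing XML as a stream means in practice, here is a small Python sketch using the standard library's incremental parser. The toy document and the element name "Hit" are placeholders chosen for illustration, not real BLAST output, and count_elements is a hypothetical helper.

```python
import io
import xml.etree.ElementTree as ET


def count_elements(stream, local_name):
    """Count elements named local_name while reading stream incrementally."""
    count = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # Drop any "{namespace}" prefix before comparing tag names.
        if elem.tag.rsplit("}", 1)[-1] == local_name:
            count += 1
            # Discard this element's children and text once it is handled.
            elem.clear()
    return count


# Toy input standing in for a single XML document arriving on a pipe.
toy = io.BytesIO(b"<results><Hit><id>a</id></Hit><Hit><id>b</id></Hit></results>")
print(count_elements(toy, "Hit"))
```

When the whole result is a single XML document, the same function can be handed sys.stdin.buffer in place of the toy stream and fed from a Unix pipe. That approach does not carry over directly to output that is split across several files.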


NCBI working on SAM output from BLAST+

Recently NCBI BLAST+ 2.2.31 was released, and it contains an undocumented "Easter Egg" - this is still very rough around the edges but they're working on SAM format output!


PrePrint: SAM/BAM format v1.5 extensions for de novo assemblies

Here's a little back-story on my latest preprint (based on my email to samtools-devel), which went live on the biology preprint server bioRxiv at the end of last week:
SAM/BAM format v1.5 extensions for de novo assemblies.
Peter J. A. Cock, James K. Bonfield, Bastien Chevreux, Heng Li.
bioRxiv DOI: 10.1101/020024
The current version is a terse three pages (trying to meet an "application note" page limit), but nevertheless should clarify the intended usage of these parts of the SAM/BAM specification.