REVIEW - Sequence Analysis in a Nutshell: A Guide to Common Tools and Databases


Sequence Analysis in a Nutshell

A Guide to Common Tools and Databases


Scott Markel, Darryl Leon


O'Reilly (2003)




Ivan Uemlianin


April 2004



On the whole, it's a good 'nutshell'.

This book is a digest of data format and command-line parameter summaries for a selection of bioinformatics software packages. Its intended audience is working bioinformaticians already familiar with the software. For this audience I can recommend it, as the online documentation that comes with the tools is often lacking. The book is *not* introductory or tutorial in any sense, except that it's easier to browse through a 286-page paperback than it is to wade through a handful of help files.

The book is in three parts: Data Formats, Tools, and "Appendixes".

Part 1 covers the DNA formats FASTA, GenBank/DDBJ and EMBL, and the protein formats SWISS-PROT, Pfam and PROSITE. Part 2 covers: Readseq; BLAST, BLAT, and ClustalW; HMMER; MEME and MAST; and the European Molecular Biology Open Software Suite (EMBOSS, which gets 170 pages). The Appendices cover background material including DNA and amino acid codes, and genetic codes for various organisms. (See the longer review on the web for more discussion, but if you don't know what any of this means, this is not the book for you.)

This documentation is available elsewhere, but not readily, so the book is more than a bunch of man pages in perfect binding. The information is very terse but, if you're working frequently with these tools, that's probably exactly what you want. There is just enough annotation to prod your memory, but no more.

The authors, both practising bioinformaticians, have set out to produce a 'nutshell' book that they would use themselves, and this accounts for the book's strengths. It also accounts for its two main weaknesses. The narrowness of focus is the first of these, and if your needs are fulfilled by the focus, it may not be a weakness at all.

More seriously, there are some hidden gaps in coverage. Some examples: the Pfam chapter does not cover the actual sequence format (i.e. only the #=GF section is covered, not the #=GS and #=GC sections); the BLAST chapter covers only NCBI-BLAST (and not e.g. WU-BLAST); the EMBOSS chapter doesn't mention the EMBASSY range of open source programs.
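To make the Pfam gap concrete: Pfam alignments are distributed in Stockholm format, where #=GF lines annotate the whole file, #=GS lines annotate individual sequences, and #=GC lines annotate alignment columns. The following is only a minimal sketch of how those line types sit around the sequence lines. The record in `SAMPLE` is an invented toy, and `group_stockholm_lines` is a hypothetical helper of my own; neither is taken from the book or from any of the tools it covers.

```python
from collections import defaultdict

# Invented toy record for illustration only; not real Pfam data.
SAMPLE = """\
# STOCKHOLM 1.0
#=GF ID   EXAMPLE
#=GF DE   Hypothetical toy family
#=GS seq1/1-8 AC X00001.1
#=GS seq2/1-8 AC X00002.1
seq1/1-8     ACDEFGHI
seq2/1-8     ACDE-GHI
#=GC SS_cons CCHHHHCC
//
"""


def group_stockholm_lines(text):
    """Group the lines of one Stockholm-style record by line type."""
    groups = defaultdict(list)
    for line in text.splitlines():
        line = line.rstrip()
        if not line or line == "//":
            continue  # skip blank lines and the record terminator
        if line.startswith("#=G"):
            tag = line[:4]  # "#=GF", "#=GS", "#=GR" or "#=GC"
            groups[tag].append(line[4:].strip())
        elif line.startswith("#"):
            groups["header"].append(line)  # e.g. "# STOCKHOLM 1.0"
        else:
            # Sequence line: a name, whitespace, then the aligned residues.
            name, seq = line.split(None, 1)
            groups["sequence"].append((name, seq))
    return dict(groups)


groups = group_stockholm_lines(SAMPLE)
for tag in sorted(groups):
    print(tag, len(groups[tag]))
```

A chapter that documents only the #=GF lines leaves the per-sequence, per-column and sequence lines in a record like this unexplained.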

The appendices might have been cut and pasted from the authors' own post-its. Citations are given, but the reproductions are perfunctory. If you know the sources this will not matter, but it doesn't make for readable documentation. The 'Resources' appendix is a mess. *Some* of the references from the rest of the book are given here, along with a mixture of introductory texts, specialist monographs and a few from O'Reilly.

Faults notwithstanding, the book fulfils its aim reasonably well, albeit with some rough edges; do check that the tools you use are included. On the whole, it's a good 'nutshell'. Recommended.

