From version 2.0, beadarray provides more flexibility in the processing of array images and the extraction of bead intensities than its predecessor. Previously, intensity extraction from array images by beadarray attempted to emulate that performed by Illumina, with minimal opportunity for deviation. Whilst the default approach in beadarray is still to emulate Illumina, each step is now modular to allow the user greater flexibility. This vignette shows how one can read the TIFF images from the BeadArray scanner and implement alternative feature intensity extraction algorithms.
The first step in a pipeline for image processing is to read both the TIFF image and the bead-level text file. The text file contains the identity of each bead, as well as the bead-centre coordinates. The image processing methods contained within beadarray use these coordinates as seed points for intensity extraction, although one can conceive of approaches where bead-centres are calculated separately, prior to intensity extraction. Even with such an approach, the Probe ID for each bead will still need to be extracted from the .txt file.
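As a hedged sketch of this first step, the two inputs might be read as follows. It assumes that beadarray's readTIFF() and readBeadLevelTextFile() each accept a file path as their first argument; the wrapper name readBeadInputs and any file names passed to it are hypothetical.

```r
# Hypothetical wrapper (not part of beadarray): read the image and the
# bead-level text file that belong to one array section.
readBeadInputs <- function(tiffFile, txtFile) {
  # Fail early with a clear message if either file is absent.
  paths <- c(tiffFile, txtFile)
  absent <- paths[!file.exists(paths)]
  if (length(absent) > 0) {
    stop("file(s) not found: ", paste(absent, collapse = ", "))
  }
  list(
    # Assumption: readTIFF() returns the pixel values as a matrix.
    tiff = beadarray::readTIFF(tiffFile),
    # Assumption: the result holds the Probe IDs and bead-centre coordinates.
    data = beadarray::readBeadLevelTextFile(txtFile)
  )
}
```

The elements of the returned list could then be assigned to tiff and data for use in the later steps.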
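The standard method employed by Illumina’s scanner for calculating bead intensity is a four step process, described in Kuhn et al. [1]. It can be summarized as: (1) calculate a background value for each bead; (2) sharpen the image; (3) calculate a foreground value for each bead from the sharpened image; (4) subtract the background from the foreground.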
If the function readIllumina() is called with useImages = TRUE then intensities are extracted using code that gives a very close emulation of that used by Illumina. If one wished to perform this calculation oneself (outside of readIllumina()), it can be done using the following code.
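# 'tiff' is the matrix of pixel values from the TIFF image; columns 3 and 4
# of 'data' hold the bead-centre coordinates from the bead-level text file.
bg <- illuminaBackground(tiff, data[,3:4])       # background value per bead
tiffSharp <- illuminaSharpen(tiff)               # sharpen the image
fg <- illuminaForeground(tiffSharp, data[,3:4])  # foreground from the sharpened image
finalIntensity <- fg - bg                        # background-corrected intensity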
Each of the functions above takes a matrix representing the pixel values from the TIFF image as its first argument. The background and foreground algorithms additionally take a two-column matrix containing the coordinates of the bead centres. To calculate intensities for only a subset of the beads, supply only the appropriate bead-centres in this step.
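A minimal sketch of selecting such a subset is given below. It uses a small synthetic matrix in place of the real bead-level data; the column names, Probe IDs and coordinate values are invented for illustration, and only the position of the coordinates in columns 3 and 4 follows the code above.

```r
# Synthetic stand-in for the bead-level text file contents.
# Assumption: columns are ordered Probe ID, intensity, x, y, so that
# columns 3 and 4 hold the bead-centre coordinates, as in data[,3:4].
data <- cbind(
  ProbeID = c(10008, 10008, 10010, 10013, 10010),
  Grn     = c(512, 498, 1200, 87, 1175),
  GrnX    = c(101.3, 250.7, 33.2, 410.9, 77.5),
  GrnY    = c(15.8, 302.1, 221.4, 98.6, 340.0)
)

# Keep only the beads for one probe; drop = FALSE keeps a two-column
# matrix even when a single bead matches.
keep <- data[, "ProbeID"] == 10010
subsetCoords <- data[keep, 3:4, drop = FALSE]

print(subsetCoords)
```

Here subsetCoords would be supplied in place of data[,3:4] in the background and foreground calls.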
After calculating intensities they need to be inserted into a beadLevelData object. The code below shows how to create a new object and insert intensity values. However, this approach creates an empty beadLevelData object, which will lack any information except that which the user manually inserts.
BLData <- new(Class = "beadLevelData")
BLData <- insertBeadData(BLData, array = 1, what = "Grn", data = finalIntensity)
An easier alternative to creating your own beadLevelData object is to use the function readIllumina() to read the data as described in the main vignette. This ensures that any available data (such as sample IDs, scanner metrics, grid sizes etc.) are read in and stored. One can then choose to overwrite the values generated by readIllumina(), or store alternative intensities alongside them.
The example below first reads the data using the standard arguments to readIllumina(), which will extract the intensities from the .txt file. The second step overwrites those intensities with those we calculated previously (which should be very similar). The final command creates a new entry in the beadLevelData object (referred to as GrnLog) that stores the log transform of the values we calculated earlier. In this way the user can store a variety of intensity values if they wish to experiment with alternative forms of background subtraction, gradient removal etc.
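A minimal sketch of these three steps is given here, wrapped in a function so that nothing is read until it is called. The wrapper name storeIntensities and its dir argument are hypothetical, the use of log2 for the log transform is an assumption, and finalIntensity is assumed to hold one value per bead on the first array.

```r
# Hypothetical wrapper (not part of beadarray) around the three steps.
# dir: directory holding the scanner output for the chip (assumption).
# finalIntensity: numeric vector, assumed to hold one value per bead on array 1.
storeIntensities <- function(dir, finalIntensity) {
  stopifnot(is.numeric(finalIntensity))
  # 1. Read intensities and any available metadata from the .txt files.
  BLData <- beadarray::readIllumina(dir = dir)
  # 2. Overwrite the stored intensities with those calculated earlier.
  BLData <- beadarray::insertBeadData(BLData, array = 1, what = "Grn",
                                      data = finalIntensity)
  # 3. Store the log-transformed values alongside them under a new name.
  #    Non-positive intensities give NaN or -Inf at this step.
  BLData <- beadarray::insertBeadData(BLData, array = 1, what = "GrnLog",
                                      data = log2(finalIntensity))
  BLData
}
```

Because background subtraction can produce zero or negative values, it may be worth checking finalIntensity before relying on the GrnLog entry.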
The examples above have focused on applying the same intensity extraction algorithms that are employed by Illumina. However, one may wish to employ an alternative algorithm to test its performance. The example below implements an alternative method of calculating the background intensity values, as recommended by Smith et al. [2].
We can then use the new background intensities in the same way as previously, before inserting them into the beadLevelData object.
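To illustrate how a background step can be swapped, the following base R sketch estimates a local background for each bead from the dimmest pixels in a window around its centre, with the summary function left as an argument. It is not the beadarray implementation, nor is it claimed to be the method of Smith et al.; the function name localBackground, the window size, the number of pixels used and the 1-based coordinate convention are all assumptions made for illustration.

```r
# Hypothetical helper (not part of beadarray): for each bead centre, take a
# square window of pixels and summarise its dimmest values.
localBackground <- function(pixels, coords, halfWidth = 8, nDim = 5,
                            summarise = mean) {
  apply(coords, 1, function(xy) {
    # Assumption: coords holds 1-based (x, y) positions, where x indexes
    # image columns and y indexes image rows.
    col <- round(xy[1])
    row <- round(xy[2])
    # Clip the window at the image edges.
    rows <- max(1, row - halfWidth):min(nrow(pixels), row + halfWidth)
    cols <- max(1, col - halfWidth):min(ncol(pixels), col + halfWidth)
    dimmest <- head(sort(as.vector(pixels[rows, cols])), nDim)
    summarise(dimmest)
  })
}

# Synthetic 20 x 20 image: flat background of 100, one bright bead and
# one dead pixel inside the bead's window.
pixels <- matrix(100, nrow = 20, ncol = 20)
pixels[10, 10] <- 1000
pixels[3, 3] <- 0
coords <- cbind(x = 10, y = 10)

bgMean   <- localBackground(pixels, coords, summarise = mean)
bgMedian <- localBackground(pixels, coords, summarise = median)
print(c(mean = bgMean, median = bgMedian))
```

In this synthetic case the single dead pixel pulls the mean of the five dimmest pixels down to 80 while their median stays at 100. Only the background step changes; sharpening, foreground calculation and subtraction proceed as before.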
We have included some support for parallel processing in the function that sharpens the image and in the two background calculation methods. This can offer some increase in throughput when one is using a single computer to analyse a small number of samples. However, if one is dealing with a large number of arrays then there are probably more efficient ways to achieve a speedup, such as reading separate chips on multiple machines or R sessions and combining the data after they have been read.
This multicore support is implemented at the C level using the OpenMP library. Unfortunately, adding support for this generates a warning on Bioconductor, so support needs to be added manually and the package built from source. The procedure is slightly different for users on Linux and Windows machines.
Linux users should create a file called Makevars in the beadarray/src directory and add the following two lines before building the package from source.
PKG_CFLAGS=-fopenmp
PKG_LIBS=-lgomp
Windows users should create a file called Makevars.win in the beadarray/src directory and add the following two lines before building the package from source.
PKG_CFLAGS=-fopenmp
PKG_LIBS=-lgomp -mthreads -lpthreadGC2
[1] Kuhn K, Baker SC, Chudin E, Lieu MH, Oeser S, Bennett H, Rigault P, Barker D, McDaniel TK, Chee MS, A novel, high-performance random array platform for quantitative gene expression profiling, Genome Research, 2004, 14(11):2347-56.
[2] Smith ML, Dunning MJ, Tavare S, Lynch AG, Identification and correction of previously unreported spatial phenomena using raw Illumina BeadArray data, BMC Bioinformatics, 2010, 11:208.
\end{document}