Tissue diagnostics and biomarker analytics are the keystones of cancer discovery research. However, delivering on the promise of personalized medicine requires multiple data sources to be integrated and analyzed, and large volumes of tissue samples to be managed effectively.
To accomplish this, imaging and pathology informatics support will be essential to moving forward in an increasingly complex and multifaceted medical research environment.
There are several problems that a research platform needs to address if digital pathology is to prove an effective tool for drug and biomarker discovery studies.
- Shared archive: Virtual slides – high-quality images produced using a scanner – are typically several gigabytes in size. In the past, sharing these image files with colleagues has proven problematic because of their size and the lack of IT infrastructure for sharing and viewing them quickly. Sharing tissue image archives across centers encourages multisite collaboration.
- Image file formats: In research, some users need a viewer that can read many different image file formats. With a single viewer that does so, pathologists and lab managers need to be trained on only one viewing software platform.
- Multi-modality data management: Collating and organizing data from multiple data sources brings its own challenges. Clinical histories, diagnostic classes, clinico-pathological data, image analysis data, molecular profiling, genomic and epidemiological data may all be stored in different files and across various locations. Spreadsheets, CSV/TSV files, LIMS, and in-house databases and applications are incapable of managing the quantity and variety of data that needs to be captured as part of a typical study.
- Image analytics integration: Image analysis tools can quantify and qualify tissue cells and cell structures in a rapid and consistent manner. However, a range of image analysis applications are available – from commercial vendors, as well as in-house and open source solutions. Managing slides that are used in multiple applications and the data produced by their algorithms can be difficult, due to limited interoperability.
- TMA management: Tissue microarrays (TMAs) provide the means for high-throughput analysis of multiple tissues and cells. The issues involved in collating, organizing and associating data with whole sections of tissue are multiplied with a TMA, as a single TMA slide may contain 200 cores, each potentially representing a different patient. A TMA study of a single 20x10 block, with five stains, three scoring criteria per stain and two reviewing pathologists, will result in six thousand ‘scores’ (200 cores × 5 stains × 3 criteria × 2 pathologists). Mapping these scores to different patients, comparing scores and results, and identifying trends across different cohorts will be time-consuming and prone to human error. Data mining tools, detailed in the next section, are required to solve the challenges of conducting large biomarker investigative studies.
- Cross-study data management: Given the range and volume of data mentioned in the points above, and the large number of slides that may be part of a typical study, organizing slides in different ways while maintaining a link with the associated data will prove challenging. Slides may need to be included as part of several studies. Studies may also be organized in many different ways.
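
To make the scale of the TMA example concrete, the sketch below enumerates every score record that a single 20x10 block would generate with five stains, three scoring criteria per stain and two reviewing pathologists. It is only an illustration: the core naming scheme, the stain, criterion and pathologist labels, and the one-patient-per-core mapping are assumptions made for the example, not part of any particular platform.

```python
from itertools import product

# Dimensions of the single TMA block used in the example (20 x 10 = 200 cores).
ROWS, COLS = 20, 10

# Placeholder labels (hypothetical): a real study would use actual stain names,
# scoring criteria and reviewer identifiers.
STAINS = [f"stain_{i}" for i in range(1, 6)]            # five stains
CRITERIA = [f"criterion_{i}" for i in range(1, 4)]      # three criteria per stain
PATHOLOGISTS = ["pathologist_A", "pathologist_B"]       # two reviewers


def core_ids(rows, cols):
    """Return one identifier per core position, e.g. 'R01C01' (assumed naming)."""
    return [f"R{r:02d}C{c:02d}" for r in range(1, rows + 1) for c in range(1, cols + 1)]


def score_keys(cores, stains, criteria, reviewers):
    """Return every (core, stain, criterion, reviewer) combination needing a score."""
    return list(product(cores, stains, criteria, reviewers))


cores = core_ids(ROWS, COLS)

# Assumption for illustration: each core comes from a different patient.
core_to_patient = {core: f"patient_{i:03d}" for i, core in enumerate(cores, start=1)}

keys = score_keys(cores, STAINS, CRITERIA, PATHOLOGISTS)

print(len(cores))  # 200 cores on the block
print(len(keys))   # 6000 individual scores to record and map back to patients
```

Each of the 6,000 keys must still be linked back to a patient through a mapping such as `core_to_patient`, which is the step that becomes error-prone when it is done by hand.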