Time-resolved biomarker discovery in 1H-NMR data using generalized fuzzy Hough transform alignment and parallel factor analysis.
This work addresses time-series analysis of comprehensive 1H-NMR data of biological origin. One problem in toxicological and efficacy studies is the confounding correlation between the administered drug, its metabolites, and the systemic changes in molecular dynamics; that is, the flux of drug-related molecules correlates with that of the molecules involved in system regulation. This correlation complicates biomarker mining, since the confounding must be untangled to separate true biomarker molecules from dose-related molecules. One way to achieve this is pharmacokinetic analysis. Differences in the pharmacokinetic time profiles of different molecules can help elucidate the origin of the dynamics, and this is possible whether or not the identity of the molecule is known. This mode of analysis is the basis for metabonomic studies of toxicology and efficacy. A major problem in the analysis of 1H-NMR data from metabonomic studies is peak positional variation and peak overlap. These phenomena introduce unwanted variance that obscures the true information content, and they are hard to avoid. Here, we show that combining generalized fuzzy Hough transform spectral alignment, variable selection, and parallel factor analysis solves both the alignment problem and the confounding problem stated above. Using the outlined method, several distinct temporal concentration profiles can be resolved, and the majority of the studied molecules and their respective fluxes can be attributed to these resolved kinetic profiles. The resolved time profiles thereby simplify finding true biomarkers and bio-patterns for early detection of biological conditions, and they provide more detailed information about the studied biological system.
The presented method represents a significant step forward in time-series analysis of biological 1H-NMR data: it provides almost full automation of the whole data analysis process and can analyze over 800 unique features per sample. The method is demonstrated on a 1H-NMR rat urine dataset from a toxicology study and compared with a classical approach, correlation optimized warping (COW) alignment followed by bucketing.
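To make the parallel factor analysis step concrete, the following is a minimal sketch of a trilinear (PARAFAC) decomposition fitted by alternating least squares. It is not the implementation used in this work. The tensor layout (animals × time points × aligned spectral features), the function names `khatri_rao` and `parafac_als`, and the random initialization are all assumptions made for illustration. Under that layout, each column of the time-mode factor `B` would play the role of one resolved kinetic profile.

```python
import numpy as np


def khatri_rao(U, V):
    """Column-wise Kronecker product; row index runs over (row of U, row of V)."""
    rank = U.shape[1]
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, rank)


def parafac_als(X, rank, n_iter=200, seed=0):
    """Fit X[i, j, k] ~ sum_r A[i, r] * B[j, r] * C[k, r] by alternating least squares.

    Assumed (hypothetical) layout: i = animal, j = time point, k = spectral feature.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))

    # Unfold the tensor along each mode once, matching the khatri_rao row order.
    X0 = X.reshape(I, J * K)                      # rows i, columns (j, k)
    X1 = np.moveaxis(X, 1, 0).reshape(J, I * K)   # rows j, columns (i, k)
    X2 = np.moveaxis(X, 2, 0).reshape(K, I * J)   # rows k, columns (i, j)

    for _ in range(n_iter):
        # Update one factor at a time by least squares, holding the other two fixed.
        A = X0 @ np.linalg.pinv(khatri_rao(B, C)).T
        B = X1 @ np.linalg.pinv(khatri_rao(A, C)).T
        C = X2 @ np.linalg.pinv(khatri_rao(A, B)).T
    return A, B, C
```

In this sketch, `A` holds per-animal scores, `B` the time profiles, and `C` the spectral loadings that indicate which features follow each profile. A practical analysis would typically add a convergence check, non-negativity constraints, and a procedure for choosing `rank`, none of which are shown here.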