NICHOLAS J. MATIASZ /mə-tee-is/

NICHOLASMATIASZ@GMAIL.COM

Professional Summary

  • Engineering generalist with formal training in both informatics and electrical engineering. Ten years of experience in data analysis, data management, and web development. Doctoral-level skills in communication, causal reasoning, and system modeling.

Education

UCLA 2013–18

Tufts University 2006–12

Experience

UCLA 2013–present

Graduate student researcher — Medical Informatics

Massachusetts General Hospital 2012–13

Research technician — Neurology

  • Developed computing infrastructure and data analysis software
  • Designed and built electrode instrumentation for neuroscience research
  • Managed procurement of over $0.5M in scientific equipment

Tufts University 2009–12

Research assistant — Electrical Engineering, Computer Science, & Psychology

Cavanaugh Tocci Associates Summer 2010

Acoustical consulting intern — Sound System Group

  • Created functional block diagrams for architectural drawings
  • Obtained on-site acoustical measurements and analyzed data

Allied Restoration Corp. Summers 2007–09

AutoCAD intern — Project Management Office

  • Designed field drawings and architectural details for roofing projects using AutoCAD
  • Revised construction documents in accordance with updated specifications
  • Organized and maintained project files and paperwork

Skills

Data

  • data analysis, data visualization, causal modeling

Communication

Software

Laboratory

  • EEG, MRI, machine tools, microscopy

Publications

  • N. J. Matiasz*, J. Wood*, W. Wang, A. J. Silva, W. Hsu (2018). Experiment selection in meta-analytic piecemeal causal discovery. In preparation.
  • N. J. Matiasz, A. J. Silva (2018). Quantifying the convergence of evidence. In submission.
  • J. Wood, N. J. Matiasz, A. J. Silva, A. Abyzov, W. Wang (2018). OpBerg: Discovering causal sentences using optimal alignments. In submission.
  • N. J. Matiasz (2018). Planning Experiments with Causal Graphs (Ph.D. thesis). UCLA. [abstract] [html] [pdf]
  • Scientists aim to design experiments and analyze evidence to obtain maximum knowledge. Although scientists have many statistical methods to guide how they analyze evidence, they have relatively few methods to quantify the convergence of evidence, to explore the full range of consistent causal explanations, and to design subsequent experiments on the basis of such analyses. The goal of this research is to establish tools that use graphical models to perform causal reasoning and experiment planning. This dissertation presents and evaluates methods that allow scientists (1) to quantify both the convergence and consistency of evidence, (2) to identify every causal structure that is consistent with evidence reported in literature, and (3) to design experiments that can efficiently reduce the number of viable causal structures. This suite of methods is demonstrated with real examples drawn from neuroscience literature.

    This dissertation shows how scientific results can be merged to yield new inferences by determining whether the results are consistent with various causal structures. Also presented is a Bayesian model of scientific consensus building, based on the principles of convergence and consistency. Together, these approaches form the basis of a mathematical framework that complements statistics: quantitative formalisms can be used not only to demonstrate each result’s significance but also to justify each experiment’s design.

  • N. J. Matiasz*, J. Wood*, P. Doshi*, W. Speier, B. Beckemeyer, W. Wang, W. Hsu, A. J. Silva (2018). ResearchMaps.org for integrating and planning research. In PLOS One 13:e0195271. [abstract] [code] [html] [pdf]
  • To plan experiments, a biologist needs to evaluate a growing set of empirical findings and hypothetical assertions from diverse fields that use increasingly complex techniques. To address this problem, we operationalized principles (e.g., convergence and consistency) that biologists use to test causal relations and evaluate experimental evidence. With the framework we derived, we then created a free, open-source web application that allows biologists to create research maps, graph-based representations of empirical evidence and hypothetical assertions found in research articles, reviews, and other sources. With our ResearchMaps web application, biologists can systematically reason through the research that is most important to them, as well as evaluate and plan experiments with a breadth and precision that are unlikely without such a tool.

  • J. I. Garcia-Gathright, N. J. Matiasz, C. Adame, K. V. Sarma, L. Sauer, N. F. Smedley, M. L. Spiegel, J. Strunck, E. B. Garon, R. K. Taira, D. R. Aberle, A. A. T. Bui (2018). Evaluating Casama: Contextualized semantic maps for summarization of lung cancer studies. In Computers in Biology and Medicine 92:55–63. [abstract] [html] [pdf]
  • Objective

    It is crucial for clinicians to stay up to date on current literature in order to apply recent evidence to clinical decision making. Automatic summarization systems can help clinicians quickly view an aggregated summary of literature on a topic. Casama, a representation and summarization system based on “contextualized semantic maps,” captures the findings of biomedical studies as well as the contexts associated with patient population and study design. This paper presents a user-oriented evaluation of Casama in comparison to a context-free representation, SemRep.

    Materials and Methods

    The effectiveness of the representation was evaluated by presenting users with manually annotated Casama and SemRep summaries of ten articles on driver mutations in cancer. Automatic annotations were evaluated on a collection of articles on EGFR mutation in lung cancer. Seven users completed a questionnaire rating the summarization quality for various topics and applications.

    Results

    Casama had higher median scores than SemRep for the majority of the topics (p ≤ 0.00032), all of the applications (p ≤ 0.00089), and in overall summarization quality (p ≤ 1.5e−05). Casama’s manual annotations outperformed Casama’s automatic annotations (p = 0.00061).

    Discussion

    Casama performed particularly well in the representation of strength of evidence, which was highly rated both quantitatively and qualitatively. Improvement of Casama’s automatic annotation methods is a first priority for future work.

    Conclusion

    This evaluation demonstrated the benefits of a contextualized representation for summarizing biomedical literature on cancer. Iteration on specific areas of Casama’s representation, further development of its algorithms, and a clinically oriented evaluation are warranted.

  • N. J. Matiasz, J. Wood, W. Wang, A. J. Silva, W. Hsu (2017). Translating literature into causal graphs: Toward automated experiment selection. In Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM). [abstract] [html] [pdf]
  • Biologists synthesize research articles into coherent models—ideally, causal models, which predict how systems will respond to interventions. But it is challenging to derive causal models from articles alone, without primary data. To enable causal discovery using only literature, we built software for annotating empirical results in free text and computing valid explanations, expressed as causal graphs. This paper presents our meta-analytic pipeline: with the “research map” schema, we annotate results in literature, which we convert into logical constraints on causal structure; with these constraints, we find consistent causal graphs using a state-of-the-art, causal discovery algorithm based on answer set programming. Because these causal graphs show which relations are underdetermined, biologists can use this pipeline to select their next experiment. To demonstrate this approach, we annotated neuroscience articles and applied a “degrees-of-freedom” analysis for concisely visualizing features of the causal graphs that remain consistent with the evidence—a model space that is often too large for a machine to compute quickly, or for a researcher to examine exhaustively.
  • N. J. Matiasz, J. Wood, W. Wang, A. J. Silva, W. Hsu (2017). Computer-aided experiment planning toward causal discovery in neuroscience. In Frontiers in Neuroinformatics 11:12. [abstract] [html] [pdf]
  • Computers help neuroscientists to analyze experimental results by automating the application of statistics; however, computer-aided experiment planning is far less common, due to a lack of similar quantitative formalisms for systematically assessing evidence and uncertainty. While ontologies and other Semantic Web resources help neuroscientists to assimilate required domain knowledge, experiment planning requires not only ontological but also epistemological (e.g., methodological) information regarding how knowledge was obtained. Here, we outline how epistemological principles and graphical representations of causality can be used to formalize experiment planning toward causal discovery. We outline two complementary approaches to experiment planning: one that quantifies evidence per the principles of convergence and consistency, and another that quantifies uncertainty using logical representations of constraints on causal structure. These approaches operationalize experiment planning as the search for an experiment that either maximizes evidence or minimizes uncertainty. Despite work in laboratory automation, humans must still plan experiments and will likely continue to do so for some time. There is thus a great need for experiment-planning frameworks that are not only amenable to machine computation but also useful as aids in human reasoning.
  • J. I. Garcia-Gathright, N. J. Matiasz, E. B. Garon, D. R. Aberle, R. K. Taira, A. A. T. Bui (2016). Toward patient-tailored summarization of lung cancer literature. In Proceedings of the IEEE International Conference on Biomedical and Health Informatics (IEEE BHI). [abstract] [html] [pdf]
  • As the volume of biomedical literature increases, it can be challenging for clinicians to stay up to date. Graphical summarization systems help by condensing knowledge into networks of entities and relations. However, existing systems present relations out of context, ignoring key details such as study population. To better support precision medicine, summarization systems should include such information to contextualize and tailor results to individual patients.

    This paper introduces “contextualized semantic maps” for patient-tailored graphical summarization of published literature. These efforts are demonstrated in the domain of driver mutations in non-small cell lung cancer (NSCLC). A representation for relations and study population context in NSCLC was developed. An annotated gold standard for this representation was created from a set of 135 abstracts; F1-score annotator agreement was 0.78 for context and 0.68 for relations. Visualizing the contextualized relations demonstrated that context facilitates the discovery of key findings that are relevant to patient-oriented queries.

Conference Posters

  • N. J. Matiasz, W. Chen, A. J. Silva, W. Hsu (2016). MedicineMaps: a tool for mapping and linking evidence from experimental and clinical trial literature. Presented at 40th Annual Symposium of the American Medical Informatics Association (AMIA). [abstract] [demo] [pdf]
  • Introduction

    A significant barrier in the translation of clinical research has been the inability to effectively explore the large information space of published experiments to catalyze hypothesis generation and validation studies. Researchers need to become familiar with a body of research of increasing size and complexity: in 2013 alone, over 700 papers related to neurofibromatosis type 1 (NF1) were indexed on PubMed. Achieving an integrated understanding of a disease requires a shareable, machine-readable representation that not only captures high-level (causal) relationships but also incorporates relevant supporting evidence about the studies. Having a formal way to represent causal connections using semi-automated graphical and interactive tools would provide the research community with a map of what is known, unknown, and disputed, thus facilitating experiment planning.

    We present an open-source web application called MedicineMaps that facilitates clinical translational research by systematically and collaboratively capturing results of experimental studies and clinical trials reported in literature in a shareable, machine-readable way. We build upon the concept of research maps [1–3] to formalize causal relations based on a taxonomy of clinical experiments and rules for integrating evidence from multiple studies.

    Materials and methods

    At its highest level of abstraction, our representation includes three main entities: INTERVENTION (a treatment administered to a patient suffering from a condition), OUTCOME (an operationalized measure to assess a patient’s response to the intervention), and RELATION (a directed relationship between an intervention and an outcome that describes how the outcome is expected to change in response to the intervention). Details of each experiment are captured as properties of each relation, and a SCORE provides a heuristic measure to convey the amount of evidence represented. Each score is based on the convergence and consistency of the relevant experiments.
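    The three entities and the score described above can be sketched as a minimal data model; the class and field names below are illustrative assumptions for exposition, not MedicineMaps’ actual API.

    ```python
    from dataclasses import dataclass, field

    # Hypothetical sketch of the representation described above:
    # an INTERVENTION, an OUTCOME, and a directed RELATION between
    # them that carries per-experiment properties and a SCORE.

    @dataclass
    class Intervention:
        name: str            # treatment administered to the patient

    @dataclass
    class Outcome:
        name: str            # operationalized measure of patient response

    @dataclass
    class Relation:
        source: Intervention
        target: Outcome
        direction: str       # expected change of the outcome, e.g. "decrease"
        properties: dict = field(default_factory=dict)  # details of each experiment
        score: float = 0.0   # heuristic evidence measure in [0, 1]

    r = Relation(Intervention("drug A"), Outcome("blood pressure"), "decrease")
    print(r.score)  # 0.0 until evidence is accumulated
    ```

    A graph database such as Neo4j then stores these objects directly: interventions and outcomes as nodes, relations as edges whose properties hold the experimental context.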

    Results

    MedicineMaps is implemented as a web application that assists target users (i.e., clinicians and researchers) with extracting and mapping information from papers to generate medicine maps. Functionality of this interface includes: (i) structured forms for entering details about the study; and (ii) visualization of semantic predications as a collection of nodes and edges. During the annotation of individual papers, experiments with specific intervention–outcome pairs can be entered using a structured form that includes a variety of relevant fields to capture each study’s context. MedicineMaps is developed with Node.js, a JavaScript-based runtime environment for creating web applications. Given that the information is encoded as graphs, we use Neo4j 2.2.1, a NoSQL graph database. The database is queried using the Cypher Query Language (CQL) of Neo4j. Medicine maps are visualized using Cytoscape.js.

    References

    1. Landreth A, Silva AJ. The need for research maps to navigate published work and inform experiment planning. Neuron. 2013;79(3):411–5. [pdf]
    2. Silva AJ, Landreth A, Bickle J. Engineering the next revolution in neuroscience: The new science of experiment planning. Oxford: Oxford University Press; 2014. [html]
    3. Silva AJ, Müller KR. The need for novel informatics tools for integrating and planning research in molecular and cellular cognition. Learning & Memory. 2015;22(9):494–8. [pdf]
  • N. J. Matiasz, J. Wood, W. Hsu, A. J. Silva (2016). ResearchMaps.org: A free web app for integrating and planning experiments. Presented at 15th Annual Molecular and Cellular Cognition Society (MCCS) Symposium. [abstract] [demo] [pdf]
  • Despite the growing abundance of structured and free-text information (e.g., PubMed), there is a great need for systems that help scientists integrate the large volume of relevant data and information required for experiment planning. Specifically, scientists need quantitative approaches that can be used to objectively assess the value of potential experimental designs, analogous to statistical methods used to objectively assess the significance of experimental results.

    To address this need, we developed ResearchMaps.org, a free web application that allows scientists to represent empirical results and hypothetical assertions in a quantitative, graphical form. The app uses a Bayesian learning framework to operationalize strategies that scientists routinely use to integrate findings and plan experiments, including convergence and consistency. By visualizing experimental results and quantifying experimental evidence, ResearchMaps allows scientists to formally convey the rationale behind their experiment-planning decisions. Computer-aided experiment planning systems and other “meta-scientific” tools like ResearchMaps will likely become increasingly common; such tools will enable scientists to use quantitative methods not only to establish the significance of their findings but also to justify the selection of the experiments themselves.

  • N. J. Matiasz, A. J. Silva, W. Hsu (2015). Synthesizing clinical trials for evidence-based medicine: A representation of empirical and hypothetical causal relations. Presented at 6th Annual Joint Summits on Translational Science: AMIA Summit on Clinical Research Informatics. [abstract] [pdf]
  • In medicine, clinical trials are the de facto method for measuring causal relations. A longstanding challenge is synthesizing this causal information for evidence-based medicine. We have developed a free web application, ResearchMaps, which uses a machine-interpretable format to record both empirical and hypothetical causal statements reported in clinical trial literature. Translational researchers can use our software’s graph-based interface to aggregate and interpret results from multiple clinical trials.

    In the clinical trial literature, a lot of knowledge remains unused simply because it exists as unstructured free-text. ResearchMaps gives machine-readable structure to this causal information. Our software is a step toward the automation of experiment planning and hypothesis generation. Grounding these processes in formal logic and automating them with software has the potential to revolutionize the scientific method.

    This poster presents:

    • how researchers can record and visualize causal relations
    • annotation guidelines for extracting causal information
    • preliminary annotation results for lung cancer literature
  • N. J. Matiasz, W. Hsu, A. J. Silva (2014). ResearchMaps.org, a free web application to track causal information in biology. Presented at 13th Annual Molecular and Cellular Cognition Society (MCCS) Symposium. [abstract] [demo] [pdf]
  • Although causal assertions are the very fabric of neuroscience, including molecular and cellular cognition (MCC), there are currently no tools to help researchers keep track of the increasingly complex network of causal connections derived from published findings. To address this growing problem, we developed a free web application (ResearchMaps) to build interactive maps of causal information derived from research papers. ResearchMaps is a collection of intertwined networks where the identity and properties of biological phenomena (the nodes in the networks) are linked by weighted causal connections (the edges in the networks). These edges represent one of three possible types of causal connections between two phenomena: 1) excitatory, 2) inhibitory, or 3) no-connection. A score (from 0 to 1) assigned to each edge gives users a measure of the strength and consistency of the evidence represented by each connection between phenomena. Additionally, symbols inform users of the types of experiments represented in each edge. Although there are tens of millions of experiments testing causal connections in biology, they fall into a small number of classes. For example, molecular and cellular biologists commonly use at least four major types of experiments to test a possible causal connection between two entities (A and B): 1) Positive Manipulations, where A’s levels or activity are increased; 2) Negative Manipulations, where A’s levels or activity are decreased; 3) Non-Interventions, whose goal is to track how A co-varies with B; and 4) Mediation experiments, designed to determine whether C is part of the mechanism by which A contributes to B. In ResearchMaps, convergence and consistency among results increase the score assigned to each edge, while contradictions have the opposite effect.
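  The edge-scoring idea above can be sketched as a toy function; this is an illustrative assumption for exposition, not ResearchMaps’ actual Bayesian computation. Consistency rewards agreement among results, and convergence rewards evidence from distinct experiment types (of the four classes named above).

  ```python
  # Toy illustration of a convergence/consistency edge score in [0, 1].
  # Each result is a (experiment_type, agrees_with_edge) pair; the four
  # experiment types are positive, negative, non-intervention, mediation.

  def edge_score(results):
      """Combine consistency (fraction of agreeing results) with
      convergence (fraction of the 4 experiment types that agree)."""
      if not results:
          return 0.0
      agreeing = [exp_type for exp_type, agrees in results if agrees]
      consistency = len(agreeing) / len(results)
      convergence = len(set(agreeing)) / 4
      return consistency * convergence

  score = edge_score([("positive", True), ("negative", True), ("mediation", False)])
  print(round(score, 2))  # → 0.33: two agreeing types, one contradiction
  ```

  As in ResearchMaps, convergent and consistent results raise the score, while contradictions lower it; the exact weighting here is only a placeholder.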

Awards & Funding

Presentations

  • Translating literature into causal graphs: Toward automated experiment selection (16 November 2017). IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM). [abstract] [slides]
  • Biologists synthesize research articles into coherent models—ideally, causal models, which predict how systems will respond to interventions. But it is challenging to derive causal models from articles alone, without primary data. To enable causal discovery using only literature, we built software for annotating empirical results in free text and computing valid explanations, expressed as causal graphs. This paper presents our meta-analytic pipeline: with the “research map” schema, we annotate results in literature, which we convert into logical constraints on causal structure; with these constraints, we find consistent causal graphs using a state-of-the-art, causal discovery algorithm based on answer set programming. Because these causal graphs show which relations are underdetermined, biologists can use this pipeline to select their next experiment. To demonstrate this approach, we annotated neuroscience articles and applied a “degrees-of-freedom” analysis for concisely visualizing features of the causal graphs that remain consistent with the evidence—a model space that is often too large for a machine to compute quickly, or for a researcher to examine exhaustively.
  • ResearchMaps.org for integrating evidence (27 October 2017). UCLA ICLM Young Investigator Lecture Series. [abstract]
  • ResearchMaps.org is a free web app that helps scientists to plan their next experiment. Users input empirical results and hypotheses from literature; the app visualizes this information in a graphical summary known as a “research map.” In this graph-based representation, each node identifies a biological phenomenon; each directed edge between nodes shows the kinds of relations that were either hypothesized by researchers or supported by empirical results. The empirical evidence for each edge is assigned a confidence score using a Bayesian technique for evidence synthesis. This score quantifies both the convergence and consistency of the evidence, helping the user to identify which next experiments may be most useful. Every empirical edge in a research map is linked to the literature that it references, so users can access additional details of the annotated literature. In ongoing work, we aim to automate two time-consuming tasks: (1) the extraction of empirical evidence from the literature, and (2) the derivation of hypotheses that may be untested yet logically consistent with what is known.
  • Building the brain of a robot scientist (4 May 2017). UCLA Career Development Conference.
  • ResearchMaps.org: Planning experiments by quantifying and visualizing empirical evidence and hypothetical assertions (28 April 2017). 2nd QCBio Symposium: Exploring the Frontiers of Biomedical Big Data. [abstract] [html] [video & transcript]
  • While many “big data” tools process the data that experiments yield, ResearchMaps helps scientists to plan their next experiment by navigating a different kind of “big data”: the enormous space of causal information and experimental designs that one might pursue. Users manually enter the empirical results and hypothetical assertions from research articles, which our interface visualizes in a graphical summary known as a “research map.” Nodes in the research map identify the phenomena that were studied; edges between nodes show the kinds of relations that were either supported by experiments or hypothesized by researchers. We apply a Bayesian approach to quantify both the convergence and consistency of the empirical evidence, helping the user to identify which new experiments may prove most instructive. In the graphical structure of a research map, every edge is linked to the article(s) it references, allowing the user to retrieve additional details of the annotated literature. In ongoing work, we are exploring how to automatically extract causal relations from literature and how to use causal graphs to automatically generate causal hypotheses that are untested yet consistent with the available evidence.
  • Building the brain of a robot scientist (25 April 2017). UCLA Grad Slam Finals. [video & transcript]