
dc.contributor.author: Slater, Karin
dc.contributor.author: Williams, John A
dc.contributor.author: Karwath, Andreas
dc.contributor.author: Fanning, Hilary
dc.contributor.author: Ball, Simon
dc.contributor.author: Schofield, Paul N
dc.contributor.author: Hoehndorf, Robert
dc.contributor.author: Gkoutos, Georgios V
dc.date.accessioned: 2024-09-06T15:13:21Z
dc.date.available: 2024-09-06T15:13:21Z
dc.date.issued: 2021-09-27
dc.identifier.citation: Slater K, Williams JA, Karwath A, Fanning H, Ball S, Schofield PN, Hoehndorf R, Gkoutos GV. Multi-faceted semantic clustering with text-derived phenotypes. Comput Biol Med. 2021 Nov;138:104904. doi: 10.1016/j.compbiomed.2021.104904. Epub 2021 Sep 27. [en_US]
dc.identifier.issn: 0010-4825
dc.identifier.eissn: 1879-0534
dc.identifier.doi: 10.1016/j.compbiomed.2021.104904
dc.identifier.pmid: 34600327
dc.identifier.uri: http://hdl.handle.net/20.500.14200/5669
dc.description.abstract: Identification of ontology concepts in clinical narrative text enables the creation of phenotype profiles that can be associated with clinical entities, such as patients or drugs. Constructing patient phenotype profiles using formal ontologies enables their analysis via semantic similarity, in turn enabling the use of background knowledge in clustering or classification analyses. However, traditional semantic similarity approaches collapse complex relationships between patient phenotypes into a unitary similarity score for each pair of patients. Moreover, single scores may be based only on matching terms with the greatest information content (IC), ignoring other dimensions of patient similarity. This process necessarily leads to a loss of information in the resulting representation of patient similarity, and is especially apparent when using very large text-derived and highly multi-morbid phenotype profiles. Moreover, it renders finding a biological explanation for similarity very difficult: the black box problem. In this article, we explore the generation of multiple semantic similarity scores for patients based on different facets of their phenotypic manifestation, which we define through different sub-graphs in the Human Phenotype Ontology. We further present a new methodology for deriving sets of qualitative class descriptions for groups of entities described by ontology terms. Leveraging this strategy to obtain meaningful explanations for our semantic clusters alongside other evaluation techniques, we show that semantic clustering with ontology-derived facets enables the representation, and thus identification, of clinically relevant phenotype relationships not easily recoverable using overall clustering alone. In this way, we demonstrate the potential of faceted semantic clustering for gaining a deeper and more nuanced understanding of text-derived patient phenotypes. [en_US]
dc.language.iso: en [en_US]
dc.publisher: Elsevier [en_US]
dc.relation.url: http://www.sciencedirect.com/science/journal/00104825 [en_US]
dc.rights: Copyright © 2021 The Author(s). Published by Elsevier Ltd. All rights reserved.
dc.subject: Human physiology [en_US]
dc.subject: Genetics [en_US]
dc.title: Multi-faceted semantic clustering with text-derived phenotypes. [en_US]
dc.type: Article [en_US]
dc.source.journaltitle: Computers in Biology and Medicine [en_US]
dc.source.volume: 138
dc.source.beginpage: 104904
dc.source.endpage:
dc.source.country: United Kingdom
dc.source.country: United States
rioxxterms.version: NA [en_US]
dc.contributor.trustauthor: Karwath, Andreas
dc.contributor.trustauthor: Fanning, Hilary
dc.contributor.trustauthor: Ball, Simon
dc.contributor.department: Research and Development [en_US]
dc.contributor.department: Nephrology [en_US]
dc.contributor.role: Medical and Dental [en_US]
oa.grant.openaccess: na [en_US]

