Cliff Joslyn -- Research Works

Updated November, 2023

In my career as a research leader for the US Government, I have pursued a range of research in mathematical modeling of complex information systems, data science, and cybernetic philosophy; with applications in reliability analysis, computational biology, information warfare, cyber analytics, infrastructure protection, law enforcement, and distributed ledger technology. Below I detail my research areas, including all published papers.

Computational Topology and Hypergraph Analytics
Knowledge-Informed Machine Learning and Neurosymbolic Computing
Applied Lattice Theory, Formal Concept Analysis, and Interval Computation
Generalized Information Theory, Uncertainty Quantification, and Possibilistic Information Theory
Computational and Theoretical Biology, Bio-Ontologies, and Ontological Protein Function Annotation
Semantic Technology, Ontology Metrics, and Knowledge Representation
Relational Data Modeling and Discrete Systems
Cybernetic Philosophy, Computational Semiotics, and Evolutionary Systems Theory
Distributed Ledger Technology: Blockchain, Cryptocurrency, and Smart Contracts
High Performance and Semantic Graph Database Analytics
Agent-Based and Discrete Event Modeling
Cyber Analytics
Book Reviews and Other Works
Decision Support and Reliability Analysis
Principia Cybernetica: Distributed Development of a Cybernetic Philosophy
HyperNetX (HNX): Hypergraph Analytics for Python

Research Areas

Complex systems characteristically produce high dimensional data, which admit a variety of mathematical interpretations. Hypergraphs are natural high dimensional generalizations of graphs which can capture high order interactions amongst entities. Our Python package HyperNetX supports "hypernetwork science" extensions to network science methods like centrality, connectivity, and clustering. As finite set systems, hypergraphs are closely related to other structures like abstract simplicial complexes, all of which are interpretable as finite orders and finite topological spaces. Such topological objects admit homological analysis to identify overall shape and structure, the most prominent method for these goals being the persistent homology of Topological Data Analysis (TDA). When topological structures such as hypergraphs are equipped with data (either simple weights or more complex data types) and a logic of how data interact, then topological sheaves can model constraints across dimensional levels, facilitating canonical and provably necessary methods for heterogeneous information integration, and for assessing the global consistency of information sources from local interactions.
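As a minimal sketch of the set-system view of hypergraphs described above, the following pure-Python example groups hyperedges into components whose members are linked by overlaps of at least s vertices, one common way of extending graph connectivity to hypergraphs. The toy data and the function name `s_components` are illustrative assumptions; this does not use or reproduce the HyperNetX API.

```python
from itertools import combinations

# Hypothetical toy hypergraph: edge name -> set of vertices it contains.
edges = {
    "e1": {"a", "b", "c"},
    "e2": {"b", "c", "d"},
    "e3": {"d", "e"},
    "e4": {"f", "g"},
}

def s_components(edges, s=1):
    """Group hyperedges into components linked by overlaps of size >= s."""
    # Union-find parent pointers, one per hyperedge.
    parent = {e: e for e in edges}

    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]  # path halving
            e = parent[e]
        return e

    # Merge every pair of hyperedges sharing at least s vertices.
    for e, f in combinations(edges, 2):
        if len(edges[e] & edges[f]) >= s:
            parent[find(e)] = find(f)

    groups = {}
    for e in edges:
        groups.setdefault(find(e), set()).add(e)
    return sorted(sorted(g) for g in groups.values())

components_1 = s_components(edges, s=1)
components_2 = s_components(edges, s=2)
```

Raising s from 1 to 2 separates e3 from e1 and e2, since e3 shares only a single vertex with e2; an ordinary graph, where every edge has two vertices, cannot express this distinction between weak and strong overlap.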

Hierarchy is an inherent systems principle and concept, and is necessarily endemic in complex systems of all types. Mathematically, hierarchy is modeled by partial orders and lattices, so order theory generally is a central concern for complex systems. There are deep relations between orders, semantic information, and relational data structures, as reflected in semantic hierarchies and formal concept analysis. As hierarchies are systems admitting descriptions in terms of levels, interval representations are also a deeply related concept. Finally, in a finite context appropriate for data science, hypergraphs benefit greatly from being represented as set systems ordered by inclusion, and perhaps even more importantly, orders and topological spaces are equivalent, so that computational topology in particular is deeply wedded to lattice theory.
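To illustrate how formal concept analysis derives a lattice from relational data, here is a small sketch that enumerates the formal concepts (extent and intent pairs) of a toy object-attribute context. The context, the helper names, and the brute-force enumeration over all attribute subsets are assumptions chosen for brevity; the enumeration is exponential in the number of attributes and is only suitable for tiny examples.

```python
from itertools import combinations

# Hypothetical formal context: object -> set of attributes it has.
context = {
    "duck":  {"flies", "swims"},
    "eagle": {"flies", "hunts"},
    "shark": {"swims", "hunts"},
}
attributes = set().union(*context.values())

def extent(attrs):
    """Objects having every attribute in attrs."""
    return frozenset(o for o, a in context.items() if attrs <= a)

def intent(objs):
    """Attributes shared by every object in objs."""
    shared = set(attributes)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)

def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every attribute subset."""
    concepts = set()
    for r in range(len(attributes) + 1):
        for attrs in combinations(sorted(attributes), r):
            ext = extent(set(attrs))
            concepts.add((ext, intent(ext)))
    return concepts

concepts = formal_concepts()
```

Ordering the resulting concepts by inclusion of their extents gives the concept lattice of the context; for this particular toy context it has eight elements.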

Any effort to model complex information systems must depend on a solid understanding of the mathematical foundations of information, and its conceptual sibling uncertainty. While for decades Shannon and Weaver's statistical entropy has provided the classical grounding of information theory, a broader range of mathematical methods is also available to provide needed generalization and richness. Careful relaxation of key axioms provides a range of non-monotonic measures representing a diverse collection of uncertainty semantics, extending beyond probability, randomness, and likelihood to include belief, possibility, precision, necessity, vagueness, nonspecificity, and plausibility. Sub-fields like monotone measures, Dempster-Shafer evidence theory, and fuzzy systems are the mathematical grounds for these approaches. My thesis established possibilistic systems theory as a generalization of stochastic methods based on empirical random sets, grounding possibilistic systems in measured random intervals. Applications include decision support systems and the simulations of large engineering and infrastructure systems.
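As a hedged illustration of how Dempster-Shafer evidence theory generalizes probability, the sketch below computes belief and plausibility from a mass assignment on subsets of a frame. The frame, the mass values, and the function names are hypothetical. The focal sets here are deliberately nested (consonant), the special case in which plausibility behaves as a possibility measure.

```python
# Hypothetical mass assignment over the frame {"a", "b", "c"}; masses sum to 1.
mass = {
    frozenset({"a"}): 0.5,
    frozenset({"a", "b"}): 0.3,
    frozenset({"a", "b", "c"}): 0.2,
}

def belief(event):
    """Total mass of focal sets wholly contained in the event."""
    return sum(m for focal, m in mass.items() if focal <= event)

def plausibility(event):
    """Total mass of focal sets that intersect the event."""
    return sum(m for focal, m in mass.items() if focal & event)

event = frozenset({"a", "b"})
bel_ab = belief(event)        # lower bound on support for the event
pl_ab = plausibility(event)   # upper bound on support for the event
```

Belief never exceeds plausibility, and the gap between them reflects the nonspecificity of the evidence. Because the focal sets are nested, the plausibility of a union equals the maximum of the plausibilities of its parts, which is the defining property of a possibility measure.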

"Information" in the sense of information theory is quantitative. While critical, this aspect is not concerned with semantics, or what the information can mean, or how it can be interpreted. Understanding semantic or symbolic information is a philosophically challenging consideration, closely related to computational linguistics and semiotics. Semantic technology has arisen in the context of Artificial Intelligence to formally represent levels of meaning and reference in systems. Semantic processing, in this sense, requires meta-level coding of information tokens in terms of their semantic types, as formalized in semantic hierarchies or ontologies. Instance statements are predicates in the language of these semantic types, and are typically represented in ontology-labeled semantic graphs. Formal ontologies are complex objects which greatly benefit from mathematical analysis and formal representation. My work on approaches for modeling ontologies as lattices, and semantic graphs as ontologically-labeled directed hypergraphs, has proven very valuable in their management and analysis, for tasks like ontology alignment and ontological annotation of information sources.

When considering the range of complex information systems, biological systems stand out as the paradigm and the epitome. Organisms encompass a vast collection of numbers and types of very specifically interacting entities, supporting energetic regulation and control across a range of hierarchically organized levels, and encompassing the first appearance of information processing in evolutionary history, manifesting self-replication and the open-ended evolution of increasing complexity. The genomic revolution of the early 21st century then provided the basis for the computational analysis and modeling of this complexity, interacting with semantic technologies through the bio-ontology and systems biology movements. My work here first focused on the mathematical analysis of bio-ontologies, especially for automated ontological protein function prediction, and then more recently on hypergraph analytics for multi-omic studies.

Systems theory, or systems science, is the transdisciplinary study of the abstract organization of phenomena, independent of their substance, type, or spatial or temporal scale of existence. As a post-war intellectual movement, it coupled closely with cybernetics as the science of control and communication in systems of all types. Together, systems science and cybernetics provided the first "science of complexity", and laid the groundwork for a range of future developments in computer science and mathematics, including complex adaptive systems, evolutionary systems, artificial intelligence, and artificial life. As fundamentally concerned with the nature of information in systems, cybernetics is thus also deeply involved with all aspects of semiotics as the science of signs and symbols and their interpretation, especially in domains beyond the cultural level. This includes biological semiotics as the foundation of biology, and computational semiotics as the foundation for AI. As a lifelong student of systems and cybernetics, and especially its mathematical and philosophical foundations, I have advanced core theories in fundamental hierarchy theory and complexity science, including evolutionary systems. I also co-founded Principia Cybernetica, a pioneering site from the dawn of the Internet age to develop a complete cybernetic philosophy and encyclopedia supported by collaborative computer technologies. 

While complex systems representations can take many forms, and complexity has many aspects, there is a canonical structure for them from discrete mathematics, which we can call a multi-relational system. These can be thought of as data tensors: basically, a collection of N dimensions of data, likely of different types, representable in an N-fold space, table, or multi-dimensional array. Additionally, these dimensions or variables can have complex interactions, also thought of as dependencies or constraints. And where the data in each dimension may be totally explicated in detail, it is also common for there to be statistical distributions on dimensions instead. This overall mathematical structure yields a variety of special cases of interest, including graphs and hypergraphs. Also prominent are multivariate statistical and graphical models like Bayes nets, as well as OnLine Analytical Processing (OLAP) systems for handling complex relational data. My work has focused on the use of discrete mathematical tools for analyzing these multi-relational systems, including tensors and lattices of set partitions and covers.

Hypergraph- and graph-structured data are increasingly prominent in massive data applications. Computational semantic systems are typically deployed within the specific technologies of the Semantic Web paradigm, including the Resource Description Framework (RDF) for instance information, the Web Ontology Language (OWL) for ontological typing information, and SPARQL as a query language. The result is a semantic graph database (SGDB) or a property graph database as a graph-theoretical analog to a relational database, typically engineered as triple stores. SGDBs are being developed to massively scale, requiring new engineering models. In particular, "irregular memory" data objects like graphs and other combinatoric data structures require alternate memory models for high performance scaling. We have pursued a range of methods in a variety of HPC environments, including the Cray XMP and Cray's Chapel language, around both semantic and property graphs, and hypergraph data models at high scale.

Distributed cryptographic systems are revolutionizing the way that digital work and workflows are organized, promising a broad and sustained impact on the entire fabric of the digital space and information flows. While cryptocurrencies like Bitcoin remain the dominant technology and application in the world of blockchain-enabled distributed ledgers, so-called "smart contract" systems can provide a general, cryptographically secure distributed computing environment broadly capable of securely automating general workflows. Cryptocurrency transaction networks naturally take the form of directed hypergraphs, and we have modeled exchange patterns through directed hypergraph motifs. We are also pioneering applications of novel smart contract systems for nuclear safeguards and export control systems.
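To make the directed hypergraph view of transaction networks concrete, the following sketch represents each transaction as a directed hyperedge from a set of input addresses to a set of output addresses. The transactions, address labels, and helper functions are hypothetical, and the address-level model is a simplifying assumption; it is not a description of the motif definitions used in the published work.

```python
from collections import Counter

# Hypothetical transactions: each is a directed hyperedge (inputs -> outputs).
transactions = {
    "tx1": ({"A", "B"}, {"C"}),
    "tx2": ({"C"}, {"D", "E"}),
    "tx3": ({"D"}, {"A"}),
}

def shape_counts(txs):
    """Count hyperedges by (number of inputs, number of outputs)."""
    return Counter((len(tail), len(head)) for tail, head in txs.values())

def successors(txs, tx):
    """Transactions with an input address among the outputs of tx."""
    _, head = txs[tx]
    return sorted(t for t, (tail, _) in txs.items() if t != tx and head & tail)

shapes = shape_counts(transactions)
next_of_tx1 = successors(transactions, "tx1")
```

Counting hyperedges by input and output size is about the simplest structural summary available; richer pattern analysis would look at how several such hyperedges chain together, as `successors` begins to do.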

Along with organisms, cyber systems are another prime example of complex information systems. Our work brings a range of discrete mathematical techniques, including hypergraphs, computational topology, and interval and distributional analysis, to the study of cyber data, including Netflow, DNS, and malware catalogs.

Agent-Based and Discrete Event Modeling

Systems theory can be seen as fundamentally concerned with two concepts: 1) the nature of models, the mathematical relationships between classes of models, and how models can be transformed amongst those classes; and 2) the nature of agents as autonomous entities interacting with the world and each other to effect control relations for their survival. The concept of the semiotic agent joins these concepts, in the sense of an autonomous model-based control system, equipped, recurrently, with models of other, interacting, model-based control systems. In that context, my work has explored the role of possibilistic automata within the abstract universal modeling Discrete EVent Systems (DEVS) formalism; as well as studying the use of semiotic agent models in socio-technical organizations.

One of the great values of a systems approach to data science is its ability to flexibly apply a diverse collection of methods to a range of applications to aid analysts and decision makers. In that context, I have worked to bring generalized uncertainty quantification and information methods, including Dempster-Shafer evidence theory and possibilistic systems theory, to applications in engineering modeling and reliability analysis, decision support for critical infrastructure, and model-based diagnostics.

    Research Works

    Computational Topology and Hypergraph Analytics

    Knowledge-Informed Machine Learning and Neurosymbolic Computing

    Applied Lattice Theory, Formal Concept Analysis, and Interval Computation

    Generalized Information Theory, Uncertainty Quantification, and Possibilistic Information Theory

    Semantic Technology, Ontology Metrics, and Knowledge Representation

    Computational and Theoretical Biology, Bio-Ontologies, and Ontological Protein Function Annotation

    Cybernetic Philosophy, Computational Semiotics and Evolutionary Systems Theory

    Relational Data Modeling and Discrete Systems

    High Performance and Semantic Graph Database Analytics

    Distributed Ledger Technology: Blockchain, Cryptocurrency, and Smart Contracts

    Cyber Analytics

    Agent-Based and Discrete Event Modeling

    Decision Support and Reliability Analysis

    Book Reviews and Other Works