
The Geometry of Data - Why Machine Learning Needs Signatures
February 2025
Research
This blog post was first published on the European Consortium for Mathematics in Industry (ECMI) webpage:
Link to ECMI blog
Introduction
Think of scribbling a digit on a digital tablet. You might write the number '3' slowly, with hesitations, or quickly in a single burst. To a computer analyzing the raw chronological data stream, these two actions look completely different. Yet, the geometric shape -- the path created -- remains identical. This simple example highlights a fundamental challenge in modern machine learning. While state-of-the-art systems perform impressively on static data, they often struggle when data becomes irregular, highly structured, or subject to time-warping. The bottleneck is rarely the learning algorithm itself, but rather the representation of the data. To achieve robustness against distortions like re-sampling or rotation, we need features that respect the underlying geometry of the signal.
Over the last decade, signature methods have emerged as a powerful solution to this problem, particularly for sequential and temporal data [ChevKorm2026, Xie2018]. Based on the seminal work of Kuo-Tsai Chen [chen1957, chen1958] and developed substantially within rough path theory by Terry Lyons and collaborators [FrizHairer2020, Lyons2007], path signatures provide a mathematically principled description of a path that is invariant to reparameterization. Most recently, these ideas have expanded beyond time series, inspiring novel approaches for learning from images and high-dimensional structured signals [Chevyrevetal2025, Diehl2025].

From data streams to signatures
Think of a data stream not as a static spreadsheet of numbers, but as a pen drawing a continuous line in space. Whether it is the tick-by-tick fluctuation of a stock price, the vital signs of a patient, or the movement of the stylus on a digital tablet, the data trace a path.
A challenge in data science is constructing computational representations of paths that retain the information relevant to inference and learning. Simply recording the position of the pen at every millisecond creates a massive, noisy vector. Calculating simple statistics like the mean or variance destroys the order of events -- they tell us where the data was, but not how it evolved.
This is where the path signature becomes relevant. Rather than taking a snapshot of the data, it provides a representation that captures the geometry of the entire path. It summarizes the path by a graded (i.e., levelled) sequence of numbers, computed via Chen's iterated integrals, that describes the shape of the trajectory with increasing precision:
- Level 1 (displacement): How far did the path travel from start to finish? This captures the net increment.
- Level 2 (area): Did the path curve left or right? This term calculates the signed area enclosed by the path and its chord, capturing the order of events (e.g., did the stock price drop before volatility spiked, or after?).
- Higher levels (volume & twist): These terms capture complex, 3D interactions and fine-grained details of the trajectory.
From an algebraic perspective, these numerical features organize naturally into a "tensor series". However, for machine learning, the power of the signature lies in two specific properties that solve common data headaches:
- It is a geometric fingerprint (Uniqueness): Under mild assumptions, the signature uniquely characterizes the path up to tree-like equivalence [LyonsHambly2010]. This means that it distinguishes paths up to segments that are traversed back and forth (i.e., motion that cancels itself by reversal).
- It is speed blind (Invariance): The signature depends only on the sequence of events, not the speed at which they happen. Writing a "3" slowly or quickly produces the exact same signature. This makes it incredibly robust for comparing signals that evolve at different rates.
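Both properties can be checked numerically. The sketch below is an illustrative NumPy implementation (assuming the data stream is interpolated piecewise linearly; the function name is ours): it computes the first two signature levels via Chen's identity, verifies the displacement and signed-area interpretations on a unit-square loop, and confirms that sampling the same shape more densely leaves the signature untouched.

```python
import numpy as np

def sig_levels_1_2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    Chen's identity for appending one linear segment with increment D:
        S1 -> S1 + D
        S2 -> S2 + outer(S1, D) + outer(D, D) / 2
    """
    path = np.asarray(path, dtype=float)
    s1 = np.zeros(path.shape[1])
    s2 = np.zeros((path.shape[1], path.shape[1]))
    for delta in np.diff(path, axis=0):
        s2 += np.outer(s1, delta) + 0.5 * np.outer(delta, delta)
        s1 += delta
    return s1, s2

# Counterclockwise unit square: zero net displacement, signed area 1.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]], dtype=float)
s1, s2 = sig_levels_1_2(square)
assert np.allclose(s1, 0)                            # level 1: net increment
assert np.isclose(0.5 * (s2[0, 1] - s2[1, 0]), 1.0)  # level 2: signed area

# "Speed blindness": sampling the same shape twice as densely
# (inserting segment midpoints) changes nothing.
dense = np.empty((9, 2))
dense[0::2] = square
dense[1::2] = 0.5 * (square[:-1] + square[1:])
t1, t2 = sig_levels_1_2(dense)
assert np.allclose(s1, t1) and np.allclose(s2, t2)
```

The antisymmetric part of the level-2 tensor is exactly the signed (Lévy) area mentioned above, which is why the square loop has zero level-1 term but a non-trivial level-2 term.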

Signatures as feature maps
In a typical machine learning pipeline, raw data rarely goes directly into a model. It needs a translator. Signatures act as a "universal adapter," transforming complex, messy streams into a format that is suitable for standard algorithms.
Raw stream → Signature → Fixed vector → Model → Prediction
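Here is an end-to-end toy instance of this pipeline (the data, function names, and decision rule are invented for illustration): a single level-2 signature feature, the signed (Lévy) area, combined with the simplest possible linear rule, separates clockwise from counterclockwise scribbles even when the streams have wildly different lengths and sampling rates.

```python
import numpy as np

rng = np.random.default_rng(0)

def levy_area(path):
    """Antisymmetric part of the level-2 signature of a 2D polyline:
    the signed area enclosed by the path and its chord."""
    p = np.asarray(path, dtype=float)
    p = p - p[0]                       # work relative to the start point
    d = np.diff(p, axis=0)
    return 0.5 * np.sum(p[:-1, 0] * d[:, 1] - p[:-1, 1] * d[:, 0])

def noisy_loop(orientation, n):
    """A noisy circle traversed counterclockwise (+1) or clockwise (-1),
    observed at n irregularly spaced times."""
    t = np.sort(rng.uniform(0.0, 2.0 * np.pi, n))
    pts = np.c_[np.cos(orientation * t), np.sin(orientation * t)]
    return pts + 0.05 * rng.normal(size=pts.shape)

# Raw streams of very different lengths -> one geometric feature each.
labels = [+1, -1] * 10
feats = [levy_area(noisy_loop(s, rng.integers(20, 200))) for s in labels]
# A linear rule (the sign of the feature) classifies every stream.
preds = [+1 if f > 0 else -1 for f in feats]
assert preds == labels
```

The point is not the toy task but the shape of the pipeline: variable-length, irregularly sampled streams become a fixed set of geometric features, after which a trivial linear model suffices.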
The key idea lies in the truncation of the signature. A raw data stream might have 100 points or 10,000 points; it might be sampled every millisecond or have gaps of several days. Standard neural networks struggle with this variability. However, if we truncate the signature at a specific level (say, level 4), we get a fixed-length vector of geometric features that describes the path regardless of its original length or sampling rate.
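As a sketch of this idea (again assuming piecewise-linear interpolation, with our own function name; in practice one would reach for an optimized package such as iisignature or signatory), the following extends the computation to an arbitrary truncation level and checks that streams of very different lengths yield feature vectors of identical size:

```python
import numpy as np

def truncated_signature(path, depth):
    """Flattened signature levels 1..depth of a piecewise-linear path.

    Output length is d + d**2 + ... + d**depth, independent of the
    number (or spacing) of sample points in `path`.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    # levels[k] holds the level-(k+1) tensor, of shape (d,) * (k+1).
    levels = [np.zeros((d,) * (k + 1)) for k in range(depth)]
    for delta in np.diff(path, axis=0):
        # Signature of one linear segment: the tensor exponential,
        # whose level k is the k-fold outer power of delta over k!.
        seg = [delta]
        for k in range(1, depth):
            seg.append(np.multiply.outer(seg[-1], delta) / (k + 1))
        # Chen's identity: level k of the concatenation is the sum of
        # outer products of complementary levels of the two pieces.
        new = []
        for k in range(depth):
            total = levels[k] + seg[k]
            for i in range(k):
                total = total + np.multiply.outer(levels[i], seg[k - 1 - i])
            new.append(total)
        levels = new
    return np.concatenate([lvl.ravel() for lvl in levels])

# Streams of very different lengths map to vectors of the same size:
short = truncated_signature(np.random.randn(17, 3), depth=4)
long_ = truncated_signature(np.random.randn(1000, 3), depth=4)
assert short.shape == long_.shape == (3 + 9 + 27 + 81,)
```

For a 3-dimensional stream truncated at level 4 this gives 3 + 9 + 27 + 81 = 120 features, whether the stream has 17 points or a thousand.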
This fixed-dimensional representation is crucial for practical applicability: it lets highly irregular data be fed into standard algorithms without messy pre-processing such as padding or cropping. In practice, signature features are so rich that they often allow simple, transparent models (like linear regression) to achieve results that previously required deep, opaque neural networks. But the story goes deeper. There is a profound connection between signatures and the cutting edge of Deep Learning: Neural Controlled Differential Equations (Neural CDEs) [kidger2020neural, morrill2021neural].
Many modern Recurrent Neural Networks (RNNs) are essentially trying to solve a differential equation where the input data controls the evolution of the system. From this viewpoint, the path signature isn't just a feature; it provides the canonical coordinates for this control. It establishes a rigorous link between classical control theory and modern AI, enabling principled control of neural networks driven by irregular data.
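Schematically, a Neural CDE in the spirit of [kidger2020neural] evolves a hidden state $z$ by letting a learned vector field $f_\theta$ be driven by a continuous (e.g., spline) interpolation $X$ of the observations:

```latex
z_t \;=\; z_0 + \int_0^t f_\theta(z_s)\,\mathrm{d}X_s,
\qquad z_0 = \zeta_\theta(x_0).
```

The solution depends on the data only through the path $X$, and over short intervals this dependence is well summarized by (log-)signature terms -- the mechanism that Neural Rough Differential Equations [morrill2021neural] exploit to handle long time series.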
The signature is based on computing (Chen's) iterated integrals along continuous paths, capturing the geometry of a signal between sampling points. In practice, however, data rarely comes in this idealized form: time series are typically observed as discrete streams such as transactions, ticks, sensor readings, or event logs. This naturally motivates the notion of a discrete signature, as proposed in [Diehl2020, Diehl2023], which addresses the discrete nature of the data by replacing iterated integrals with iterated sums. This construction preserves the algebraic structure and expressive power of signature methods while remaining invariant under time warping. We refer to [DiehlSchmitz2026, DIEHL2026132884] for recent work exploring discrete signatures in applications.
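To make this concrete, here is a minimal sketch (the function name is ours, and the bracket labels only loosely follow the word notation of [Diehl2020]) of a few low-order iterated-sum invariants of a scalar series, together with a check that "stuttering" the series -- the discrete analogue of time warping -- leaves them unchanged:

```python
import numpy as np

def iterated_sums_low_order(x):
    """A few low-order entries of the iterated-sums signature of a
    scalar series x, built from increments dx_i = x_{i+1} - x_i:

        <[1]>    = sum_i     dx_i
        <[11]>   = sum_i     dx_i ** 2
        <[1][1]> = sum_{i<j} dx_i * dx_j
    """
    dx = np.diff(np.asarray(x, dtype=float))
    csum = np.cumsum(dx)
    return (dx.sum(), np.sum(dx ** 2), np.sum(dx[1:] * csum[:-1]))

x = [1.0, 2.0, 5.0, 4.0]
# Time warping: repeating entries only inserts zero increments,
# so every iterated sum is unchanged.
x_warped = [1.0, 1.0, 2.0, 2.0, 2.0, 5.0, 4.0, 4.0]
assert iterated_sums_low_order(x) == iterated_sums_low_order(x_warped)
```

Replacing integrals by sums keeps the graded, word-indexed structure of the signature while matching how the data actually arrives: one discrete observation at a time.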
Beyond time: the challenge of images
Although path signatures have proven highly effective for time series, extending them to images presents a fundamental geometric challenge. A time series has natural chronological order. An image, however, is a spatial field: a function of two variables (coordinates in the plane) taking values in a vector space, typically representing color channels.
Standard approaches often "flatten" an image into a one-dimensional sequence to apply sequential models. However, this process imposes an artificial order and disrupts the local geometry, separating pixels that are spatially adjacent. This loss of structure leads to the central question of our recent research:
Can the path signature framework be generalized from one-dimensional signals to higher-dimensional data such as images?
The motivation for a signature for images is threefold. From a machine learning perspective, it offers a path toward rotation-invariant features that respect the intrinsic geometry of the data. In statistics, it promises higher-order descriptors for random fields, generalizing the method of moments. Finally, in analysis, such signatures could provide the necessary tools to solve partial differential equations driven by rough, irregular noise.
These challenges were addressed during the academic year 2023–2024 within the Signatures for Images (SFI) project, supported by the Centre for Advanced Study (CAS) in Oslo. The project brought together researchers in algebra and stochastic analysis, and the resulting collaborations led to the development of new theoretical frameworks for the analysis of high-dimensional data, as detailed in [Chevyrevetal2025, Diehl2025].
Outlook
This research highlights the fruitful interaction between pure mathematics, including stochastic analysis, statistics, geometry, and algebra, and modern data science. It shows that effective feature extraction does not need to rely on black-box methods. Instead, the use of well-understood mathematical structures yields representations that respect the intrinsic geometry of the data and integrate transparently with learning algorithms.
Looking ahead, these principles are central to the mission of the SURE AI -- Centre for Sustainable, Risk-averse and Ethical AI, one of Norway's national AI research centres. In an era where explainability is paramount, the mathematical transparency of signatures offers a promising foundation for the next generation of reliable, risk-aware machine learning. We are actively expanding these methods to new domains, believing that the most robust AI solutions will be built on the solid ground of rigorous mathematics.
References
[CarPat21] Cartier, P., & Patras, F. (2021). Classical Hopf algebras and their applications (Vol. 29). Springer, Cham.
[chen1957] Chen, K.-T. (1957). Integration of paths, geometric invariants and a generalized Baker-Hausdorff formula. Annals of Mathematics. Second Series, 65, 163–178.
[chen1958] Chen, K.-T. (1958). Integration of paths -- a faithful representation of paths by non-commutative formal power series. Transactions of the American Mathematical Society, 89, 395–407.
[ChevKorm2026] Chevyrev, I., & Kormilitzin, A. (2026). A Primer on the Signature Method in Machine Learning. In Signature Methods in Finance (pp. 3–64). Springer, Cham.
[Chevyrevetal2025] Chevyrev, I., Diehl, J., Ebrahimi-Fard, K., & Tapia, N. (2024). A multiplicative surface signature through its Magnus expansion. arXiv.
[Diehl2020] Diehl, J., Ebrahimi-Fard, K., & Tapia, N. (2020). Time-warping invariants of multidimensional time series. Acta Applicandae Mathematicae, 170, 265–290.
[Diehl2023] Diehl, J., Ebrahimi-Fard, K., & Tapia, N. (2023). Generalized iterated-sums signatures. Journal of Algebra, 632, 801–824.
[Diehl2025] Diehl, J., Ebrahimi-Fard, K., Harang, F. N., & Tindel, S. (2025). On the signature of an image. Stochastic Processes and their Applications, 187, 104661.
[DIEHL2026132884] Diehl, J., Ibraheem, R., Schmitz, L., & Wu, Y. (2026). Tensor-to-tensor models with fast iterated sum features. Neurocomputing, 675, 132884.
[DiehlSchmitz2026] Diehl, J., & Schmitz, L. (2026). Two-parameter sums signatures and corresponding quasisymmetric functions. Journal of Algebraic Combinatorics. An International Journal, 63(1), Paper No. 4, 61.
[FrizHairer2020] Friz, P. K., & Hairer, M. (2020). A course on rough paths (2nd ed.). Springer, Cham.
[LyonsHambly2010] Hambly, B., & Lyons, T. (2010). Uniqueness for the signature of a path of bounded variation and the reduced path group. Annals of Mathematics. Second Series, 171(1), 109–167.
[kidger2020neural] Kidger, P., Morrill, J., Foster, J., & Lyons, T. (2020). Neural Controlled Differential Equations for Irregular Time Series. Advances in Neural Information Processing Systems, 33, 6696–6707.
[Lyons2007] Lyons, T. J., Caruana, M., & Lévy, T. (2007). Differential Equations Driven by Rough Paths: École d’Été de Probabilités de Saint-Flour XXXIV - 2004. Springer Berlin Heidelberg.
[morrill2021neural] Morrill, J., Salvi, C., Kidger, P., & Foster, J. (2021). Neural Rough Differential Equations for Long Time Series. International Conference on Learning Representations.
[Xie2018] Xie, Z., Sun, Z., Jin, L., Ni, H., & Lyons, T. (2018). Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(8), 1903–1917.