Discrete Latent Structure in Neural Networks

Niculae, Vlad; Corro, Caio; Nangia, Nikita; Mihaylova, Tsvetomila; Martins, André F. T.

doi:10.1561/2000000134

Research Article | June 30, 2025

Vlad Niculae: Language Technology Lab, Informatics Institute, Faculty of Science, University of Amsterdam, Netherlands

Caio Corro: INSA Rennes, IRISA, Inria, CNRS, Université de Rennes, France

Nikita Nangia: Amazon, USA

Tsvetomila Mihaylova: Department of Electrical Engineering and Automation, Aalto University, Finland

André F. T. Martins: Instituto Superior Técnico, Portugal; Instituto de Telecomunicações, Portugal; Unbabel, Portugal

Author & Article Information

Online ISSN: 1932-8354

Print ISSN: 1932-8346

2025

V. Niculae et al.

Licensed re-use rights only

Foundations and Trends in Signal Processing (2025) 19 (2): 99–211.

https://doi.org/10.1561/2000000134

Many types of data from fields including natural language processing, computer vision, and bioinformatics are well represented by discrete, compositional structures such as trees, sequences, or matchings. Latent structure models are a powerful tool for learning to extract such representations, offering a way to incorporate structural bias, discover insight about the data, and interpret decisions. However, effective training is challenging as neural networks are typically designed for continuous computation.

This text explores three broad strategies for learning with discrete latent structure: continuous relaxation, surrogate gradients, and probabilistic estimation. Our presentation relies on consistent notations for a wide range of models. As such, we reveal many new connections between latent structure learning strategies, showing how most consist of the same small set of fundamental building blocks, but use them differently, leading to substantially different applicability and properties.
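As a rough illustration of how the three strategies named above differ, the sketch below applies each one to the simplest discrete latent variable, a one-of-K choice with scores s. It is a toy example under stated assumptions, not the monograph's own formulation: the quadratic downstream loss, the softmax backward pass used for the straight-through variant, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(k, size):
    return [1.0 if i == k else 0.0 for i in range(size)]


def downstream(z, w, target):
    """Toy loss g(z) = (w . z - target)^2 (an assumption) and its gradient w.r.t. z."""
    residual = sum(wi * zi for wi, zi in zip(w, z)) - target
    return residual ** 2, [2.0 * residual * wi for wi in w]


def softmax_vjp(p, grad_p):
    """Multiply the softmax Jacobian diag(p) - p p^T by grad_p."""
    inner = sum(pi * gi for pi, gi in zip(p, grad_p))
    return [pi * (gi - inner) for pi, gi in zip(p, grad_p)]


def relaxation_grad(scores, w, target):
    """Continuous relaxation: feed the soft vector p downstream and differentiate."""
    p = softmax(scores)
    _, grad_z = downstream(p, w, target)
    return softmax_vjp(p, grad_z)


def straight_through_grad(scores, w, target):
    """Surrogate gradient: hard argmax forward, softmax Jacobian backward."""
    p = softmax(scores)
    k = max(range(len(p)), key=p.__getitem__)
    _, grad_z = downstream(one_hot(k, len(p)), w, target)
    return softmax_vjp(p, grad_z)


def score_function_grad(scores, w, target, num_samples, rng):
    """Probabilistic estimation: Monte Carlo score-function estimate of
    the gradient of E_{z ~ p}[g(z)] with respect to the scores."""
    p = softmax(scores)
    n = len(p)
    grad = [0.0] * n
    for _ in range(num_samples):
        k = rng.choices(range(n), weights=p)[0]
        loss, _ = downstream(one_hot(k, n), w, target)
        for i in range(n):
            grad[i] += loss * ((1.0 if i == k else 0.0) - p[i]) / num_samples
    return grad


def exact_expected_grad(scores, w, target):
    """Exact gradient of E_{z ~ p}[g(z)] by enumerating all K outcomes."""
    p = softmax(scores)
    n = len(p)
    losses = [downstream(one_hot(k, n), w, target)[0] for k in range(n)]
    mean_loss = sum(pk * lk for pk, lk in zip(p, losses))
    return [p[i] * (losses[i] - mean_loss) for i in range(n)]


if __name__ == "__main__":
    scores, w, target = [0.5, -1.0, 2.0], [1.0, 2.0, 3.0], 2.0
    rng = random.Random(0)
    print("relaxation      ", relaxation_grad(scores, w, target))
    print("straight-through", straight_through_grad(scores, w, target))
    print("score function  ", score_function_grad(scores, w, target, 20000, rng))
    print("exact expectation", exact_expected_grad(scores, w, target))
```

All three functions reuse the same small pieces (a softmax, a downstream gradient, and a Jacobian product or a sampling step) but combine them differently. The relaxation and straight-through variants return deterministic but generally different gradients, while the score-function estimate is noisy and converges to the exact expected-loss gradient as the number of samples grows.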
