Data-driven modeling and machine learning have opened new paradigms and opportunities in the understanding and design of soft and biological materials. The automated discovery of emergent collective variables within high-dimensional computational and experimental data sets provides a means to understand and predict materials behavior and to engineer properties and function. I will describe our recent work applying two machine learning techniques to collective variable discovery in molecular simulation: nonlinear manifold learning using diffusion maps, and nonlinear dimensionality reduction using autoencoding neural networks ("autoencoders"). First, I will describe our applications of graph matching and diffusion maps to determine low-dimensional assembly landscapes for self-assembling patchy colloids. These landscapes connect colloid architecture and prevailing conditions with emergent assembly behavior, and we use them to perform inverse building block design by rationally sculpting the landscape to engineer the stability and accessibility of desired aggregates. Second, I will describe our use of autoencoders to perform automated discovery of collective variables in protein folding. We interleave deep-learning variable discovery with enhanced sampling directly in the discovered variables, so that variable discovery and accelerated sampling of protein folding funnels proceed simultaneously and on the fly.
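
For readers unfamiliar with diffusion maps, the following is a minimal sketch of the generic textbook construction, not the code used in the work described above. It builds a Gaussian kernel over pairwise distances, normalizes it into a Markov transition matrix, and takes the leading nontrivial eigenvectors as low-dimensional coordinates. The function name `diffusion_map`, the bandwidth parameter `epsilon`, and the toy helical-arc data are illustrative assumptions. In a colloid or protein application the plain Euclidean distance would be replaced by a problem-specific distance between configurations.

```python
import numpy as np


def diffusion_map(X, epsilon, n_components=2):
    """Return (eigenvalues, diffusion coordinates) for the rows of X.

    Hypothetical helper for illustration only. X has shape
    (n_samples, n_features); epsilon is the Gaussian kernel bandwidth.
    """
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Gaussian kernel and its row sums (the degree of each point).
    K = np.exp(-d2 / (2.0 * epsilon))
    d = K.sum(axis=1)

    # Symmetric matrix S = D^{-1/2} K D^{-1/2}, which has the same
    # eigenvalues as the Markov matrix M = D^{-1} K.
    S = K / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)

    # eigh returns ascending eigenvalues; reorder to descending.
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    # Right eigenvectors of M are D^{-1/2} times the eigenvectors of S.
    psi = evecs / np.sqrt(d)[:, None]

    # Drop the trivial constant eigenvector (eigenvalue 1) and scale the
    # remaining ones by their eigenvalues (diffusion time t = 1).
    keep = slice(1, n_components + 1)
    return evals[keep], psi[:, keep] * evals[keep]


# Toy data: 200 points along a helical arc in 3D, a one-dimensional
# manifold parameterized by t.
t = np.linspace(0.0, 1.0, 200)
X = np.column_stack([np.cos(3.0 * t), np.sin(3.0 * t), t])

evals, coords = diffusion_map(X, epsilon=0.1, n_components=2)
print("leading nontrivial eigenvalues:", evals)
print("embedding shape:", coords.shape)
```

On this toy data the first diffusion coordinate should vary smoothly with the arc parameter `t`, which is the sense in which the method recovers a collective variable from high-dimensional coordinates. The choice of `epsilon` matters in practice: too small a value disconnects the kernel graph, and too large a value washes out the manifold structure.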