Albatross starts from a restrained question: if a model trained only to recover missing nucleotides is perturbed at one position, which other positions change their predictions? In structured IRES RNAs, those dependencies often trace the molecule's hidden folded structure. Here is how the signal appears — and where it quietly breaks.
Picornaviruses, the family behind polio, the common cold, and hepatitis A, can't start translation the normal way, because their genomes lack the 5' cap that ribosomes usually latch onto. Instead, a long folded stretch of RNA near the front of the genome, an internal ribosome entry site (IRES), recruits the ribosome through its shape. Shape is everything: the same letters folded differently behave differently.
The authors measured how strongly 96 of these IRESes drive translation, across six human and animal cell types. The newest class (Type V) drives roughly twice the output of EMCV, the element everyone uses in gene therapy, and most IRESes are tissue-specific. Rich behavior — which demands a structural explanation. The problem: we have experimentally solved structures for only a handful.
The gold standard for inferring base pairs without an experimentally solved structure is covariation: line up many related sequences and look for pairs of positions that mutate together to stay complementary. It works beautifully when you have a good alignment of many relatives.
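A toy version of that idea, as a sketch rather than what any real covariation tool computes: for two alignment columns, count how often the letters are complementary and how many distinct complementary pair types appear. The alignment and the function name `covariation_support` are invented for illustration; real tools apply statistical tests to far larger alignments.

```python
# Watson-Crick pairs plus the G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def covariation_support(alignment, i, j):
    """Return (fraction of sequences complementary at columns i and j,
    number of distinct complementary pair types seen there)."""
    observed = [(seq[i], seq[j]) for seq in alignment]
    complementary = [pair for pair in observed if pair in PAIRS]
    return len(complementary) / len(observed), len(set(complementary))

# Invented 10-nt hairpin family: columns 0-2 pair with columns 9-7.
alignment = [
    "GGCGAAAGCC",
    "GACGAAAGUC",
    "GUCGAAAGAC",
    "GCCGAAAGGC",
]

covarying = covariation_support(alignment, 1, 8)   # letters change, pairing survives
conserved = covariation_support(alignment, 0, 9)   # always G-C, so no evidence either way
unrelated = covariation_support(alignment, 1, 4)   # complementary only by chance
print(covarying, conserved, unrelated)
```

Columns 1 and 8 are the informative case: every sequence uses a different pair type, yet all of them stay complementary. Columns 0 and 9 never vary, which is exactly why a conserved or tiny alignment gives covariation nothing to measure.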
Picornaviral IRESes are notoriously divergent. For the strangest ones you might find only a handful of relatives, and no clean alignment at all. Covariation simply has nothing to work with. That is the gap.
Take a model trained on one job only: given an RNA sequence with a position hidden, guess the missing letter. To do that well across millions of sequences, it has to internalize how positions constrain each other. Base-paired positions constrain each other the most.
So we run a tiny experiment, position by position. Mutate one nucleotide. Re-run the model. Ask: whose prediction changed? The position that reacts most is, overwhelmingly, the base-pairing partner. Try it — click a nucleotide below.
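Away from the widget, the same experiment can be sketched in a few lines. Everything here is a stand-in: `toy_predict` is a hypothetical rule-based "model" that already knows the pairing of one 10-nt hairpin, and the total variation distance used to measure a changed prediction is an assumption, not necessarily the score Albatross uses. With a real masked language model, only `predict` would change.

```python
ALPHABET = "ACGU"
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Hypothetical partner table for a 10-nt hairpin (positions 0-2 pair with 9-7).
PARTNER = {0: 9, 1: 8, 2: 7, 7: 2, 8: 1, 9: 0}

def toy_predict(seq):
    """Stand-in for a masked model: one distribution over ACGU per position."""
    out = []
    for pos in range(len(seq)):
        if pos in PARTNER:
            want = COMPLEMENT[seq[PARTNER[pos]]]
            out.append({n: (0.91 if n == want else 0.03) for n in ALPHABET})
        else:
            out.append({n: 0.25 for n in ALPHABET})
    return out

def shift(p, q):
    """Total variation distance between two distributions."""
    return 0.5 * sum(abs(p[n] - q[n]) for n in ALPHABET)

def react_to_mutation(seq, i, predict=toy_predict):
    """Mutate position i to each other letter; return the largest shift at every position."""
    base = predict(seq)
    scores = [0.0] * len(seq)
    for alt in ALPHABET:
        if alt == seq[i]:
            continue
        mutant = seq[:i] + alt + seq[i + 1:]
        pred = predict(mutant)
        for j in range(len(seq)):
            scores[j] = max(scores[j], shift(base[j], pred[j]))
    return scores

seq = "GGCGAAAGCC"
scores = react_to_mutation(seq, 1)
partner = max(range(len(seq)), key=lambda j: scores[j])
print(partner, scores)
```

In this toy the only position that reacts to a mutation at index 1 is index 8, its partner, because the pairing was written into the stand-in by hand. The interesting claim in the text is that a trained model produces the same pattern without being told.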
Do this for every position and you get an N×N grid — the dependency map. Cell (i, j) is bright when mutating i swings the prediction at j. The diagonal is a position reacting to itself, so it's masked out. Everything interesting lives off the diagonal.
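A minimal sketch of assembling that grid, assuming some function `react(i)` that returns how strongly every position responds when position i is mutated. The `toy_react` below is a hypothetical placeholder with hand-picked numbers; a real run would call the model instead.

```python
def dependency_map(n, react):
    """Build the N x N grid: row i holds the responses to mutating position i."""
    grid = [list(react(i)) for i in range(n)]
    for i in range(n):
        grid[i][i] = 0.0  # mask the self-response on the diagonal
    return grid

# Hypothetical hairpin: positions 0-2 pair with 9-7, positions 3-6 form the loop.
PARTNER = {0: 9, 1: 8, 2: 7, 7: 2, 8: 1, 9: 0}

def toy_react(i, n=10):
    """Invented responses: strong at the partner, strongest at i itself, faint elsewhere."""
    row = [0.05] * n
    row[i] = 1.0
    if i in PARTNER:
        row[PARTNER[i]] = 0.9
    return row

grid = dependency_map(10, toy_react)
for row in grid:
    print("".join("#" if value > 0.5 else "." for value in row))
```

The self-response is the largest number in every raw row, which is why the diagonal has to be masked before anything else is read off the map.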
Here's the visual key to everything that follows. In a stem, the first base pairs with the last, the second with the second-to-last, and so on. As one index climbs, its partner's index descends, so the sum of the two indices stays constant. Plotted on the map, that traces a short line running perpendicular to the main diagonal: an antidiagonal. Each antidiagonal stripe is one stem. Watch a stripe fold into a stem:
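The arithmetic behind the stripe is short enough to check by hand. This sketch uses an invented 4-pair stem; `stem_pairs` is a hypothetical helper, not part of any library.

```python
def stem_pairs(start, end, length):
    """Pairs of a stem whose outermost pair is (start, end), walking inward."""
    return [(start + k, end - k) for k in range(length)]

pairs = stem_pairs(2, 11, 4)
index_sums = {i + j for i, j in pairs}  # one shared value means one antidiagonal

# Draw the stem on a 14 x 14 map; both (i, j) and (j, i) are marked.
n = 14
cells = set(pairs) | {(j, i) for i, j in pairs}
for i in range(n):
    print("".join("#" if (i, j) in cells else "." for j in range(n)))
```

Every pair in the stem has the same index sum, which is the defining property of an antidiagonal, and the printed grid shows the two mirrored stripes on either side of the main diagonal.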
This is the part worth dwelling on. The base model, RiNALMo, already knew general RNA, but run a dependency map on an IRES and you get noise. Fine-tune that same model on ~50,000 IRES sequences (still just letters, still just masked-token guessing) and the antidiagonals snap into focus. The fine-tuned model is Albatross.
Nothing about pairing, geometry, or thermodynamics was ever added. Only more of the right sequences. Drag the slider to fine-tune.
Turning the map into an actual list of pairs uses a filter with no biology in it at all, only geometry. Ignore signal below a threshold, drop the un-pairable band next to the diagonal (positions too close in sequence to fold back onto each other), keep only stripes long enough to be a real stem, then match each position to at most one partner. No base-pairing rules are enforced, so the filter is free to propose non-canonical pairs.
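The four steps can be sketched as one function. This is an illustration under stated assumptions, not the authors' implementation: the default values, the symmetrization by `max`, and the greedy strongest-first matching are all choices made here for the example, and `call_pairs` is a hypothetical name.

```python
def call_pairs(grid, threshold=0.5, min_sep=4, min_stem=3):
    """Geometry-only filter: threshold, diagonal band, stripe length, one partner each."""
    n = len(grid)

    def score(i, j):
        # Assumption: a pair may show up whichever side is mutated, so take the max.
        return max(grid[i][j], grid[j][i])

    # Steps 1 and 2: drop weak cells and cells within min_sep of the diagonal.
    keep = {(i, j) for i in range(n) for j in range(i + min_sep, n)
            if score(i, j) >= threshold}

    # Step 3: keep only antidiagonal runs of at least min_stem cells.
    in_stem = set()
    for i, j in keep:
        if (i - 1, j + 1) in keep:
            continue  # not the outer end of its run
        run, a, b = [], i, j
        while (a, b) in keep:
            run.append((a, b))
            a, b = a + 1, b - 1
        if len(run) >= min_stem:
            in_stem.update(run)

    # Step 4: greedy matching, strongest cell first, one partner per position.
    used, pairs = set(), []
    for i, j in sorted(in_stem, key=lambda p: (-score(*p), p)):
        if i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    return sorted(pairs)

# Invented 12 x 12 map with one real stem and four kinds of distractor.
n = 12
grid = [[0.0] * n for _ in range(n)]
for i, j in [(0, 11), (1, 10), (2, 9), (3, 8)]:
    grid[i][j] = 0.9                      # the true stem
for i, j in [(0, 9), (1, 8), (2, 7)]:
    grid[i][j] = 0.6                      # weaker competing stripe, loses at step 4
grid[4][10] = 0.8                         # isolated cell, too short to be a stem
grid[5][6] = 0.95                         # strong but inside the diagonal band
grid[0][6] = 0.2                          # below threshold

pairs = call_pairs(grid)
print(pairs)
```

Nothing in the function looks at which nucleotides are involved, which is the point made in the text: a strong, long enough stripe is accepted even if its letters would not form a canonical pair.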
The threshold is the dial that matters. Raise it and the model only commits to pairs it's sure of: precision soars, recall falls. That trade-off is the method's signature, and something to stay honest about. Move the slider:
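The same trade-off in numbers, on a made-up example. The scores and the reference pairs below are invented for illustration and are not results from the study. Treating precision as 1.0 when nothing is called is a convention chosen here.

```python
def precision_recall(scored, truth, threshold):
    """Precision and recall of the pairs whose score clears the threshold."""
    called = {pair for pair, s in scored.items() if s >= threshold}
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 1.0
    recall = true_positives / len(truth)
    return precision, recall

# Invented reference structure and invented candidate scores.
truth = {(0, 11), (1, 10), (2, 9), (3, 8)}
scored = {
    (0, 11): 0.90, (1, 10): 0.80, (2, 9): 0.60, (3, 8): 0.30,  # real pairs
    (4, 10): 0.50, (0, 6): 0.35,                                # false candidates
}

for threshold in (0.3, 0.4, 0.55, 0.85):
    p, r = precision_recall(scored, truth, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

At the lowest threshold every true pair is recovered but a third of the calls are wrong; at the highest, the single remaining call is correct and three quarters of the structure is missing.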
Now the comparison that matters. Bin every IRES by how many similar relatives exist to align. When relatives are plentiful, covariation (here, CaCoFold) keeps up. When they vanish — the regime that actually stumps biologists — covariation collapses and Albatross holds. Pick a bin:
With four relatives there's no covariation to exploit, so Albatross must be doing something else. It has learned that certain little sequence words imply certain folds. A classic one is the GNRA tetraloop (G, any nucleotide, a purine, then A): see that word, expect a stem closing around it.
To test this, break the word. Mutate the motif and the prediction degrades; delete it and the stem vanishes entirely. Try each:
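Building the three inputs for that experiment takes only string handling. This sketch prepares the wild-type, mutated, and deleted variants of an invented 12-nt hairpin. It does not run any model, and the replacement `UUUU` is an arbitrary choice made for the example. GNRA is read as G, any nucleotide, A or G, then A.

```python
import re

# Lookahead so that overlapping matches are all reported.
GNRA = re.compile(r"(?=G[ACGU][AG]A)")

def find_gnra(seq):
    """Start index of every GNRA match in seq."""
    return [m.start() for m in GNRA.finditer(seq)]

def mutate_motif(seq, start, replacement="UUUU"):
    """Overwrite the 4-nt motif at `start`, keeping the sequence length."""
    return seq[:start] + replacement + seq[start + 4:]

def delete_motif(seq, start):
    """Remove the 4-nt motif at `start`, shortening the sequence."""
    return seq[:start] + seq[start + 4:]

wild_type = "CCUCGAAAGAGG"          # invented hairpin: CCUC stem, GAAA loop, GAGG stem
hits = find_gnra(wild_type)
mutated = mutate_motif(wild_type, hits[0])
deleted = delete_motif(wild_type, hits[0])
print(hits, mutated, deleted)
```

Each variant would then go through the same dependency-map pipeline as the wild type, so that any change in the predicted stem can be attributed to the motif alone.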
One more result points the way forward. Fine-tune three sizes (33M, 150M, and 650M parameters) on Type I IRESes only, then test on every type. Larger models find more structure, as expected. The striking part: they also get better on types they never trained on. The rules they learn generalize. Switch sizes: