Tokenized and Continuous Embedding Compressions of Protein Sequence and Structure

Developments: The authors show that they "can construct a tokenized all-atom structure vocabulary that retains high reconstruction accuracy, thus introducing a tokenized representation of all-atom structure that can be obtained from sequence alone". They use Compressed Hourglass Embedding Adaptations of Proteins (CHEAP) to represent protein sequence and structure jointly, with significant embedding compression.
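To make the two ideas in the summary concrete (a compressed continuous embedding and a tokenized one), here is a minimal sketch. It is not the authors' implementation: the mean pooling, the random projection, the nearest-neighbour codebook lookup, and all names and sizes (`compress`, `tokenize`, 1024 channels, a 256-entry codebook) are assumptions chosen purely for illustration.

```python
import numpy as np


def compress(embedding, length_factor, projection):
    """Shorten a per-residue embedding along the length axis by mean pooling,
    then project the channel axis down with a fixed matrix (hypothetical)."""
    length, channels = embedding.shape
    usable = (length // length_factor) * length_factor  # drop any ragged tail
    pooled = embedding[:usable].reshape(-1, length_factor, channels).mean(axis=1)
    return pooled @ projection


def tokenize(compressed, codebook):
    """Map each compressed vector to the index of its nearest codebook entry."""
    diffs = compressed[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


rng = np.random.default_rng(0)
embedding = rng.normal(size=(128, 1024))                   # stand-in for a per-residue embedding
projection = rng.normal(size=(1024, 32)) / np.sqrt(1024)   # assumed channel projection
codebook = rng.normal(size=(256, 32))                      # assumed 256-entry vocabulary

compressed = compress(embedding, 2, projection)  # continuous, compressed representation
tokens = tokenize(compressed, codebook)          # discrete, tokenized representation
print(compressed.shape, tokens.shape)
```

In this toy setup the continuous path keeps a small real-valued vector per position, while the tokenized path keeps only one integer per position; a trained model would learn the projection and the vocabulary rather than sampling them at random.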
