Published in Nature Scientific Data, this paper presents a dataset of predicted protein structures for 42,042 distinct human proteins, generated using NVIDIA’s BioNeMo platform together with Innophore’s CavitomiX technology. The dataset combines predictions from AlphaFold 2, OpenFold, and ESMFold into a single, quality-controlled resource, presented as the most complete structural coverage of the human proteome available for machine learning purposes. It is offered in both unedited and refined formats to support applications such as structure-based drug design and protein function prediction.

Read the full article at Nature Scientific Data
