A Framework for Efficient Prediction and Optimization of Protein Activity Using Deep Learning


µFormer: A Deep Learning Framework for Efficient Protein Fitness Prediction and Optimization

Protein engineering is essential for designing proteins with specific functions, but the fitness landscape of protein mutations is complex, which makes optimal sequences hard to find. Zero-shot approaches, which predict mutational effects without relying on homologs or multiple sequence alignments (MSAs), remove some of these dependencies but fall short in predicting diverse protein properties. Learning-based models trained on deep mutational scanning (DMS) or multiplexed assays of variant effect (MAVE) data have been used to predict fitness landscapes, either alone or combined with MSAs or language models. Still, these data-driven models often struggle when experimental data is sparse.

µFormer Approach

Microsoft Research AI for Science researchers introduced µFormer, a deep learning framework that integrates a pre-trained protein language model with specialized scoring modules to predict protein mutational effects. µFormer predicts the effects of high-order mutants, models epistatic interactions, and handles insertions. Combined with reinforcement learning, µFormer efficiently explores vast mutant spaces to design enhanced protein variants. The model predicted mutants with a 2000-fold increase in bacterial growth rate, driven by improved enzymatic activity. µFormer’s success extends to challenging scenarios, including multi-point mutations, and its predictions were validated through wet-lab experiments, highlighting its potential for optimizing protein design.
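To make "epistatic interactions" concrete, here is a minimal sketch of the standard pairwise epistasis calculation: how far a double mutant deviates from what its two single mutations would predict if their effects simply added up. The function name and the fitness values are made up for illustration and do not come from the paper; the values are assumed to be on an additive (for example, log) scale.

```python
def pairwise_epistasis(f_wt: float, f_a: float, f_b: float, f_ab: float) -> float:
    """Deviation of the double mutant from the additive expectation.

    All fitness values are assumed to be on an additive (e.g. log) scale.
    """
    # If the two mutations acted independently, their effects would add up.
    expected_additive = f_wt + (f_a - f_wt) + (f_b - f_wt)
    return f_ab - expected_additive


# Hypothetical fitness values, not taken from the paper.
wild_type, mutant_a, mutant_b, double_mutant = 0.0, 1.0, 0.5, 2.5
epsilon = pairwise_epistasis(wild_type, mutant_a, mutant_b, double_mutant)
print(epsilon)  # 1.0: the pair does better than the additive expectation
```

A value of zero means the two mutations behave additively. A model that scores each mutation independently always predicts zero here, which is why high-order mutants are a harder test.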

The µFormer model operates in two stages: first, by pre-training a masked protein language model (PLM) on a large dataset of unlabeled protein sequences, and second, by predicting fitness scores using three scoring modules integrated into the pre-trained model. These modules—residual-level, motif-level, and sequence-level—capture different aspects of the protein sequence and combine their outputs to generate the final fitness score. The model is trained using known fitness data, minimizing errors between predicted and actual scores.
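The way three scores at different granularities can feed one fitness value can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's architecture: the per-residue log-probabilities and embeddings stand in for outputs of the pre-trained language model, the linear weights stand in for learned parameters, and summing the three scores is an assumed combination rule. All function names are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a list of equal-length vectors column by column."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def residue_level_score(log_probs: List[float]) -> float:
    # Assumption: one log-probability per residue, summed over the sequence.
    return sum(log_probs)


def motif_level_score(embeddings: List[Vector], weights: Vector, window: int) -> float:
    # Assumption: score every sliding window of residues and keep the best one,
    # a crude stand-in for a convolution followed by max pooling.
    windows = [embeddings[i:i + window] for i in range(len(embeddings) - window + 1)]
    return max(dot(weights, mean_pool(w)) for w in windows)


def sequence_level_score(embeddings: List[Vector], weights: Vector) -> float:
    # Assumption: a linear read-out of the mean-pooled sequence embedding.
    return dot(weights, mean_pool(embeddings))


def fitness_score(log_probs, embeddings, w_motif, w_seq, window=2) -> float:
    # Assumption: the three module outputs are simply added together.
    return (residue_level_score(log_probs)
            + motif_level_score(embeddings, w_motif, window)
            + sequence_level_score(embeddings, w_seq))


def mean_squared_error(predicted: List[float], measured: List[float]) -> float:
    """Training objective sketch: error between predicted and measured fitness."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(predicted)


# Toy inputs for a four-residue sequence with two-dimensional embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
log_probs = [-0.5, -1.0, -0.25, -0.25]
score = fitness_score(log_probs, embeddings, w_motif=[1.0, 1.0], w_seq=[2.0, 0.0])
print(score)  # 0.5
```

In a real model the weights would be learned by minimizing something like `mean_squared_error` over measured fitness data, as the paragraph above describes.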

µFormer is also combined with a reinforcement learning (RL) strategy to explore the vast space of possible mutations efficiently. In this framework, the protein engineering problem is modeled as a Markov Decision Process (MDP), and Proximal Policy Optimization (PPO) is used to optimize the mutation policy. Dirichlet noise is added during the mutation search to encourage exploration and avoid local optima. µFormer was compared against baseline models such as ESM-1v and ECNet, with all models evaluated on datasets such as FLIP and ProteinGym.
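A single step of such a mutation search can be sketched as follows. This is a simplified illustration, not the paper's implementation: it leaves out PPO entirely, treats the policy as a fixed probability vector, and mixes in Dirichlet noise with an AlphaZero-style rule, which is an assumption about how the noise is applied. The names `add_dirichlet_noise`, `mutation_step`, `toy_reward` and the values of `epsilon` and `alpha` are all hypothetical.

```python
import random
from typing import Callable, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sample_dirichlet(n: int, alpha: float, rng: random.Random) -> List[float]:
    """Draw from a symmetric Dirichlet by normalizing Gamma samples."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def add_dirichlet_noise(probs: List[float], epsilon: float, alpha: float,
                        rng: random.Random) -> List[float]:
    # Assumed mixing rule: (1 - epsilon) * policy + epsilon * noise.
    noise = sample_dirichlet(len(probs), alpha, rng)
    return [(1 - epsilon) * p + epsilon * n for p, n in zip(probs, noise)]


def apply_mutation(sequence: str, position: int, residue: str) -> str:
    """State transition of the MDP: substitute one residue."""
    return sequence[:position] + residue + sequence[position + 1:]


def mutation_step(sequence: str, policy_probs: List[float],
                  reward_fn: Callable[[str], float], rng: random.Random,
                  epsilon: float = 0.25, alpha: float = 0.3) -> Tuple[str, float]:
    # Action space: every (position, amino acid) pair, including no-op substitutions.
    actions = [(pos, aa) for pos in range(len(sequence)) for aa in AMINO_ACIDS]
    noisy = add_dirichlet_noise(policy_probs, epsilon, alpha, rng)
    pos, aa = rng.choices(actions, weights=noisy, k=1)[0]
    mutant = apply_mutation(sequence, pos, aa)
    return mutant, reward_fn(mutant)


def toy_reward(sequence: str) -> float:
    # Placeholder for a learned fitness predictor: counts alanines.
    return float(sequence.count("A"))


rng = random.Random(0)
start = "MKTLLV"
uniform_policy = [1.0 / (len(start) * len(AMINO_ACIDS))] * (len(start) * len(AMINO_ACIDS))
mutant, reward = mutation_step(start, uniform_policy, toy_reward, rng)
print(mutant, reward)
```

In the framework described above, the reward would come from the fitness predictor and the policy would be updated with PPO between steps; here both are replaced by fixed stand-ins so the exploration step can be read on its own.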

µFormer, a hybrid model combining a self-supervised protein language model with supervised scoring modules, predicts protein fitness scores efficiently. Pre-trained on 30 million protein sequences from UniRef50 and fine-tuned with three scoring modules, µFormer outperformed ten methods in the ProteinGym benchmark, achieving a mean Spearman correlation of 0.703. It predicts high-order mutations and epistasis, with strong correlations for multi-site mutations. In protein optimization, µFormer, paired with reinforcement learning, designed TEM-1 variants that significantly improved growth, with one double mutant outperforming a known quadruple mutant.
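The Spearman correlation reported on benchmarks such as ProteinGym measures how well the predicted ranking of mutants matches the measured ranking, regardless of scale. Below is a minimal pure-Python sketch of that metric; the function names are illustrative and the scores are made up, not benchmark data. In practice a library routine such as `scipy.stats.spearmanr` would normally be used.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Pearson correlation of the ranks (undefined if either input is constant)."""
    rx, ry = average_ranks(predicted), average_ranks(measured)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up predicted and measured fitness for four mutants.
predicted = [0.1, 0.4, 0.35, 0.8]
measured = [1.0, 2.0, 3.0, 4.0]
print(round(spearman(predicted, measured), 3))  # 0.8
```

Because only ranks matter, a model can score well on this metric even if its raw outputs are on a different scale from the experimental measurements.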

In conclusion, previous studies have shown the potential of sequence-based protein language models in tasks like enzyme function prediction and antibody design. µFormer, a sequence-based model with three scoring modules, was developed to generalize across diverse protein properties. It achieved state-of-the-art performance in fitness prediction tasks, including complex mutations and epistasis. µFormer also demonstrated its ability to optimize enzyme activity, particularly in predicting TEM-1 variants with improved activity against cefotaxime. Despite this success, the approach could be improved by incorporating structural data, developing phenotype-aware models, and building models that handle longer protein sequences.


Applying AI in Business

If you want your company to grow with artificial intelligence (AI) and stay among the leaders, make good use of µFormer: A Deep Learning Framework for Efficient Protein Fitness Prediction and Optimization.

Analyze how AI can change the way you work. Identify where automation can be applied: find the points where your customers can benefit from AI.

Decide which key performance indicators (KPIs) you want to improve with AI.

Choose a suitable solution; there are many AI options available today. Roll out AI solutions gradually: start with a small project, then analyze the results and KPIs.

Expand automation based on the data and experience you gain.


