evo
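Introduction
Evo 2 is a DNA language model for long-context modeling and design, suited to a range of tasks such as genomic prediction, sequence generation, and epigenomic design. Three model sizes have been officially open-sourced: 1b, 7b, and 40b.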
https://github.com/ArcInstitute/evo2
https://huggingface.co/arcinstitute
Genome modeling and design across all domains of life with Evo 2
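Installation
Evo 2 depends on TransformerEngine[pytorch]==1.13.0. The prebuilt TransformerEngine packages require glibc 2.28 or newer, but this system runs CentOS 7, whose glibc is older, so TransformerEngine is compiled from source first.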
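Before choosing between a prebuilt package and a source build, it can help to confirm the glibc version on the node. The following is a minimal sketch, assuming the 2.28 threshold stated above; `parse_version` and `needs_source_build` are illustrative helper names, not part of any library.

```python
import platform

# Threshold taken from the text above (prebuilt packages need glibc >= 2.28).
REQUIRED_GLIBC = (2, 28)


def parse_version(text):
    """Turn a dotted version string such as '2.17' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def needs_source_build(libc_version, required=REQUIRED_GLIBC):
    """Return True when the given glibc version is older than `required`."""
    return parse_version(libc_version) < required


if __name__ == "__main__":
    # platform.libc_ver() returns e.g. ('glibc', '2.17') on Linux;
    # both fields may be empty strings on other platforms.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print(f"glibc {version}: build from source = {needs_source_build(version)}")
    else:
        print("Could not detect glibc on this platform")
```

A stock CentOS 7 system ships glibc 2.17, so the check is expected to report that a source build is needed there.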
$ git clone https://github.com/NVIDIA/TransformerEngine.git
$ cd TransformerEngine
$ git checkout v1.13
$ git submodule update --init --recursive
$ pip3 install --prefix=/public/home/software/opt/bio/software/evo2/0.2.0 torch
# Requires cuDNN >= 9.3.0 and a compiler with C++17 support
$ module load cudnn/9.4.0_cuda12 GCC/9.4.0
# Export the variables so that the pip build process can see them
$ export CUDNN_PATH=/public/home/software/opt/bio/software/cudnn/9.4.0_cuda12
$ export CC=/public/home/software/opt/bio/software/GCC/9.4.0/bin/gcc
$ export CXX=/public/home/software/opt/bio/software/GCC/9.4.0/bin/g++
$ NVTE_FRAMEWORK=pytorch pip3 install --no-build-isolation --prefix=/public/home/software/opt/bio/software/evo2/0.2.0 .
$ export HF_ENDPOINT=https://hf-mirror.com
$ pip3 install --prefix=/public/home/software/opt/bio/software/evo2/0.2.0 evo2
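Because the packages are installed with `--prefix`, Python only finds them when the prefix's `site-packages` directory is on its search path (for example through `PYTHONPATH` or an environment module). The sketch below checks whether the three packages resolve without importing them; `check_modules` is an illustrative helper, and `torch`, `transformer_engine`, and `evo2` are the import names of the packages installed above.

```python
import importlib.util


def check_modules(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    results = check_modules(["torch", "transformer_engine", "evo2"])
    for name, found in results.items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

A `MISSING` entry usually points to a search-path problem rather than a failed build, so it is worth checking `sys.path` first.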
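Testing
# Download models through the China-based HF mirror; on first use the model is downloaded to ~/.cache/huggingface/hub/
$ export HF_ENDPOINT=https://hf-mirror.com
# Download the model
$ huggingface-cli download arcinstitute/evo2_1b_base
# Select the GPU to use; it needs more than 20 GB of memory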
$ export CUDA_VISIBLE_DEVICES=1
$ python -m evo2.test.test_evo2_generation --model_name evo2_1b_base
Fetching 4 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 313.79it/s]
Found complete file in repo: evo2_1b_base.pt
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 42.59it/s]
Extra keys in state_dict: {'blocks.3.mixer.dense._extra_state', 'blocks.24.mixer.dense._extra_state', 'blocks.17.mixer.dense._extra_state', 'unembed.weight', 'blocks.10.mixer.dense._extra_state'}
/public/home/software/opt/bio/software/evo2/0.2.0/lib/python3.11/site-packages/evo2/test/test_evo2_generation.py:22: DeprecationWarning: path is deprecated. Use files() instead. Refer to https://importlib-resources.readthedocs.io/en/latest/using.html#migrating-from-legacy for migration advice.
with resources.path('evo2.test.data', input_file) as data_path:
Initializing inference params with max_seqlen=3768
/public/home/software/opt/bio/software/evo2/0.2.0/lib/python3.11/site-packages/vortex/model/engine.py:559: UserWarning: Casting complex values to real discards the imaginary part (Triggered internally at /pytorch/aten/src/ATen/native/Copy.cpp:308.)
inference_params.state_dict[layer_idx] = state[..., L - 1].to(dtype=state_dtype)
Prompt: "GAATAGGAACAGCTCCGGTCTACAGCTCCCAGCGTGAGCGACGCAGAAGACGGTGATTTCTGCATTTCCATCTGAGGTACCGGGTTCATCTCACTAGGGAGTGCCAGACAGTGGGCGCAGGCCAGTGTGTGTGCGCACCGTGCGCGAGCCGAAGCAGGGCGAGGCATTGCCTCACCTGGGAAGCGCAAGGGGTCAGGGAGTTCCCTTTCCGAGTCAAAGAAAGGGGTGATGGACGCACCTGGAAAATCGGGTCACTCCCACCCGAATATTGCGCTTTTCAGACCGGCTTAAGAAACGGCGCACCACGAGACTATATCCCACACCTGGCTCAGAGGGTCCTACGCCCACGGAATCTCGCTGATTGCTAGCACAGCAGTCTGAGATCAAACTGCAAGGCGGCAACGAGGCTGGGGGAGGGGCGCCCGCCATTGCCCAGGCTTGCTTAGGTAAACAAAGCAGCCGGGAAGCTCGAACTGGGTGGAGCCCACCACAGCTCAAGGAGGCCTGCCTGCCTCTGTAGGCTCCACCTCTGGGGGCAGGGCACAGACAAACAAAAAGGCAGCAGTAACCTCTGCAGACTTAAGTGTCCCTGTCTGACAGCTTTGAAGAGAGCAGTGGTTCTCCCAGCACGCAGCTGGAGATCTGAGAACGGGCAGACTGCCTCCTCAAGTGGGTCCCTGACCCCTGACCCCCGAGCAGCCTAACTGGGAGGCACCCCCCAGCAGGGGCACACTGACACCTCACACGGCAGGGTATTCCAACAGACCTGCAGCTGAGGGTCCTGTCTGTTAGAAGGAAAACTAACAACCAGAAAGGACATCTACACCGAAAACCCATCTGTACATCACCATCATCAAAGACCAAAAGTAGATAAAACCACAAAGATGGGGAAAAAACAGAACAGAAAAACTGGAAACTCTAAAACGCAGAGCGCCTCTCCTCCTCCAAAGGAACGCAGTTCCTCACCAGCAACAGAACAAAGCTGGATGGAGAATGATTTTGACGAGCTGAGAGAAGAAGGCTTCAGACGATCAAATTACTCTGAGCTACGGGAGGACATTCAAACCAAAGGCAAAGAAGTTGAAAACTTTGAAAAAAATTTAGAAGAATGTATAACTAGAATAACCAATACAGAGAAGTGCTTAAAGGAGCTGATGGAGCTGAAAACCAAGGCTCGAGAACTACGTGAAGAATGCAGAAGCCTCAGGAGCCGATGCGATCAACTGGAAGAAAGGGTATCAGCAATGGAAGATGAAATGAATGAAATGAAGCGAGAAGGGAAGTTTAGAGAAAAAAGAATAAAAAGAAATGAGCAAAGCCTCCAAGAAATATGGGACTATGTGAAAAGACCAAATCTACGTCTGATTGGTGTACCTGAAAGTGATGTGGAGAATGGAACCAAGTTGGAAAACACTCTGCAGGATATTATCCAGGAGAACTTCCCCAATCTAGCAAGGCAGGCCAACGTTCAGATTCAGGAAATACAGAGAACGCCACAAAGATACTCCTCGAGAAGAGCAACTCCAAGACACATAATTGTCAGATTCACCAAAGTTGAAATGAAGGAAAAAATGTTAAGGGCAGCCAGAGAGAAAGGTCGGGTTACCCTCAAAGGGAAGCCTATCAGACTAACAGCAGATCTCTCGGCAGAAACCCTACAAGCCAGAAGAGAGTGGGGGCCAATATTCAACATTCTTAAAGAAAAGAATTTTCAACCCAGAATTTCATTTCCAGCCAAACTAAGCTTCATAAGTGAAGGAGAAAGAAAATACTTTACAGACAAGCAAATGCTGAGAGATTTTGTCACCACCAGGCCTACCCTAAAAGAGCTCCTGAAGGAAGCACTAAACATGGAAAGGAACAACCGGTACCAGCCGCTGCAAAATCATGCCAAAATGTAAAGACCATCGAGACTAGGAAGAAACTGCATCAACTAATGAGCAAAATCACCAGCTAACATCATAATGACAGGATCAAATTCACACATAACA
ATATTAACTTTAAATATAAATGGACTAAATTCTGCAATTAAAAGACACAGACTGGCAAGTTGGATAAAGAGTCAAGACCCATCAGTGTGCTGTATTCAGGAAACCCATCTCATGTGCAGAGACACACATAGGCTCAAAATAAAAGGATGGAGGAAGATCTACCAAGCAAATGGAAAACAAAAAAGGCAGGGGTTGCAATCCTAGTCTCTGATAAAACAGACTTTAAACCAACAAAGATCAAAAGAGACAAAGAAGGCCATTACATAATGGTAAAGGGATCAATTCAACAAGAGGAGCTAACTATCCTAAATATTTATGCACCCAATACAGGAGCACCCAGATTCATAAAGCAAGTCCTGAGTGACCTACAAAGAGACTTAGACTCCCACACATTAATAATGGGAGACTTTAACACCCCACTGTCAATATTAGACAGATCAACGAGACAGAAAGTCAACAAGGATACCCAGGAATTGAACTCAGCTCTGCACCAAGCAGACCTAATAGACATCTACAGAACTCTCCACCCCAAATCAACAGAATATACATTTTTTTCAGCACCACACCACACCTATTCCAAAATCGACCACATAGTTGGAAGTAAAGCTCTCCTCAGCAAATGTAAAAGAACAGAAATTATAACAAACTATCTCTCAGACCACAGTGCAATCAAACTAGAACTCAGGATTAAGAATCTCACTCAAAGCCGCTCAACTACATGGAAACTGAACAACCTGCTCCTGAATGACTACTGGGTACATAACGAAATGAAGGCAGAAATAAAGATGTTCTTTGAAACCAACGAGAACAAAGACACCACATACCAGAATCTCTGGGACGCATTCAAAGCAGTGTGTAGAGGGAAATTTATAGCACTAAATGCCTACAAGAGAAAGCAGGAAAGATCCAAAATTGACACCCTAACATCACAATTAAAAGAACTAGAAAAGCAAGAGCAAACACATTCAAAAGCTAGCAGAAGGCAAGAAATAACTAAAATCAGAGCAGAACTGAAGGAAATAGAGACACAAAAAACCCTTCAAAAAATCAATGAATCCAGGAGCTGGTTTTTTGAAAGGATCAACAAAATTGATAGACCGCTAGCAAGACTAATAAAGAAAAAAAGAGAGAAGAATCAAATAGACACAATAAAAAATGATAAAGGGGATATCACCACCGATCCCACAGAAATACAAACTACCATCAGAGAATACTACAAACACCTCTACGCAAATAAACTAGAAAATCTAGAAGAAATGGATACATTCCTCGACACA", Output: "TACACCCTCCCAAGACTAAACCAGGAAGAAGTTGAATCCCTGAATAGACCAATAACAGGCTCTGAAATTGAGGCAATAATTAATAGCCTACCAACCAAAAAAAGTCCAGGACCAGATGGATTCACAGCCGAATTCTACCAGAGGTACAAGGAGGAGCTGGTACCATTCCTTCTGAAACTATTCCAATCAATAGAAAAAGAGGGAATCCTCCCTAACTCATTTTATGAGGCCAGCATCATCCTGATACCAAAGCCTGGCAGAGACACAACAAAAAAAGAGAATTTTAGACCAATATCCCTGATGAACATGGATGCAAAAATCCTCAATAAAATACTAGCAAACCGAATACAACAGCACATCAAAAAGATCATCCACCATGATCAAGTGGGCTTCATCCCTGGGATGCAAGGCTGGTTCAATATACGCAAATCAATAAATGTAATCCAGCATATAAACAGAACCAAAGACAAAAACCACATGATTATCTCAATAGATGCAGA", Score: -0.02793017588555813
Initializing inference params with max_seqlen=4028
Prompt: "GACACCATCGAATGGCGCAAAACCTTTCGCGGTATGGCATGATAGCGCCCGGAAGAGAGTCAATTCAGGGTGGTGAATGTGAAACCAGTAACGTTATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCCGCGTGGTGAACCAGGCCAGCCACGTTTCTGCGAAAACGCGGGAAAAAGTGGAAGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACAACAACTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCTGGCCCTGCACGCGCCGTCGCAAATTGTCGCGGCGATTAAATCTCGCGCCGATCAACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAACGAAGCGGCGTCGAAGCCTGTAAAGCGGCGGTGCACAATCTTCTCGCGCAACGCGTCAGTGGGCTGATCATTAACTATCCGCTGGATGACCAGGATGCCATTGCTGTGGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGACCAGACACCCATCAACAGTATTATTTTCTCCCATGAAGACGGTACGCGACTGGGCGTGGAGCATCTGGTCGCATTGGGTCACCAGCAAATCGCGCTGTTAGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGCTGGCATAAATATCTCACTCGCAATCAAATTCAGCCGATAGCGGAACGGGAAGGCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAATGAGGGCATCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGATGGCGCTGGGCGCAATGCGCGCCATTACCGAGTCCGGGCTGCGCGTTGGTGCGGATATCTCGGTAGTGGGATACGACGATACCGAAGACAGCTCATGTTATATCCCGCCGTCAACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACCAGCGTGGACCGCTTGCTGCAACTCTCTCAGGGCCAGGCGGTGAAGGGCAATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGGATTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGCTGGAGTGCGATCTTCCTGAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAGATGCACGGTTACGATGCGCCCATCTACACCAACGTAACCTATCCCATTACGGTCAATCCGCCGTTTGTTCCCACGGAGAATCCGACGGGTTGTTACTCGCTCACATTTAATGTTGATGAAAGCTGGCTACAGGAAGGCCAGACGCGAATTATTTTTGATGGCGTTAACTCGGCGTTTCATCTGTGGTGCAACGGGCGCTGGGTCGGTTACGGCCAGGACAGTCGTTTGCCGTCTGAATTTGACCTGAGCGCATTTTTACGCGCCGGAGAAAACCGCCTCGCGGTGATGGTGCTGCGTTGGAGTGACGGCAGTTATCTGGAAGATCAGGATATGTGGCGGATGAGCGGCATTTTCCGTGACGTCTCGTTGCTGCATAAACCGACTACACAAATCAGCGATTTCCATGTTGCCACTCGCTTTAATGATGATTTCAGCCGCGCTGTACTGGAGGCTGAAGTTCAGATGTGCGGCGAGTTGCGTGACTACCTACGGGTAACAGTTTCTTTATGGCAGGGTGAAACGCAGGTCGCCAGCGGCACCG
CGCCTTTCGGCGGTGAAATTATCGATGAGCGTGGTGGTTATGCCGATCGCGTCACACTACGTCTGAACGTCGAAAACCCGAAACTGTGGAGCGCCGAAATCCCGAATCTCTATCGTGCGGTGGTTGAACTGCACACCGCCGACGGCACGCTGATTGAAGCAGAAGCCTGCGATGTCGGTTTCCGCGAGGTGCGGATTGAAAATGGTCTGCTGCTGCTGAACGGCAAGCCGTTGCTGATTCGAGGCGTTAACCGTCACGAGCATCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGTGCAGGATATCCTGCTGATGAAGCAGAACAACTTTAACGCCGTGCGCTGTTCGCATTATCCGAACCATCCGCTGTGGTACACGCTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCCAATATTGAAACCCACGGCATGGTGCCAATGAATCGTCTGACCGATGATCCGCGCTGGCTACCGGCGATGAGCGAACGCGTAACGCGAATGGTGCAGCGCGATCGTAATCACCCGAGTGTGATCATCTGGTCGCTGGGGAATGAATCAGGCCACGGCGCTAATCACGACGCGCTGTATCGCTGGATCAAATCTGTCGATCCTTCCCGCCCGGTGCAGTATGAAGGCGGCGGAGCCGACACCACGGCCACCGATATTATTTGCCCGATGTACGCGCGCGTGGATGAAGACCAGCCCTTCCCGGCTGTGCCGAAATGGTCCATCAAAAAATGGCTTTCGCTACCTGGAGAGACGCGCCCGCTGATCCTTTGCGAATACGCCCACGCGATGGGTAACAGTCTTGGCGGTTTCGCTAAATACTGGCAGGCGTTTCGTCAGTATCCCCGTTTACAGGGCGGCTTCGTCTGGGACTGGGTGGATCAGTCGCTGATTAAATATGATGAAAACGGCAACCCGTGGTCGGCTTACGGCGGTGATTTTGGCGATACGCCGAACGATCGCCAGTTCTGTATGAACGGTCTGGTCTTTGCCGACCGCACGCCGCATCCAGCGCTGACGGAAGCAAAACACCAGCAGCAGTTTTTCCAGTTCCGTTTATCCGGGCAAACCATCGAAGTGACCAGCGAATACCTGTTCCGTCATAGCGATAACGAGCTCCTGCACTGGATGGTGGCGCTGGATGGTAAGCCGCTGGCAAGCGGTGAAGTGCCTCTGGATGTCGCTCCACAAGGTAAACAGTTGATTGAACTGCCTGAACTACCGCAGCCGGAGAGCGCCGGGCAACTCTGGCTCACAGTACGCGTAGTGCAACCGAACGCGACCGCATGGTCAGAAGCCGGGCACATCAGCGCCTGGCAGCAGTGGCGTCTGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCCCGCATCTGACCACCAGCGAAATGGATTTTTGCATCGAGCTGGGTAATAAGCGTTGGCAATTTAACCGCCAGTCAGGCTTTCTTTCACAGATGTGGATTGGCGATAAAAAACAACTGCTGACGCCGCTGCGCGATCAGTT", Output: 
"TATTCGCGCGCCGACCGATAACGATATCGGCATTAGCGAAGCGCAGCGCATCAACCTGTGGCGCGCGGCGGGTAACTATGCGCTGCGCAATCACATCGAAACCGTGGCGGCGGAAGAGGGCGATGATGTGCTGGTGACGGTGACGACGCGCTTCGCGCCGCCGTTCCTGAATTTCAATATGACCACCACCTGGACGGTGTATGCCGATGGCGAAATCACCGTCAACACCACGGTGACGCCGCTGGCGGATCTGCCGCCGCTGCCGCGCGTCGGCATGACGCTGCGCCTGCCGGAAGCGTTTAACCAGGTGGAATGGTTCGGCCGCGGCCCGCACGAAAACTACTGCGATCGCCGCCACGCCGCGCAGGTGGGCGTGTATCAGAGCACGGTGGCGGATATGTACGAGCCGTATGTGCGCCCGCAGGAGAACGGCAACCGCACCGATGTGCGCTGGGTGACGCTGACCAACGCCGAAGGCTTTGGCCTGCGCGTGGTGGGCG", Score: -0.38358744978904724
Initializing inference params with max_seqlen=3580
Prompt: "GTTAATGTAGCTTAAAACAAAAGCAAGGTACTGAAAATACCTAGACGAGTATATCCAACTCCATAAACAACAAAGGTTTGGTCCCGGCCTTCTTATTGGTTACTAGGAAACTTATACATGCAAGTATCCGCCCGCCAGTGAATACGCCTTCTAAATCATCACTGATCAAAGAGAGCTGGCATCAAGCACACACCCCAAGTGTAGCTCATGACGTCTCGCCTAGCCACACCCCCACGGGAAACAGCAGTAGTAAATATTTAGCAATTAACAAAAGTTAGACTAAGTTATCCTAATAAAGGACTGGTCAATTTCGTGCCAGCAACCGCGGCCATACGATTAGTCCAAATTAATAAGCATACGGCGTAAAGCGTATTAGAAGAATTAAAAAAATAAAGTTAAATCTTATACTAGCTGTTTAAAGCTCAAGATAAGACATAAATAGCCTACGAAAGTGACTTTAATAATCCTAAACATACGATAGCTAGGGTACAAACTGAGATTAGATACCTCACTATGCCTAGCCCTAAACTTTGATAGCTACCTTTACAAAGCTATCCGCCAGAGAACTACTAGCCAGAGCTTAAAACTTAAAGGACTTGGCGGTGCTTTATATCCACCTAGGGGAGCCTGTCTCGTAACCGATGAACCCCGATACACCTTACCGTCACTTGCTAATTCAGTCCATATACCACCATCTTCAGCAAACCCCTATAGGGCACAAAAGTGAGCTTAATCATAACCCATGAAAAAGTTAGGCCGAGGTGTCGCCTACGTGACGGTCAAAGATGGGCTACATTTTCTATTATAGAATAGACAAACGGATACCACTCTGAAATGGGTGGTTGAAGGCGGATTTAGTAGTAAACTAAGAATAGAGAGCTTAATTGAACAAGGCCATGAAGCGCGTACACACCGCCCGTCACTCTCCTCAAGTACCTCCACATCAAACAATCATATTACAGATTTAAACAAATACAAGAGGAGACAAGTCGTAACAAGGTAAGCGTACTGGAAAGTGTGCTTGGGTAACTCAAAGTGTAGCTTAACAAAAAGCATCTGGCTTACACCTAGAAGACCTCATTCACAATGATCACTTTGAACTAAATCTAGCCCTACCAACCTTACACCCAACTCTCACACTACATTAAATTAAAACATTCATTTATCAAAAAGTATAGGAGATAGAAATTTCACTAAGGCGCAATAGAGATAGTACCGCAAGGGAATGATGAAAGATAATTTAATAGTAAAAAATAGCAAGGATTAACCCCTTTACCTTTTGCATAATGAATTAACTAGAAAAATCTGACAAAGAGAACTACAGCCAGAAACCCCGAAATCAGACGAGCTATCTGATAGTAATCCCCAGGATCAATTCATCTATGTGGCAAAATAGTGAAAAAACTTACAGATAGAGGTGAAATACCAATCGAGCCTGATGATAGCTGGTTGTCCAGAAATAGAATTTCAGTTCTACCTAAAACTTACCACAAAAACAAAATAATTCCAATGTAAGTTTTAGAGATATTCAAAAGGGGTACAGCTCTTTTGACCAAGGATACAACCTTGATTAGCGAGTAAATTCACCATTAATTTCATAGTTGGCTTGGAAGCAGCCATCAATTAAGAAAGCGTTAAAGCTCAACAACCAACCAAACTAAAAAATCCCAAGAATTAATTAATGATCTCCTAAACATAATACTGGACTAATCTATATAAATAGAAGAAATAATGTTAGTATAAGTAATAAGAAGTATTTCTCCCTGCATAAGCTTATATCAGATCGGATGCCCACTGATAGTTAACAATCAAATAATTAAATACAAAAATAAAACCTTTATTACACCAATTGTTAACCCAACACAGGCATGCTTAAGGGAAAGATTAAAAGAAGGAAAAGGAACTCGGCAAACATAAACCCCGCCTGTTTACCAAAAACATCACCTCGAGCATTACTAGTATTCGAGGCACTGCCTGCCCAGTGACCAAGT
GTTAAACGGCCGCGGTACTCTGACCGTGCAAAGGTAGCATAATCATTTGTTCCTTAATTAGGGACTTGTATGAACGGCCACACGAGGGTTTAACTGTCTCTTTCCTCTAATCAATGAAATTGACCTTCTCGTGAAGAGGCGAGAATAAACATATAAGACGAGAAGACCCTATGGAGCTTAAATTAACTAATTTAATTGCTATCCTATAAATCTACAAGATACAACTAAACAGCATAATAAATTAACAATTTTGGTTGGGGTGACCTCGGAGAAGAAAAAAACCTCCGAACGATATTATAATTCAGACTTTACAAGTCAAGATTCACTAATCGCTTATTGACCCAATACTTGATCAACGGAACAAGTTACCCTAGGGATAACAGCGCAATCCTACTCTAGAGTCCCTATCGACAGCAGGGTTTACGACCTCGATGTTGGATCAGGACATCCTAATGGTGCAGCCGCTATTAAGGGTTCGTTTGTTCAACGATTAAAGTCCTACGTGATCTGAGTTCAGACCGGAGCAATCCAGGTCGGTTTCTATCTATAGTTTATTTATTCCAGTACGAAAGGACAGAAAAAATGAGGCCAATCTTACCAAGACGCCTTCAGCTAAATTTATGAATAAATCTCAATCTAGATAAGCTAAACCACCCAATCCAAGAACAGGATTTGTTAAGATAGCAAAAATTGGTTACTGCATAAAACTTAAGCTTTTACTTACGGAGGTTCAACTCCTCTTCTTAACAATGTTCTTGATTAATGTCCTAACAGTAACCTTGCCTATCCTTCTAGCAGTAGCCTTCCTCACCTTAGTTGAACGAAAGGCCTTAGGCTACATACAACTTCGTAAAGGCCCCAATGTAGTAGGACCCTACGGTCTTCTTCAACCTATCGCAGATGCAATCAAGCTATTTACCAAAGAACCCGTCTATCCACAAACCTCCTCAAAATTCCTATTTACCATTGCCCCAATTCTAGCCCTAACCTTAGCCCTAACTGTATGAGCTCCTCTTCCAATACCATATCCCCTAATTAACTTAAATCTAAGCCTATTATTTATTCTCGCAATATCAAGTCTGATAGTTT", Output: "ACTCAATCCTATGATCAGGATGAGCATCAAACTCAAAATACTCACTAATTGGAGCCCTACGAGCAGTAGCCCAAACAATCTCATATGAAGTAACCCTAGCCATTATTCTACTATCAATTATCCTAATAAATGGATCATTCACCCTATCAACACTAATTATTACCCAAGAACAAATATGACTAATTTTCCCAGCATGACCACTAGCAATAATATGATTTATCTCAACACTAGCAGAAACAAACCGAGCCCCATTTGACCTAACAGAAGGAGAATCAGAACTAGTATCAGGATTTAATGTAGAATATGCAGCAGGACCATTCGCCCTATTCTTCATAGCAGAATACGCAAACATCATCATAATAAACATCCTAACAACAATCCTATTCTTAGGAGCATTCCACAACCCAATAATACCAGAACTATACACAATCAACTTCACAATTAAAACCCTACTACTAACAACATCATTCCTATGAATTCGAGCATCATACCCACGATTC", Score: -0.20910809934139252
Initializing inference params with max_seqlen=4308
Prompt: "GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTTCGTCTGGGGGGTATGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTCGCAGTATCTGTCTTTGATTCCTGCCTCATCCTATTATTTATCGCACCTACGTTCAATATTACAGGCGAACATACTTACTAAAGTGTGTTAATTAATTAATGCTTGTAGGACATAATAATAACAATTGAATGTCTGCACAGCCACTTTCCACACAGACATCATAACAAAAAATTTCCACCAAACCCCCCCTCCCCCGCTTCTGGCCACAGCACTTAAACACATCTCTGCCAAACCCCAAAAACAAAGAACCCTAACACCAGCCTAACCAGATTTCAAATTTTATCTTTTGGCGGTATGCACTTTTAACAGTCACCCCCCAACTAACACATTATTTTCCCCTCCCACTCCCATACTACTAATCTCATCAATACAACCCCCGCCCATCCTACCCAGCACACACACACCGCTGCTAACCCCATACCCCGAACCAACCAAACCCCAAAGACACCCCCCACAGTTTATGTAGCTTACCTCCTCAAAGCAATACACTGAAAATGTTTAGACGGGCTCACATCACCCCATAAACAAATAGGTTTGGTCCTAGCCTTTCTATTAGCTCTTAGTAAGATTACACATGCAAGCATCCCCGTTCCAGTGAGTTCACCCTCTAAATCACCACGATCAAAAGGAACAAGCATCAAGCACGCAGCAATGCAGCTCAAAACGCTTAGCCTAGCCACACCCCCACGGGAAACAGCAGTGATTAACCTTTAGCAATAAACGAAAGTTTAACTAAGCTATACTAACCCCAGGGTTGGTCAATTTCGTGCCAGCCACCGCGGTCACACGATTAACCCAAGTCAATAGAAGCCGGCGTAAAGAGTGTTTTAGATCACCCCCTCCCCAATAAAGCTAAAACTCACCTGAGTTGTAAAAAACTCCAGTTGACACAAAATAGACTACGAAAGTGGCTTTAACATATCTGAACACACAATAGCTAAGACCCAAACTGGGATTAGATACCCCACTATGCTTAGCCCTAAACCTCAACAGTTAAATCAACAAAACTGCTCGCCAGAACACTACGAGCCACAGCTTAAAACTCAAAGGACCTGGCGGTGCTTCATATCCCTCTAGAGGAGCCTGTTCTGTAATCGATAAACCCCGATCAACCTCACCACCTCTTGCTCAGCCTATATACCGCCATCTTCAGCAAACCCTGATGAAGGCTACAAAGTAAGCGCAAGTACCCACGTAAAGACGTTAGGTCAAGGTGTAGCCCATGAGGTGGCAAGAAATGGGCTACATTTTCTACCCCAGAAAACTACGATAGCCCTTATGAAACTTAAGGGTCGAAGGTGGATTTAGCAGTAAACTAAGAGTAGAGTGCTTAGTTGAACAGGGCCCTGAAGCGCGTACACACCGCCCGTCACCCTCCTCAAGTATACTTCAAAGGACATTTAACTAAAACCCCTACGCATTTATATAGAGGAGACAAGTCGTAACATGGTAAGTGTACTGGAAAGTGCACTTGGACGAACCAGAGTGTAGCTTAACACAAAGCACCCAACTTACACTTAGGAGATTTCAACTTAACTTGACCGCTCTGAGCTAAACCTAGCCCCAAACCCACTCCACCTTACTACCAGACAACCTTAGCCAAACCATTTACCCAAATAAAGTATAGGCGATAGAAATTGAAACCTGGCGCAATAGATATAGTACCGCAAGGGAAAGATGAAAAATTATAACCAAGCATAATATAGCAAGGACTAACCCCTATACCTTCTGCATAATGAATTAACTAGAAATAACTTTGCAAGGAGAGCCAAAGCTAAGACCCCCGAAACCAGACGAGCTACCTAAGAACAGCTAAAAGAGCACACCCGTCTATGTAGCAAAATAGTGGGAAGATTTATAGGTAGAGGCGA
CAAACCTACCGAGCCTGGTGATAGCTGGTTGTCCAAGATAGAATCTTAGTTCAACTTTAAATTTGCCCACAGAACCCTCTAAATCCCCTTGTAAATTTAACTGTTAGTCCAAAGAGGAACAGCTCTTTGGACACTAGGAAAAAACCTTGTAGAGAGAGTAAAAAATTTAACACCCATAGTAGGCCTAAAAGCAGCCACCAATTAAGAAAGCGTTCAAGCTCAACACCCACTACCTAAAAAATCCCAAACATATAACTGAACTCCTCACACCCAATTGGACCAATCTATCACCCTATAGAAGAACTAATGTTAGTATAAGTAACATGAAAACATTCTCCTCCGCATAAGCCTGCGTCAGATTAAAACACTGAACTGACAATTAACAGCCCAATATCTACAATCAACCAACAAGTCATTATTACCCTCACTGTCAACCCAACACAGGCATGCTCATAAGGAAAGGTTAAAAAAAGTAAAAGGAACTCGGCAAATCTTACCCCGCCTGTTTACCAAAAACATCACCTCTAGCATCACCAGTATTAGAGGCACCGCCTGCCCAGTGACACATGTTTAACGGCCGCGGTACCCTAACCGTGCAAAGGTAGCATAATCACTTGTTCCTTAAATAGGGACCTGTATGAATGGCTCCACGAGGGTTCAGCTGTCTCTTACTTTTAACCAGTGAAATTGACCTGCCCGTGAAGAGGCGGGCATAACACAGCAAGACGAGAAGACCCTATGGAGCTTTAATTTATTAATGCAAACAGTACCTAACAAACCCACAGGTCCTAAACTACCAAACCTGCATTAAAAATTTCGGTTGGGGCGACCTCGGAGCAGAACCCAACCTCCGAGCAGTACATGCTAAGACTTCACCAGTCAAAGCGAACTACTATACTCAATTGATCCAATAACTTGACCAACGGAACAAGTTACCCTAGGGATAACAGCGCAATCCTATTCTAGAGTCCATATCAACAATAGGGTTTACGACCTCGATGTTGGATCAGGACATCCCGATGGTGCAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAAAGTCCTACGTGATCTGAGTTCAGACCGGAGTAATCCAGGTCGGTTTCTATCTACNTTCAAATTCCTCCCTGTACGAAAGGACAAGAGAAATAAGGCCTACTTCACAAAGCGCCTTCCCCCGTAAATGATATCATCTCAACTTAGTATTATACCCACACCCACCCAAGAACAGGGTTTGTTAAGATGGCAGAGCCCGGTAATCGCATAAAACTTAAAACTTTACAGTCAGAGGTTCAATTCCTCTTCTTAACAACATACCCATGGCCAACCTCCTACTCCTCATTGTACCCATTCTAATCGCAATGGCATTCCTAATGCTTACCGAACGAAAAATTCTAGGCTATATACAACTACGCAAAGGCCCCAACGTTGTAGGCCCCTACGGGCTACTACAACCCTTCGCTGACGCCATAAAACTCTTCACCAAAGAGCCCCTAAAACCCGCCACATCTACCATCACCCTCTACATCACCGCCCCGACCTTAGCTCTCACCATCGCTCTTCTACTATGAACCCCCCTCCCCATACCCAACCCCCTGGTCAACCTCAACCTAGGCCTCCTATTTATTCTAGCCACCTCTAGCCTAGCCGTTTACTCAATCCTCTGATCAGGGTGAGCATCAAACTCAAACTACGCCCTGATCGGCGCACTGCGAGCAGTAGCCCAAACAATCTCATATGAAGTCACCCTAGCCATCATTCTACTATCAACATTACTAATAAGTGGCTCCTTTAACCTCTCCACCCTTATCACAA", Output: 
"CCCAAGAACACCTATGACTAATCTTCCCCTCATGACCCCTAGCCATAATATGATTTATCTCCACACTAGCAGAAACCAACCGAGCCCCATTCGACCTCACAGAAGGAGAATCAGAACTAGTCTCAGGCTTCAACGTAGAATACGCCGCAGGCCCATTCGCCCTATTCTTCCTAGCAGAATACGCCAACATCATACTAATAAACACACTAACAACCATCCTATTCCTAAACCCAAGCTTCCTAAACCCCCCACAAGAACTATTCCCAATCATCCTAGCCACAAAAACCCTACTACTATCCTCAGGCTTCCTATGAATCCGATCCTCATACCCACGATTCCGATACGACCAACTAATACACCTCCTATGAAAAAACTTCCTACCACTAACACTAGCACTATGCCTATGACACATCAGCATACCAATCTGCACAGCAGGCATCCCCCCTTACCTAAGGAAATGTGCCTGAACGCAAAGGACCACTATGATAAAGTGAACATAG", Score: -0.15997366607189178
Test Results:
% Matching Nucleotides: 68.05
Test Passed! Score matches expected 68.0%
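The final figure summarizes how closely the generated continuations match the expected sequences. As a rough illustration of this kind of metric, the sketch below computes a position-by-position percent identity between two sequences. It is an assumption about the general idea only: the actual test script may align the sequences or aggregate the results differently, and `percent_identity` is a hypothetical helper.

```python
def percent_identity(generated, target):
    """Percentage of positions where two sequences carry the same base.

    Only the overlapping prefix (the length of the shorter sequence) is
    compared; the result is 0.0 when either sequence is empty.
    """
    length = min(len(generated), len(target))
    if length == 0:
        return 0.0
    # zip() stops at the shorter sequence, matching `length` above.
    matches = sum(1 for a, b in zip(generated, target) if a == b)
    return 100.0 * matches / length


if __name__ == "__main__":
    # Six of the eight positions agree in this toy pair.
    print(percent_identity("ACGTACGT", "ACGTTCGA"))
```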
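Using evo2 on the cluster
# The model files are already downloaded on the cluster and can be used directly. To use model files stored under a different path, set the HF_HOME environment variable.
$ module load evo2/0.2.0-py3.11
# Test; available models: evo2_1b_base, evo2_7b, evo2_40b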
$ python -m evo2.test.test_evo2_generation --model_name evo2_1b_base
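The effect of `HF_HOME` on where model files are looked up can be sketched as follows. By default, downloaded models live under `~/.cache/huggingface/hub`; setting `HF_HOME` moves that location to `$HF_HOME/hub`. The function mirrors only this rule and ignores other overrides such as `HF_HUB_CACHE` and `XDG_CACHE_HOME`; `resolve_hub_cache` is an illustrative name, not a library function.

```python
import os
from pathlib import Path


def resolve_hub_cache(env=None):
    """Return the hub cache directory implied by HF_HOME, or the default."""
    env = os.environ if env is None else env
    hf_home = env.get("HF_HOME")
    if hf_home:
        return Path(hf_home).expanduser() / "hub"
    return Path.home() / ".cache" / "huggingface" / "hub"


if __name__ == "__main__":
    # Shows where models would be looked up in the current environment.
    print(resolve_hub_cache())
```

Running it before and after `module load` shows whether the module redirects the cache to a shared location.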