mii
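HPC clusters typically use Module to manage system software and user applications. When the number of managed packages becomes large and complex, Module gets less friendly to use. mii is an automated search engine for Module that assists with finding and loading modules, making Module more convenient for users.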
Project repository: https://github.com/codeandkey/mii
Related article: Mii: An Automated Search Engine for Environment Modules
Installation¶
Build from source
git clone https://github.com/codeandkey/mii
cd mii
make
PREFIX=~/mii/ make install
# Cluster 142: after running the command below, open a new shell window
echo "source /public/home/software/opt/bio/software/mii/1.1.2/share/mii/init/bash" >> ~/.bashrc
# Cluster 180: after running the command below, open a new shell window
echo "source /share/software/app/arm/mii/1.1.2/share/mii/init/bash" >> ~/.bashrc
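Building the index¶
Before first use, the Module index has to be built; the initial build takes a little while. mii works by indexing the program names found in the $PATH directories of every Modulefile together with the name of the Module that provides them. The index is stored at ~/.mii/index.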
$ mii build
[15:33:20] INFO Finished analysis on 1435 modules
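To make the indexing idea concrete, here is a minimal in-memory sketch of a command-to-module lookup. The `index_entry` type, the `index_lookup` function, and the sample data are hypothetical names chosen for illustration; they are not mii's actual data structures or on-disk index format.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical index entry: one module plus the commands found on its PATH. */
typedef struct {
    const char* module;       /* module name, e.g. "BLAST+/2.15.0" */
    const char* const* bins;  /* NULL-terminated list of command names */
} index_entry;

/* Illustrative sample data only; a real index is built from the Modulefiles. */
static const char* const blast_bins[] = {"blastn", "blastp", "blastx", NULL};
static const char* const rmblast_bins[] = {"blastn", "rmblastn", NULL};
static const char* const beast_bins[] = {"beast", NULL};

static const index_entry sample_index[] = {
    {"BLAST+/2.15.0", blast_bins},
    {"RMBlast/2.10.0", rmblast_bins},
    {"beast/10.5.0", beast_bins},
};
#define SAMPLE_INDEX_LEN (sizeof sample_index / sizeof sample_index[0])

/* Store up to max names of modules that provide cmd in out;
 * returns the number of names stored. */
size_t index_lookup(const index_entry* index, size_t n, const char* cmd,
                    const char** out, size_t max) {
    size_t found = 0;
    for (size_t i = 0; i < n && found < max; ++i) {
        for (const char* const* b = index[i].bins; *b != NULL; ++b) {
            if (strcmp(*b, cmd) == 0) {
                out[found++] = index[i].module;
                break; /* one hit per module is enough */
            }
        }
    }
    return found;
}
```

With this sample data, looking up `blastn` yields two modules and looking up `beast` yields one. That mirrors the two cases in the usage examples: a command provided by several modules needs a selection, while a unique one can be loaded directly.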
Afterwards, whenever the system's modules are updated, run sync to refresh the index. It takes far less time than mii build.
$ mii sync
[15:55:02] INFO All modules up to date :)
Basic usage¶
If a program has only one version, run it directly; mii automatically loads the corresponding Module.
$ beast -help
[mii] loading beast/10.5.0 ...
Usage: beast [-verbose] [-warnings] [-strict] [-window] [-options] [-working] [-seed] [-prefix <PREFIX>] [-overwrite] [-errors <i>] [-threads <i>] [-fail_threads] [-ignore_versions] [-java] [-tests] [-threshold <r>] [-show_operators] [-adaptation_off] [-adaptation_target <r>] [-pattern_compression <off|unique|ambiguous_constant|ambiguous_all>] [-ambiguous_threshold <r>] [-beagle] [-beagle_info] [-beagle_auto] [-beagle_order <order>] [-beagle_instances <i>] [-beagle_multipartition <auto|on|off>] [-beagle_CPU] [-beagle_GPU] [-beagle_SSE] [-beagle_SSE_off] [-beagle_threading_off] [-beagle_threads <i>] [-beagle_cuda] [-beagle_opencl] [-beagle_single] [-beagle_double] [-beagle_async] [-beagle_low_memory] [-beagle_extra_buffer_count <buffer_count>] [-beagle_scaling <default|dynamic|delayed|always|none>] [-beagle_delay_scaling_off] [-beagle_rescale] [-mpi] [-particles <FOLDER>] [-mc3_chains <i>] [-mc3_delta <r>] [-mc3_temperatures] [-mc3_swap <i>] [-mc3_scheme <NAME>] [-load_state <FILENAME>] [-save_stem <FILENAME>] [-save_at] [-save_time <HH:mm:ss>] [-save_every] [-save_state <FILENAME>] [-full_checkpoint_precision] [-force_resume] [-citations_file <FILENAME>] [-citations_off] [-plugins_dir <FILENAME>] [-version] [-help] [<input-file-name>]
$ blast
[mii] blast not found! Similar commands: "bfast", "blastn", "blastx"
$ blastn
[mii] Please select a module to run blastn:
MODULE PARENT(S)
1 BLAST+/2.15.0
2 BLAST+/2.10.1
3 BLAST+/2.9.0
4 BLAST+/2.8.1
5 BLAST+/2.7.1
6 BLAST+/2.6.0
7 BLAST+/2.5.0-foss-2016b
8 RMBlast/2.10.0
9 RMBlast/2.9.0
10 RMBlast/2.2.28
Make a selection (1-10, q aborts) [1]: 1
[mii] loading BLAST+/2.15.0 ...
BLAST query/options error: Either a BLAST database or subject sequence(s) must be specified
Please refer to the BLAST+ user manual.
Search for the modules that provide the command blastn:
$ mii search blastn
Results for "blastn": (total 16)
MODULE COMMAND PARENT(S) RELEVANCE
BLAST+/2.15.0 blastn exact
BLAST+/2.10.1 blastn exact
BLAST+/2.9.0 blastn exact
BLAST+/2.8.1 blastn exact
BLAST+/2.7.1 blastn exact
BLAST+/2.6.0 blastn exact
BLAST+/2.5.0-foss-2016b blastn exact
RMBlast/2.10.0 blastn exact
RMBlast/2.9.0 blastn exact
RMBlast/2.2.28 blastn exact
BLAST+/2.15.0 blastx high
BLAST+/2.15.0 blastp high
BLAST+/2.15.0 tblastn high
BLAST+/2.10.1 blastx high
BLAST+/2.10.1 blastp high
BLAST+/2.10.1 tblastn high
$ mii list
Indexed modules (total 1435):
bitmap/1.0.9 /public/software/modulfile/bitmap/1.0.9
7zip/16.02 /public/software/modulfile/7zip/16.02
EMBOSS/6.5.7 /public/software/modulfile/EMBOSS/6.5.7
GenomeMapper/0.4.4 /public/software/modulfile/GenomeMapper/0.4.4
cudnn/9.4.0_cuda12 /public/software/modulfile/cudnn/9.4.0_cuda12
Stereopy/1.6.0-py3.8 /public/software/modulfile/Stereopy/1.6.0-py3.8
bsmap/2.90 /public/software/modulfile/bsmap/2.90
texlive/2020 /public/software/modulfile/texlive/2020
psm/psm/ptl_ips/ptl_fwd.h /public/software/modulfile/psm/psm/ptl_ips/ptl_fwd.h
nf-core/2.14.1 /public/software/modulfile/nf-core/2.14.1
GROMACS/2025.2-GPU /public/software/modulfile/GROMACS/2025.2-GPU
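Feature improvements¶
Substring matching¶
mii's fuzzy search ranks candidates by Damerau–Levenshtein distance, which handles substring matches poorly. For example, mii search gmx fails to find the desired gmx_mpi, as shown below.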
$ mii search gmx
Results for "gmx": (total 16)
MODULE COMMAND PARENT(S) RELEVANCE
arm/fcs-gx/0.5.4 gx high
x86/gcc/14.2.0 gm2 high
arm/amber/24-hmpi-py3.9 ttx medium
arm/amber/24-hmpi-py3.9 gwh medium
arm/annosine/2.0-py3.9 ttx medium
arm/cellphonedb/5.0.1-py3.11 ttx medium
arm/checkm-genome/1.2.3-py3.9 ttx medium
arm/deeptools/3.5.5 ttx medium
arm/drep/3.4.2-py3.9 ttx medium
arm/fep-spell-abfe/1.0.1-py3.9 ttx medium
arm/gcc/10.3.0 g++ medium
arm/gcc/10.3.0 gdc medium
arm/gcc/10.3.0 gcc medium
arm/gcc/10.3.0 go medium
arm/gcc/12.2.0 g++ medium
arm/gcc/12.2.0 gcc medium
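To see why gmx_mpi is missing, it helps to compute the edit distances directly. The sketch below implements the optimal string alignment variant of Damerau–Levenshtein distance. It is a standalone illustration, not mii's own mii_levenshtein_distance, and the name `osa_distance` is an assumption made for this example.

```c
#include <stdlib.h>
#include <string.h>

static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Optimal string alignment distance (restricted Damerau-Levenshtein):
 * insertion, deletion, substitution and adjacent transposition each cost 1.
 * Returns -1 if memory allocation fails. */
int osa_distance(const char* a, const char* b) {
    size_t la = strlen(a), lb = strlen(b);
    size_t w = lb + 1; /* row width of the (la+1) x (lb+1) table */
    int* d = malloc((la + 1) * w * sizeof *d);
    if (d == NULL) return -1;

    for (size_t i = 0; i <= la; ++i) d[i * w] = (int)i; /* delete all of a */
    for (size_t j = 0; j <= lb; ++j) d[j] = (int)j;     /* insert all of b */

    for (size_t i = 1; i <= la; ++i) {
        for (size_t j = 1; j <= lb; ++j) {
            int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            int best = min3(d[(i - 1) * w + j] + 1,           /* deletion */
                            d[i * w + (j - 1)] + 1,           /* insertion */
                            d[(i - 1) * w + (j - 1)] + cost); /* substitution */
            if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                int t = d[(i - 2) * w + (j - 2)] + 1;         /* transposition */
                if (t < best) best = t;
            }
            d[i * w + j] = best;
        }
    }

    int result = d[la * w + lb];
    free(d);
    return result;
}
```

Under this metric, gx and gm2 are each one edit away from gmx and ttx is two edits away, whereas gmx_mpi needs four insertions. A pure edit-distance ranking therefore places gmx_mpi behind unrelated short commands even though it contains the query verbatim.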
In modtable.c, add a short check to the mii_modtable_search_similar function: if the query is at least 3 characters long and occurs as a contiguous substring of a command name, that result's relevance is set to high.
for (int i = 0; i < MII_MODTABLE_HASHTABLE_WIDTH; ++i) {
    mii_modtable_entry* cur = p->buf[i];
    while (cur) {
        for (int j = 0; j < cur->num_bins; ++j) {
            int dist = mii_levenshtein_distance(cmd, cur->bins[j]);
            /* Added: if the query (at least 3 characters) is a contiguous
               substring of the command, use distance 1 so it ranks as `high` */
            if (dist != 0 && strlen(cmd) >= 3 && strstr(cur->bins[j], cmd) != NULL) {
                dist = 1;
            }
            if (dist < MII_MODTABLE_DISTANCE_THRESHOLD) {
                /* show different parents as different results */
                for (int k = 0; k < cur->num_parents; ++k) {
                    mii_search_result_add(res, cur->code, cur->bins[j], dist, cur->parents[k]);
                }
                /* if no parents, send null */
                if (cur->num_parents == 0) {
                    mii_search_result_add(res, cur->code, cur->bins[j], dist, NULL);
                }
            }
        }
        cur = cur->next;
    }
}
After the change, gmx_mpi appears in the results:
$ mii search gmx
Results for "gmx": (total 16)
MODULE COMMAND PARENT(S) RELEVANCE
arm/fcs-gx/0.5.4 gx high
arm/gromacs/2019.5-hmpi xplor2gmx.pl high
arm/gromacs/2019.5-hmpi gmx_mpi high
arm/gromacs/2019.5 xplor2gmx.pl high
arm/gromacs/2019.5 gmx_mpi high
arm/gromacs/2021.7-plumed xplor2gmx.pl high
arm/gromacs/2021.7-plumed gmx_mpi high
arm/gromacs/2025.2-hmpi xplor2gmx.pl high
arm/gromacs/2025.2-hmpi gmx_mpi high
x86/gcc/14.2.0 gm2 high
x86/gromacs/2025.2-openmpi xplor2gmx.pl high
x86/gromacs/2025.2-openmpi gmx_mpi high
arm/amber/24-hmpi-py3.9 ttx medium
arm/amber/24-hmpi-py3.9 gwh medium
arm/annosine/2.0-py3.9 ttx medium
arm/cellphonedb/5.0.1-py3.11 ttx medium
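Sorting rules¶
Except for exact, results within the same relevance level (top, for example) are ordered by command length: the shorter the matching command, the earlier it is shown.
To implement this, add the following code to _mii_search_result_compare in search_result.c.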
int _mii_search_result_compare(mii_search_result* res, int a, int b) {
    int diff;

    /* compare binary distances */
    diff = res->distances[a] - res->distances[b];
    if (diff > 0) return 1;
    if (diff < 0) return -1;

    /* Added: within the same level (e.g. high), shorter commands sort first */
    /* compare bins length (cast to int so the subtraction is signed) */
    diff = (int)strlen(res->bins[a]) - (int)strlen(res->bins[b]);
    if (diff != 0) return diff;

    /* compare priorities */
    diff = res->priorities[a] - res->priorities[b];
    if (diff < 0) return 1;
    if (diff > 0) return -1;

    /* compare parent alpha */
    diff = strcmp(res->parents[a], res->parents[b]);
    if (diff < 0) return 1;
    if (diff > 0) return -1;

    /* finally, compare code alpha + version */
    return _mii_search_result_compare_codes(res->codes[a], res->codes[b]);
}
$ mii search augus
Results for "augus": (total 16)
MODULE COMMAND PARENT(S) RELEVANCE
BRAKER/2.1.4 filter_augustus_gff.pl top
GETA/2.4.14 train_augustus.pl top
augustus/3.3.3 augustus top
augustus/3.3.3 augustus2browser.pl top
augustus/3.3.3 augustus2gbrowse.pl top
augustus/3.3.3 optimize_augustus.pl top
augustus/3.3.1 augustus top
augustus/3.3.1 augustus2browser.pl top
augustus/3.3.1 augustus2gbrowse.pl top
augustus/3.3.1 optimize_augustus.pl top
augustus/3.2.3 augustus top
augustus/3.2.3 augustus2browser.pl top
augustus/3.2.3 augustus2gbrowse.pl top
augustus/3.2.3 optimize_augustus.pl top
augustus/2.7 augustus top
augustus/2.7 augustus2browser.pl top
$ mii search augu
Results for "augu": (total 16)
MODULE COMMAND PARENT(S) RELEVANCE
augustus/3.3.3 augustus high
augustus/3.3.1 augustus high
augustus/3.2.3 augustus high
augustus/2.7 augustus high
GETA/2.4.14 train_augustus.pl high
maker/3.01.03 train_augustus.pl high
maker/3.01.02-beta train_augustus.pl high
maker/3.01.02-beta-MPICH2-test train_augustus.pl high
augustus/3.3.3 augustus2browser.pl high
augustus/3.3.3 augustus2gbrowse.pl high
augustus/3.3.1 augustus2browser.pl high
augustus/3.3.1 augustus2gbrowse.pl high
augustus/3.2.3 augustus2browser.pl high
augustus/3.2.3 augustus2gbrowse.pl high
augustus/2.7 augustus2browser.pl high
augustus/2.7 augustus2gbrowse.pl high
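The ordering rule in this section can also be tried in isolation. The following is a simplified sketch using the standard qsort with a two-field record; the `result` type and `sort_results` function are hypothetical, and the priority, parent, and module-code tie-breakers of the real comparator are replaced by a plain alphabetical fallback.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified search result: distance plus command name. */
typedef struct {
    int distance;     /* 0 = exact, larger = less relevant */
    const char* bin;  /* matched command name */
} result;

/* Order by distance first, then by command length (shorter first),
 * then alphabetically so the order is deterministic. */
static int compare_results(const void* pa, const void* pb) {
    const result* a = pa;
    const result* b = pb;

    if (a->distance != b->distance) return a->distance < b->distance ? -1 : 1;

    size_t la = strlen(a->bin), lb = strlen(b->bin);
    if (la != lb) return la < lb ? -1 : 1;

    return strcmp(a->bin, b->bin);
}

void sort_results(result* r, size_t n) {
    qsort(r, n, sizeof *r, compare_results);
}
```

Sorting three results that share one distance, augustus2browser.pl, augustus, and train_augustus.pl, yields augustus, then train_augustus.pl, then augustus2browser.pl. This matches the command order in the mii search augu output above.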
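Adding a relevance level¶
A new level, top, is added between the existing exact and high levels, giving the order exact, top, high, medium, low. top indicates that the query is at least 3 characters long and is a contiguous substring of the matched command.
The code changes are as follows.
Add the top level and its display color: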
const char* relevance_strings[] = {
    "exact",
    "top",    // added: the new top level
    "high",
    "medium",
    "low",
};
const char* relevance_colors[] = {
    "\033[1;36m", // exact: cyan
    "\033[1;35m", // added: color for top, magenta
    "\033[1;32m", // high: green
    "\033[1;33m", // medium: yellow
    "\033[1;31m", // low: red
};
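As a rough illustration of how such parallel tables can be used, the sketch below maps a distance to a label and wraps it in its ANSI color. It assumes the distance is used directly as the array index and that out-of-range values are clamped; both are assumptions for this example rather than confirmed mii behavior, and `relevance_label` and `format_relevance` are hypothetical helpers.

```c
#include <stddef.h>
#include <stdio.h>

static const char* const relevance_strings[] = {
    "exact", "top", "high", "medium", "low",
};
static const char* const relevance_colors[] = {
    "\033[1;36m", /* exact: cyan */
    "\033[1;35m", /* top: magenta */
    "\033[1;32m", /* high: green */
    "\033[1;33m", /* medium: yellow */
    "\033[1;31m", /* low: red */
};
#define RELEVANCE_LEVELS (sizeof relevance_strings / sizeof relevance_strings[0])

/* Assumption: distance indexes the tables; negative values clamp to the
 * first level and values past the end clamp to the last level. */
static size_t relevance_index(int dist) {
    if (dist < 0) return 0;
    if ((size_t)dist >= RELEVANCE_LEVELS) return RELEVANCE_LEVELS - 1;
    return (size_t)dist;
}

const char* relevance_label(int dist) {
    return relevance_strings[relevance_index(dist)];
}

/* Write the colored label followed by the ANSI reset sequence into buf;
 * returns what snprintf returns. */
int format_relevance(char* buf, size_t size, int dist) {
    size_t i = relevance_index(dist);
    return snprintf(buf, size, "%s%s\033[0m", relevance_colors[i], relevance_strings[i]);
}
```

Because the two arrays are indexed in parallel, inserting top at position 1 in both is enough to shift high, medium, and low down by one slot, which is why the distance values of non-substring matches must be shifted as well.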
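In modtable.c, the distance calculation in mii_modtable_search_similar works as follows:
- If dist != 0, the query length is >= 3, and the query is a contiguous substring of the bin, set dist = 1 (top).
- If dist != 0 and the substring condition is not met, increment dist (an original distance of 1 becomes 2, 2 becomes 3, and 3 becomes 4).
- An exact match (dist = 0) is left unchanged (exact).
Make sure that only results with dist < MII_MODTABLE_DISTANCE_THRESHOLD are added.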
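The three rules above can be sketched as a small standalone function. In the source the logic sits inline in mii_modtable_search_similar; the helper name `adjust_distance` is hypothetical and used here only so the rules can be exercised on their own.

```c
#include <string.h>

/* Map a raw edit distance to the adjusted distance used for ranking.
 * cmd is the query, bin is the candidate command name. */
int adjust_distance(int dist, const char* cmd, const char* bin) {
    /* exact match: leave unchanged */
    if (dist == 0) return 0;

    /* query of at least 3 characters that is a contiguous substring: top */
    if (strlen(cmd) >= 3 && strstr(bin, cmd) != NULL) return 1;

    /* everything else moves down one level to make room for top */
    return dist + 1;
}
```

For the query gmx, a raw distance of 4 to gmx_mpi becomes 1 (top), while the raw distance of 1 to gx becomes 2 (high). A two-character query such as gx never qualifies for top, even when it is a substring of the candidate.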
The result looks like this:
$ mii search gmx
Results for "gmx": (total 16)
MODULE COMMAND PARENT(S) RELEVANCE
GROMACS/2025.2-GPU gmx exact
GROMACS/2019.5-GPU gmx exact
GROMACS/2018.3-GPU gmx exact
GROMACS/2025.2-GPU xplor2gmx.pl top
GROMACS/2019.5-GPU gmx_mpi top
GROMACS/2019.5-GPU xplor2gmx.pl top
GROMACS/2019.5 gmx_mpi top
GROMACS/2019.5 xplor2gmx.pl top
GROMACS/2018.3-GPU xplor2gmx.pl top
gtxcat/2.1.0 gtx high
2for1separator/0.2 ttx medium
AlphaPulldown/2.0.4-py3.11 ttx medium
Anaconda2/4.0.0 gio medium
Anaconda2/4.0.0 qml medium
Anaconda3/2021.05 gio medium
Anaconda3/2021.05 qml medium