OA0
OA0 is a community for exploring AI
Community status
Registered members 1100
Topics 846
Models 3026
Skill packs 13874
Datasets 1047
Papers 331
Open-source projects 532
Model name | Vendor | Intelligence score | Coding ability | Agentic ability | Speed score
GPT-5.5 (xhigh) OpenAI
60.24
59.12
74.12
63.29
GPT-5.5 (high) OpenAI
58.87
58.53
71.95
62.09
Claude Opus 4.7 (Adaptive Reasoning, Max Effort) Anthropic
57.28
52.51
71.29
43.50
Gemini 3.1 Pro Preview Google
57.18
55.50
59.09
119.49
GPT-5.5 (medium) OpenAI
56.71
56.21
69.39
57.08
Kimi K2.6 Kimi
53.90
47.12
65.97
30.06
MiMo-V2.5-Pro Xiaomi
53.83
45.53
67.44
56.19
GPT-5.3 Codex (xhigh) OpenAI
53.56
53.10
60.54
72.85
Grok 4.3 xAI
53.20
41.03
65.89
78.35
Muse Spark Meta
52.15
47.47
61.99
Claude Opus 4.7 (Non-reasoning, High Effort) Anthropic
51.82
53.07
64.64
38.57
Qwen3.6 Max Preview Alibaba
51.81
44.92
64.83
36.37
Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) Anthropic
51.72
50.94
63.00
51.38
DeepSeek V4 Pro (Reasoning, Max Effort) DeepSeek
51.51
47.47
67.19
33.86
GLM-5.1 (Reasoning) Z AI
51.41
43.37
67.05
51.94
GPT-5.5 (low) OpenAI
50.78
52.06
59.69
56.85
Qwen3.6 Plus Alibaba
49.98
42.87
61.67
52.75
DeepSeek V4 Pro (Reasoning, High Effort) DeepSeek
49.79
43.25
66.65
33.24
GLM-5 (Reasoning) Z AI
49.77
44.18
63.14
63.52
MiniMax-M2.7 MiniMax
49.62
41.93
61.49
47.12
MiMo-V2.5 Xiaomi
49.03
42.13
65.53
92.55
GPT-5.4 mini (xhigh) OpenAI
48.90
51.48
58.88
155.83
GPT-5.4 (low) OpenAI
47.94
45.57
58.22
67.61
GLM-5-Turbo Z AI
46.76
36.77
66.13
DeepSeek V4 Flash (Reasoning, Max Effort) DeepSeek
46.52
38.71
61.28
72.49
Gemini 3 Flash Preview (Reasoning) Google
46.43
42.62
49.66
160.81
Qwen3.6 27B (Reasoning) Alibaba
45.82
36.50
62.85
61.72
Qwen3.5 397B A17B (Reasoning) Alibaba
45.05
41.28
55.83
52.84
MiMo-V2-Omni-0327 Xiaomi
44.93
36.89
58.63
116.21
DeepSeek V4 Flash (Reasoning, High Effort) DeepSeek
44.87
39.76
57.82
Claude Sonnet 4.6 (Non-reasoning, High Effort) Anthropic
44.38
46.43
61.62
42.37
GPT-5.4 nano (xhigh) OpenAI
43.98
43.91
47.60
144.86
GLM-5.1 (Non-reasoning) Z AI
43.82
35.77
66.04
42.37
Qwen3.6 35B A3B (Reasoning) Alibaba
43.49
35.15
58.34
193.59
MiMo-V2-Omni Xiaomi
43.40
35.46
58.56
118.97
Kimi K2.6 (Non-reasoning) Kimi
42.95
38.41
58.73
29.23
GLM 5V Turbo (Reasoning) Z AI
42.85
36.22
61.07
Claude Sonnet 4.6 (Non-reasoning, Low Effort) Anthropic
42.60
42.98
57.45
44.18
Hy3-preview (Reasoning) Tencent
41.85
36.46
55.67
80.66
Qwen3.5 122B A10B (Reasoning) Alibaba
41.60
34.71
53.00
145.09
MiMo-V2-Flash (Feb 2026) Xiaomi
41.46
33.48
48.76
133.76
Gemini 3 Pro Preview (low) Google
41.30
39.36
45.05
GPT-5.5 (Non-reasoning) OpenAI
40.94
48.61
50.24
58.51
GLM-5 (Non-reasoning) Z AI
40.57
39.03
60.27
56.82
Qwen3.5 397B A17B (Non-reasoning) Alibaba
40.10
37.43
53.32
53.56
DeepSeek V4 Pro (Non-reasoning) DeepSeek
39.27
38.36
63.27
33.09
Mistral Medium 3.5 Mistral
39.23
35.42
53.16
143.22
Gemma 4 31B (Reasoning) Google
39.18
38.71
40.94
35.42
Qwen3.5 Omni Plus Alibaba
38.63
27.64
52.83
53.33
Grok 4.1 Fast (Reasoning) xAI
38.61
30.90
49.31
77.52
Step 3.5 Flash 2603 StepFun
38.47
34.56
48.23
180.77
o3 OpenAI
38.37
38.40
36.09
71.84
GPT-5.4 nano (medium) OpenAI
38.11
35.03
41.64
143.78
GPT-5.4 mini (medium) OpenAI
37.73
37.46
40.27
158.68
Kimi K2.5 (Non-reasoning) Kimi
37.27
25.82
52.84
43.92
Qwen3.6 27B (Non-reasoning) Alibaba
37.14
26.56
60.86
61.66
Claude 4.5 Haiku (Reasoning) Anthropic
37.09
32.61
40.16
91.26
DeepSeek V4 Flash (Non-reasoning) DeepSeek
36.46
35.15
61.33
72.61
NVIDIA Nemotron 3 Super 120B A12B (Reasoning) NVIDIA
35.97
31.19
40.18
178.03
Qwen3.5 122B A10B (Non-reasoning) Alibaba
35.87
31.58
49.51
142.67
Nova 2.0 Pro Preview (medium) Amazon
35.71
30.40
46.95
143.32
MiMo-V2.5-Pro (Non-reasoning) Xiaomi
35.59
36.78
50.75
60.53
GPT-5.4 (Non-reasoning) OpenAI
35.39
40.95
39.14
59.29
Gemini 3 Flash Preview (Non-reasoning) Google
35.05
37.84
35.01
168.74
Gemini 2.5 Pro Google
34.63
31.95
32.68
125.20
Nova 2.0 Lite (high) Amazon
34.54
23.42
37.34
153.06
Hy3-preview (Non-reasoning) Tencent
33.66
34.33
46.70
81.47
Ling-2.6-1T InclusionAI
33.61
33.05
48.21
Doubao Seed Code ByteDance Seed
33.52
31.26
36.45
Gemini 3.1 Flash-Lite Preview Google
33.52
30.13
25.67
325.76
gpt-oss-120B (high) OpenAI
33.27
28.62
37.87
230.89
Mercury 2 Inception
32.82
30.56
39.74
687.39
Qwen3.5 9B (Reasoning) Alibaba
32.43
25.34
37.42
50.20
Gemma 4 31B (Non-reasoning) Google
32.29
33.90
39.39
K-EXAONE (Reasoning) LG AI Research
32.12
27.03
38.14
Grok 3 mini Reasoning (high) xAI
32.08
25.16
31.15
152.37
Nova 2.0 Pro Preview (low) Amazon
31.90
24.50
37.71
139.49
Trinity Large Thinking Arcee AI
31.87
27.19
42.61
117.20
Qwen3.6 35B A3B (Non-reasoning) Alibaba
31.53
17.60
52.53
186.88
Gemma 4 26B A4B (Reasoning) Google
31.21
22.44
32.15
Claude 4.5 Haiku (Non-reasoning) Anthropic
31.05
29.64
32.59
86.26
Qwen3.5 35B A3B (Non-reasoning) Alibaba
30.69
16.83
48.04
153.02
MiMo-V2-Flash (Non-reasoning) Xiaomi
30.35
25.81
47.34
128.40
EXAONE 4.5 33B LG AI Research
30.23
22.97
36.50
Nova 2.0 Lite (medium) Amazon
29.73
23.88
32.85
173.44
ERNIE 5.0 Thinking Preview Baidu
29.09
29.17
39.74
Grok 4.20 0309 v2 (Non-reasoning) xAI
28.99
22.03
38.31
91.07
Grok Code Fast 1 xAI
28.74
23.69
35.63
74.49
Nemotron Cascade 2 30B A3B NVIDIA
28.35
25.75
26.15
Qwen3 Coder Next Alibaba
28.28
22.89
42.10
151.23
Nova 2.0 Omni (medium) Amazon
28.02
15.11
38.20
Mistral Small 4 (Reasoning) Mistral
27.80
24.27
25.87
150.24
Qwen3.5 9B (Non-reasoning) Alibaba
27.33
21.35
41.10
Magistral Medium 1.2 Mistral
27.10
21.66
24.45
40.12
Gemma 4 26B A4B (Non-reasoning) Google
27.09
29.09
28.93
Qwen3.5 4B (Reasoning) Alibaba
27.08
17.49
32.46
205.10
DeepSeek R1 0528 (May '25) DeepSeek
27.07
24.03
20.84
Qwen3 Next 80B A3B (Reasoning) Alibaba
26.72
19.49
23.56
157.55
Ling 2.6 Flash InclusionAI
26.16
23.17
38.06
208.75
Qwen3.5 Omni Flash Alibaba
25.87
14.04
41.63
240.15
Solar Pro 3 Upstage
25.87
13.27
34.92
JT-MINI China Mobile
25.37
21.19
42.36
Nova 2.0 Lite (low) Amazon
24.59
13.64
27.77
167.01
gpt-oss-120B (low) OpenAI
24.47
15.53
28.04
241.61
gpt-oss-20B (high) OpenAI
24.47
18.53
27.60
249.55
GPT-5.4 nano (Non-Reasoning) OpenAI
24.36
27.89
25.92
146.34
NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) NVIDIA
24.27
18.97
19.14
187.69
LongCat Flash Lite LongCat
23.93
16.52
38.76
120.64
Grok 4.1 Fast (Non-reasoning) xAI
23.56
19.47
32.95
71.70
K-EXAONE (Non-reasoning) LG AI Research
23.41
13.53
31.22
GPT-5.4 mini (Non-Reasoning) OpenAI
23.28
25.32
25.01
144.30
Nova 2.0 Omni (low) Amazon
23.22
13.95
22.61
Mi:dm K 2.5 Pro Korea Telecom
23.06
12.57
36.76
Nova 2.0 Pro Preview (Non-reasoning) Amazon
23.06
20.49
23.88
113.00
Mistral Large 3 Mistral
22.80
22.68
21.70
48.24
Ring-1T InclusionAI
22.78
16.78
18.31
Qwen3.5 4B (Non-reasoning) Alibaba
22.60
13.67
36.26
206.42
INTELLECT-3 Prime Intellect
22.17
19.10
19.81
Devstral 2 Mistral
22.04
23.66
21.86
56.68
Solar Open 100B (Reasoning) Upstage
21.67
10.47
24.29
Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) Google
21.65
18.15
11.67
Nemotron 3 Nano Omni 30B A3B Reasoning NVIDIA
21.43
14.81
23.87
311.10
gpt-oss-20B (low) OpenAI
20.79
14.37
21.86
247.30
Qwen3 Next 80B A3B Instruct Alibaba
20.11
15.27
14.19
148.66
Devstral Small 2 Mistral
19.47
20.72
20.79
56.74
Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) Google
19.42
14.54
10.14
Motif-2-12.7B-Reasoning Motif Technologies
19.08
11.94
19.17
Ling-1T InclusionAI
19.04
18.80
11.84
Nova Premier Amazon
19.01
13.84
16.43
27.70
Gemma 4 E4B (Reasoning) Google
18.76
13.70
6.92
56.22
Llama Nemotron Super 49B v1.5 (Reasoning) NVIDIA
18.68
15.15
9.36
47.19
Mistral Small 4 (Non-reasoning) Mistral
18.62
16.45
18.55
143.36
Llama 3.3 Nemotron Super 49B v1 (Reasoning) NVIDIA
18.49
9.41
Llama 4 Maverick Meta
18.36
15.58
7.22
107.36
Magistral Small 1.2 Mistral
18.16
14.76
17.33
106.10
Sarvam 105B (high) Sarvam
18.16
9.81
24.69
100.02
Nova 2.0 Lite (Non-reasoning) Amazon
18.03
12.53
21.06
149.10
Llama 3.1 Instruct 405B Meta
17.38
14.50
6.34
29.27
EXAONE 4.0 32B (Reasoning) LG AI Research
16.68
13.98
9.53
Nova 2.0 Omni (Non-reasoning) Amazon
16.61
13.84
14.91
191.72
Qwen3.5 2B (Reasoning) Alibaba
16.29
3.45
23.00
Nanbeige4.1-3B Nanbeige
16.08
8.87
7.21
Ministral 3 14B Mistral
15.98
10.90
17.39
137.45
DeepSeek R1 Distill Llama 70B DeepSeek
15.95
11.43
43.52
Falcon-H1R-7B TII UAE
15.80
9.81
9.26
Ling-flash-2.0 InclusionAI
15.74
16.72
8.35
83.49
Qwen3 Omni 30B A3B (Reasoning) Alibaba
15.62
12.71
10.61
79.71
Step3 VL 10B StepFun
15.45
13.91
5.36
Gemma 4 E2B (Reasoning) Google
15.21
9.00
6.92
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) NVIDIA
15.02
13.09
3.80
40.57
ERNIE 4.5 300B A47B Baidu
14.96
14.53
0.00
21.79
Solar Pro 2 (Reasoning) Upstage
14.92
12.09
11.43
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) NVIDIA
14.89
11.75
7.12
131.71
Ministral 3 8B Mistral
14.84
9.97
16.66
156.66
Gemma 4 E4B (Non-reasoning) Google
14.83
6.36
8.67
53.59
NVIDIA Nemotron Nano 9B V2 (Reasoning) NVIDIA
14.76
8.34
9.43
121.38
Granite 4.1 30B IBM
14.69
10.12
14.04
NVIDIA Nemotron 3 Nano 4B NVIDIA
14.68
10.02
9.75
Qwen3.5 2B (Non-reasoning) Alibaba
14.67
4.92
27.19
357.48
Llama Nemotron Super 49B v1.5 (Non-reasoning) NVIDIA
14.59
10.47
8.38
45.43
Llama 3.3 Instruct 70B Meta
14.49
10.70
9.09
82.99
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) NVIDIA
14.43
Kimi Linear 48B A3B Instruct Kimi
14.41
14.21
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) NVIDIA
14.35
7.64
Ring-flash-2.0 InclusionAI
14.02
10.64
0.00
89.35
Solar Pro 2 (Non-reasoning) Upstage
13.59
11.29
12.71
Llama 4 Scout Meta
13.52
6.68
5.17
140.34
Command A Cohere
13.48
9.88
5.07
35.96
Llama 3.1 Nemotron Instruct 70B NVIDIA
13.44
10.78
7.70
37.09
NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) NVIDIA
13.17
15.76
8.48
87.32
NVIDIA Nemotron Nano 9B V2 (Non-reasoning) NVIDIA
13.16
7.49
7.80
125.60
Granite 4.1 8B IBM
12.38
7.25
10.67
89.50
Sarvam 30B (high) Sarvam
12.34
7.92
11.50
255.39
Gemma 4 E2B (Non-reasoning) Google
12.10
8.31
7.41
R1 1776 Perplexity
11.99
Llama 3.2 Instruct 90B (Vision) Meta
11.90
57.76
EXAONE 4.0 32B (Non-reasoning) LG AI Research
11.66
9.42
1.36
Ministral 3 3B Mistral
11.24
4.78
11.44
270.97
Jamba 1.7 Large AI21 Labs
10.88
7.77
4.48
61.27
Granite 4.0 H Small IBM
10.81
8.50
5.75
248.06
Qwen3 Omni 30B A3B Instruct Alibaba
10.68
7.22
5.46
95.12
Qwen3.5 0.8B (Reasoning) Alibaba
10.52
0.00
15.89
LFM2 24B A2B Liquid AI
10.49
3.63
3.70
86.52
Phi-4 Microsoft
10.41
11.21
0.00
35.86
Nova Micro Amazon
10.27
4.14
4.68
282.15
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) NVIDIA
10.09
5.86
6.43
168.19
Phi-4 Multimodal Instruct Microsoft
10.04
16.54
Qwen3.5 0.8B (Non-reasoning) Alibaba
9.91
0.96
21.73
358.99
Jamba Reasoning 3B AI21 Labs
9.60
2.47
5.26
Reka Flash 3 Reka AI
9.52
8.91
0.00
Ling-mini-2.0 InclusionAI
9.19
5.02
4.39
Llama 3.2 Instruct 11B (Vision) Meta
8.73
4.25
4.87
48.90
Granite 4.1 3B IBM
8.54
5.49
6.53
Phi-4 Mini Instruct Microsoft
8.39
3.59
2.73
43.59
Exaone 4.0 1.2B (Reasoning) LG AI Research
8.26
3.09
5.46
Exaone 4.0 1.2B (Non-reasoning) LG AI Research
8.11
2.47
6.82
LFM2.5-1.2B-Thinking Liquid AI
8.08
1.39
6.53
Jamba 1.7 Mini AI21 Labs
8.07
3.09
4.19
LFM2 2.6B Liquid AI
8.04
1.35
4.48
LFM2.5-1.2B-Instruct Liquid AI
8.04
0.77
3.61
Granite 4.0 H 1B IBM
7.99
2.74
6.53
Gemma 3 270M Google
7.71
0.00
3.02
Apertus 70B Instruct Swiss AI Initiative
7.70
1.89
4.29
Granite 4.0 Micro IBM
7.67
4.98
4.19
Granite 4.0 1B IBM
7.34
2.89
7.60
LFM2 8B A1B Liquid AI
7.03
2.28
3.51
LFM2.5-VL-1.6B Liquid AI
6.18
1.00
2.83
Granite 4.0 350M IBM
6.10
0.31
4.39
Apertus 8B Instruct Swiss AI Initiative
5.88
1.35
3.80
Granite 4.0 H 350M IBM
5.44
0.58
4.87
Tiny Aya Global Cohere
4.74
1.20
0.00
EXAONE 4.5 33B (Non-reasoning) LG AI Research
GPT-5.5 Pro (xhigh) OpenAI
Gemini 3 Deep Think Google
Mi:dm K 2.5 Pro Preview Korea Telecom
11.94
AI model leaderboard data source: Artificial Analysis - Comparison of AI Models
Methodology: /r/docs/methodology
OA0 - Omni AI 0, a community for exploring AI