nemo-evaluator-sdk
maintained by zechenzhangAGI
MIT License
Evaluates LLMs across 100+ benchmarks from 18+ harnesses (including MMLU, HumanEval, GSM8K, safety, and VLM benchmarks) with multi-backend execution. Use it when you need scalable evaluation on local Docker, Slurm HPC, or cloud platforms. NeMo Evaluator is NVIDIA's enterprise-grade platform, with a container-first architecture for reproducible benchmarking.
Key Features
- Comprehensive skill evaluation and performance tracking
- Community-driven ratings and reviews
- Easy integration with Claude Code
- Regular updates and maintenance
Quick Start
TopRank Skills install zechenzhangAGI/nemo-evaluator
Skill Details
GitHub Stars
3.1k
GitHub Forks
259
Created
Jan 2026
Last Updated
4 months ago
tools
tools llm ai