grpo | Skill Performance & Reviews | TopRankSkills

grpo

maintained by atrawog

Stars: 0 · Forks: 0 · License: MIT

Group Relative Policy Optimization for reinforcement learning from human feedback. Covers GRPOTrainer, reward function design, policy optimization, and KL divergence constraints for stable RLHF training. Includes thinking-aware reward patterns.
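As a rough illustration of the ideas named above, here is a minimal sketch of a thinking-aware reward and the group-relative advantage step that gives GRPO its name. It is not taken from the skill itself: the function names, the `<think>` tag convention, and the 0/1 reward scale are all assumptions made for this example, and the population standard deviation is one possible normalization choice.

```python
import re
from statistics import mean, pstdev

# Assumed convention: reasoning goes inside <think>...</think>, followed by an answer.
THINK_PATTERN = re.compile(r"<think>(.+?)</think>\s*(.+)", re.DOTALL)


def thinking_format_reward(completions):
    """Hypothetical reward: 1.0 for a non-empty think block plus an answer, else 0.0."""
    rewards = []
    for text in completions:
        match = THINK_PATTERN.fullmatch(text.strip())
        has_reasoning = bool(match and match.group(1).strip())
        rewards.append(1.0 if has_reasoning else 0.0)
    return rewards


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions sampled for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so a completion
    is scored relative to its siblings rather than by a learned value function.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group of four sampled completions for a single prompt.
group = [
    "<think>2 + 2 is 4</think> 4",  # reasoning and answer
    "4",                            # no think block
    "<think> </think> 4",           # empty think block
    "<think>add them</think>",      # reasoning but no answer
]
rewards = thinking_format_reward(group)
advantages = group_relative_advantages(rewards)
```

Only the first completion earns a reward here, so it receives a positive advantage and the other three receive equal negative ones. In a real training run a reward function of this shape would typically be combined with a task-correctness reward, and the KL penalty against a reference policy would be handled by the trainer rather than by the reward code.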

Key Features

  • Comprehensive skill evaluation and performance tracking
  • Community-driven ratings and reviews
  • Easy integration with Claude Code
  • Regular updates and maintenance

Quick Start

TopRank Skills install atrawog/grpo

Skill Details

GitHub Stars 0
GitHub Forks 0
Created Jan 2026
Last Updated 5 months ago
tools tools llm ai
