TopRank Skills


data-profiler

Profile construction data to understand characteristics, distributions, quality metrics, and patterns. Essential for data quality assessment and ETL planning.

Stars: 0
Installs: 0
Status: ACTIVE
Visibility: PUBLIC

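Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the data-profiler skill. If it is already installed, install the data-profiler skill directly.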

Overview

Skill Key: datadrivenconstruction/data-profiler
Author: datadrivenconstruction
Source Repo: openclaw/skills
Version: -
Source Path: skills/datadrivenconstruction/data-profiler
Latest Commit SHA: 55dc124febd89e54f458b5143dd8a685f1a12e93

Extracted Content

SKILL.md excerpt

# Data Profiler for Construction

## Overview

Analyze construction data to understand its characteristics, distributions, quality, and patterns. Essential for data quality assessment, ETL planning, and identifying data issues before they impact projects.

## Business Case

Before using any construction data, you need to understand:
- What data types are present
- Distribution of values
- Missing data patterns
- Anomalies and outliers
- Referential integrity issues

This skill profiles the data to answer these questions and reports a quality score, per-column quality issues, and recommendations.
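As a rough illustration of the questions listed above, the following sketch computes null counts, uniqueness, basic numeric statistics, IQR-based outliers, and orphaned foreign keys. It is not the skill's implementation: it uses only the standard library rather than pandas, and the helper names `profile_column` and `find_orphans`, the 1.5 x IQR outlier rule, and the sample values are all assumptions made for this example.

```python
from statistics import mean, median, quantiles
from typing import Any, Dict, Iterable, List


def profile_column(name: str, values: List[Any]) -> Dict[str, Any]:
    """Summarize one column (hypothetical helper, not part of the skill)."""
    total = len(values)
    # Treat None and empty strings as missing values.
    present = [v for v in values if v is not None and v != ""]
    null_count = total - len(present)
    unique_count = len(set(present))
    profile: Dict[str, Any] = {
        "name": name,
        "total_count": total,
        "null_count": null_count,
        "null_percentage": round(100 * null_count / total, 2) if total else 0.0,
        "unique_count": unique_count,
        "uniqueness_ratio": round(unique_count / len(present), 4) if present else 0.0,
    }
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    # Only compute numeric statistics when every present value is a number.
    if numeric and len(numeric) == len(present):
        profile.update(
            min_value=min(numeric),
            max_value=max(numeric),
            mean_value=mean(numeric),
            median_value=median(numeric),
        )
        if len(numeric) >= 4:
            # Assumed outlier rule: values beyond 1.5 x IQR from the quartiles.
            q1, _, q3 = quantiles(numeric, n=4, method="inclusive")
            spread = q3 - q1
            low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
            profile["outliers"] = [v for v in numeric if v < low or v > high]
    return profile


def find_orphans(child_keys: Iterable[Any], parent_keys: Iterable[Any]) -> List[Any]:
    """Return child keys with no matching parent key (referential integrity check)."""
    parents = set(parent_keys)
    return sorted({k for k in child_keys if k is not None and k not in parents})


# Made-up sample data: unit costs with one missing value and one suspicious entry.
costs = [1200.0, 1350.0, None, 1280.0, 1310.0, 98000.0]
profile = profile_column("unit_cost", costs)
print(profile["null_percentage"], profile["median_value"], profile["outliers"])

# Made-up project IDs: "P-009" appears in line items but not in the project list.
orphans = find_orphans(["P-001", "P-002", "P-009"], ["P-001", "P-002", "P-003"])
print(orphans)
```

The dictionary keys deliberately mirror field names from the `ColumnProfile` dataclass in the excerpt, so the same values could populate that structure. A pandas-based profiler would likely compute them with vectorized operations instead.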

## Technical Implementation

```python
from dataclasses import dataclass, field
from typing import List, Dict, Any, Optional, Tuple
import pandas as pd
import numpy as np
from datetime import datetime
import json

@dataclass
class ColumnProfile:
    name: str
    data_type: str
    inferred_type: str  # More specific: project_id, cost, date, csi_code, etc.
    total_count: int
    null_count: int
    null_percentage: float
    unique_count: int
    uniqueness_ratio: float
    # For numeric columns
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    mean_value: Optional[float] = None
    median_value: Optional[float] = None
    std_dev: Optional[float] = None
    # For string columns
    min_length: Optional[int] = None
    max_length: Optional[int] = None
    avg_length: Optional[float] = None
    # Top values
    top_values: List[Tuple[Any, int]] = field(default_factory=list)
    # Patterns
    common_patterns: List[str] = field(default_factory=list)
    # Quality flags
    quality_issues: List[str] = field(default_factory=list)

@dataclass
class DataProfile:
    source_name: str
    row_count: int
    column_count: int
    columns: List[ColumnProfile]
    duplicate_rows: int
    memory_usage: str
    profiled_at: datetime
    quality_score: float
    recommendations: List[str]

class ConstructionDataProfiler:
    """Profile...
```
