# Data Lineage Tracker for Construction
## Overview
Track the origin, transformations, and flow of construction data through systems. Provides audit trails for compliance, helps debug data issues, and ensures data governance.
## Business Case
Construction projects require data accountability:
- **Audit Compliance**: Know where every number came from
- **Issue Resolution**: Trace data problems to their source
- **Change Impact**: Understand what downstream systems are affected
- **Regulatory Requirements**: Maintain data provenance for legal/insurance
## Technical Implementation
```python
from dataclasses import dataclass, field
from typing import List, Dict, Any, Optional, Set
from datetime import datetime
from enum import Enum
import json
import hashlib
import uuid
class TransformationType(Enum):
EXTRACT = "extract"
TRANSFORM = "transform"
LOAD = "load"
AGGREGATE = "aggregate"
JOIN = "join"
FILTER = "filter"
CALCULATE = "calculate"
MANUAL_EDIT = "manual_edit"
IMPORT = "import"
EXPORT = "export"
@dataclass
class DataSource:
id: str
name: str
system: str
location: str
owner: str
created_at: datetime
@dataclass
class TransformationStep:
id: str
transformation_type: TransformationType
description: str
input_entities: List[str]
output_entities: List[str]
logic: str # SQL, Python, or description
performed_by: str # user or system
performed_at: datetime
parameters: Dict[str, Any] = field(default_factory=dict)
@dataclass
class DataEntity:
id: str
name: str
source_id: str
entity_type: str # table, file, field, record
created_at: datetime
version: int = 1
checksum: Optional[str] = None
parent_entities: List[str] = field(default_factory=list)
metadata: Dict[str, Any] = field(default_factory=dict)
@dataclass
class LineageRecord:
id: str
entity_id: st...