Monitoring
Monitoring and Observability
Metrics Collection
| Metric Category | Metrics | Purpose | Collection Method |
|---|---|---|---|
| Execution | Duration, success/failure rates, throughput | Performance monitoring | Timer, counter metrics |
| Resource | Memory, CPU, network I/O | System health | System monitoring |
| Provider | Request latency, error rates, rate limits | Service quality | Provider instrumentation |
| Business | Custom application metrics | Business insights | Custom instrumentation |
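As an illustration of the "Timer, counter metrics" collection method for the Execution category, the following is a minimal in-memory sketch. The `MetricsRegistry` class, its method names, and the metric name `node.execute` are assumptions made for this example and are not part of the system described here; a production setup would normally export to a metrics backend instead of keeping values in memory.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class MetricsRegistry:
    """Hypothetical in-memory registry used only for illustration."""

    def __init__(self):
        self.counters = defaultdict(int)   # metric name -> count
        self.timings = defaultdict(list)   # metric name -> durations in seconds

    def increment(self, name, amount=1):
        self.counters[name] += amount

    @contextmanager
    def timed(self, name):
        """Time the wrapped block and count it as a success or a failure."""
        start = time.perf_counter()
        try:
            yield
        except Exception:
            self.increment(f"{name}.failure")
            raise
        else:
            self.increment(f"{name}.success")
        finally:
            # Duration is recorded whether the block succeeded or failed.
            self.timings[name].append(time.perf_counter() - start)

    def success_rate(self, name):
        """Return successes / total, or None if nothing was recorded."""
        ok = self.counters[f"{name}.success"]
        failed = self.counters[f"{name}.failure"]
        total = ok + failed
        return ok / total if total else None
```

Wrapping each node execution in `with registry.timed("node.execute"):` would yield duration samples plus success and failure counts, from which throughput and rates can be derived.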
Logging Requirements
- Structured Logging: Emit logs as JSON with standardized field names
- Log Levels: Support debug, info, warn, and error levels
- Contextual Information: Include the flow ID, node ID, and execution context in every entry
- Security: Never write sensitive information (for example credentials or API keys) to logs
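One way these logging requirements could be met is sketched below with Python's standard `logging` module. The field names (`flow_id`, `node_id`, `context`), the `SENSITIVE_KEYS` set, and the `[REDACTED]` placeholder are assumptions chosen for the example, not a schema defined by this document.

```python
import json
import logging

# Assumed list of context keys that must never reach the log output.
SENSITIVE_KEYS = {"api_key", "password", "token"}

# Map stdlib level numbers to the level names used in this document.
LEVEL_NAMES = {
    logging.DEBUG: "debug",
    logging.INFO: "info",
    logging.WARNING: "warn",
    logging.ERROR: "error",
}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with standardized fields."""

    def format(self, record):
        context = getattr(record, "context", None) or {}
        entry = {
            "timestamp": self.formatTime(record),
            "level": LEVEL_NAMES.get(record.levelno, record.levelname.lower()),
            "message": record.getMessage(),
            "flow_id": getattr(record, "flow_id", None),
            "node_id": getattr(record, "node_id", None),
            # Redact sensitive values instead of dropping the key, so the
            # entry still shows that the field was present.
            "context": {
                key: "[REDACTED]" if key in SENSITIVE_KEYS else value
                for key, value in context.items()
            },
        }
        return json.dumps(entry, default=str)
```

With a handler that uses this formatter, a call such as `logger.info("node started", extra={"flow_id": "f1", "node_id": "n1", "context": {...}})` attaches the contextual fields to the record.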
Tracing Support
- Distributed Tracing: Support OpenTelemetry or a comparable tracing standard
- Span Creation: Create trace spans for flows, nodes, and provider calls
- Context Propagation: Maintain trace context across async operations
- Performance Analysis: Record span timings so that performance bottlenecks can be identified
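The span hierarchy and context propagation described above can be sketched with the standard library alone. This is a simplified stand-in and deliberately not the OpenTelemetry API: the `span` helper, the `finished_spans` list, and the `run_flow`, `run_node`, and `call_provider` functions are hypothetical names. It relies on the fact that asyncio tasks copy the current `contextvars` context when they are created, which is what carries the parent span across async boundaries.

```python
import asyncio
import contextvars
import time
import uuid
from contextlib import contextmanager

# The active span for the current task; None when no span is open.
_current_span = contextvars.ContextVar("current_span", default=None)
finished_spans = []  # stand-in for a real span exporter


@contextmanager
def span(name):
    """Open a child of the active span, or start a new trace if none is active."""
    parent = _current_span.get()
    record = {
        "name": name,
        "trace_id": parent["trace_id"] if parent else uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent["span_id"] if parent else None,
    }
    token = _current_span.set(record)
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_s"] = time.perf_counter() - start
        _current_span.reset(token)
        finished_spans.append(record)


async def call_provider(name):
    with span(f"provider:{name}"):
        await asyncio.sleep(0)  # placeholder for a real provider request


async def run_node(node_id):
    with span(f"node:{node_id}"):
        await call_provider("example")


async def run_flow(flow_id):
    with span(f"flow:{flow_id}"):
        # gather() wraps each coroutine in a task that inherits the flow span.
        await asyncio.gather(run_node("a"), run_node("b"))
```

Running `asyncio.run(run_flow("f1"))` produces five spans that share one trace ID; the recorded durations per span are the raw material for bottleneck analysis.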
Status and Monitoring
| Status Type | Information | Update Frequency | Access Method |
|---|---|---|---|
| Queue Status | Flows waiting, currently executing | Real-time | Status API endpoint |
| Flow Status | Current node per active flow | Per node execution | Flow tracking API |
| Execution Progress | Node completion status | Per node completion | Progress API |
| Error Details | Detailed error information | On failure | Error logging system |
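To make the status types in the table concrete, here is a small sketch of the state that a status endpoint might read from. The `StatusTracker` and `FlowState` names, the status strings, and the shape of the returned dictionaries are all assumptions for illustration; the actual Status, Flow tracking, and Progress APIs are not specified in this document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FlowState:
    flow_id: str
    nodes: List[str]                     # node IDs in execution order
    status: str = "waiting"              # waiting | executing | done | failed
    current_node: Optional[str] = None   # most recently started node
    completed: List[str] = field(default_factory=list)
    error: Optional[str] = None


class StatusTracker:
    """Hypothetical tracker, updated once per node execution."""

    def __init__(self):
        self.flows = {}

    def enqueue(self, flow_id, nodes):
        self.flows[flow_id] = FlowState(flow_id, list(nodes))

    def start_node(self, flow_id, node_id):
        state = self.flows[flow_id]
        state.status = "executing"
        state.current_node = node_id

    def complete_node(self, flow_id, node_id):
        state = self.flows[flow_id]
        state.completed.append(node_id)
        if len(state.completed) == len(state.nodes):
            state.status = "done"

    def fail(self, flow_id, node_id, message):
        state = self.flows[flow_id]
        state.status = "failed"
        state.error = f"{node_id}: {message}"

    def queue_status(self):
        """Queue Status: flows waiting and flows currently executing."""
        by_status = lambda s: [f.flow_id for f in self.flows.values() if f.status == s]
        return {"waiting": by_status("waiting"), "executing": by_status("executing")}

    def progress(self, flow_id):
        """Execution Progress: completed node count out of the total."""
        state = self.flows[flow_id]
        return {"completed": len(state.completed), "total": len(state.nodes)}
```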
Logging System
| Log Level | Content | Use Cases | Performance Impact |
|---|---|---|---|
| Debug | Execution traces, variable resolution | Development, troubleshooting | High |
| Info | Flow start/completion, node progress | Production monitoring | Medium |
| Warn | Non-critical issues, performance warnings | Operational alerts | Low |
| Error | Execution failures, validation errors | Error tracking | Low |
The log level is set once, during FlowExecutor initialization, and applies to the entire execution system.
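A minimal sketch of how a single log level chosen at initialization can apply system-wide is shown below. The constructor signature, the `log_level` parameter, the `node_logger` helper, and the logger name `flow_executor` are assumptions; the real FlowExecutor interface is not given here. The sketch uses the standard `logging` hierarchy, in which child loggers inherit the effective level of their parent.

```python
import logging

# Level names from the table above mapped to stdlib logging levels.
_LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warn": logging.WARNING,
    "error": logging.ERROR,
}


class FlowExecutor:
    """Illustrative stand-in, not the actual FlowExecutor implementation."""

    def __init__(self, log_level="info"):
        if log_level not in _LEVELS:
            raise ValueError(f"unknown log level: {log_level!r}")
        # Setting the level on the root of the executor's logger hierarchy
        # makes it the effective level for every child logger.
        self.logger = logging.getLogger("flow_executor")
        self.logger.setLevel(_LEVELS[log_level])

    def node_logger(self, node_id):
        """Return a per-node logger that inherits the executor's level."""
        return self.logger.getChild(node_id)
```

For example, `FlowExecutor(log_level="warn").node_logger("n1")` returns a logger that drops debug and info records while still emitting warnings and errors.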