Architecture Overview¶
The Optics Framework is built on a modular, extensible architecture that enables vision-powered, self-healing test automation. This document provides a high-level overview of the framework's design, core principles, and component relationships.
Design Principles¶
The framework follows several key design principles:
- Modularity: Components are organized into clear, independent modules with well-defined interfaces
- Extensibility: New drivers, element sources, and vision models can be added without modifying core code
- Resilience: Built-in fallback mechanisms ensure tests continue even when primary methods fail
- Separation of Concerns: Clear boundaries between action execution, element detection, and vision processing
- Interface-Based Design: Components communicate through well-defined interfaces, enabling easy substitution
Level 1 – System Context¶
At the system level, the Optics Framework sits between test authors and external automation infrastructure. It exposes multiple entry points (CLI, REST API, Python library) while orchestrating drivers, vision models, and device farms underneath.
graph TB
actorTester["Tester or QA Engineer"]
actorIntegrator["Framework Integrator or Tooling Engineer"]
actorCi["CI or Orchestration System"]
systemOptics["Optics Framework"]
extAutomation["Automation Backends (Appium, Selenium, Playwright, BLE)"]
extDevices["Real Devices or Device Farms"]
extVision["Vision and OCR Services"]
actorTester --> systemOptics
actorIntegrator --> systemOptics
actorCi --> systemOptics
systemOptics --> extAutomation
systemOptics --> extDevices
systemOptics --> extVision Level 2 – Containers and Layers¶
The rest of this document describes the Optics Framework's internal containers and layers (C4 Level 2), and how they collaborate at runtime.
The framework is organized into several key layers:
graph TB
subgraph "User Interface Layer"
CLI[CLI Commands]
API[REST API]
Library[Python Library]
end
subgraph "Core Layer"
Optics[Optics Class]
SessionMgr[SessionManager]
Config[ConfigHandler]
end
subgraph "Builder Layer"
Builder[OpticsBuilder]
Factories[Factories]
end
subgraph "API Layer"
ActionKW[ActionKeyword]
AppMgmt[AppManagement]
Verifier[Verifier]
FlowCtrl[FlowControl]
end
subgraph "Strategy Layer"
StrategyMgr[StrategyManager]
Strategies[Location Strategies]
end
subgraph "Engine Layer"
Drivers[Drivers]
ElementSources[Element Sources]
VisionModels[Vision Models]
end
subgraph "Execution Layer"
ExecEngine[ExecutionEngine]
TestRunner[TestRunner]
EventSys[Event System]
end
CLI --> Optics
API --> Optics
Library --> Optics
Optics --> SessionMgr
SessionMgr --> Builder
Builder --> Factories
Factories --> Drivers
Factories --> ElementSources
Factories --> VisionModels
Builder --> ActionKW
Builder --> AppMgmt
Builder --> Verifier
Builder --> FlowCtrl
ActionKW --> StrategyMgr
StrategyMgr --> Strategies
Strategies --> ElementSources
Strategies --> VisionModels
ExecEngine --> TestRunner
TestRunner --> ActionKW
TestRunner --> AppMgmt
TestRunner --> Verifier
TestRunner --> FlowCtrl
ExecEngine --> EventSys The layers have the following responsibilities and primary references:
- User Interface Layer:
- Entry points for humans and tooling via CLI commands, the REST API, and the Python library.
- Translates user intent into high-level operations without dealing directly with drivers or vision models.
-
See CLI Layer and REST API Layer.
-
Core Layer:
- Hosts the
Opticsfacade,SessionManager, andConfigHandler. - Owns configuration and session lifecycle, bridging interface-layer calls into the builder and execution pipeline.
-
See Component Architecture for core component details.
-
Builder Layer:
- Uses
OpticsBuilderand factories to turn configuration into concrete driver, element source, and vision model instances. - Central place for dependency wiring and fallback chains between engines.
-
See Component Architecture and Engines.
-
API Layer:
- Provides high-level, user-facing keywords for actions, app management, verification, and flow control.
- Encapsulates orchestration logic while delegating low-level work to strategies and engines.
-
See Component Architecture and Strategies.
-
Strategy Layer:
- Implements self-healing location strategies that combine DOM, text, and image-based techniques.
- Decides which element or vision strategy to apply and in what order.
-
See Strategies for details.
-
Engine Layer:
- Contains concrete automation backends (drivers), element sources, and vision models.
- Talks to external systems such as Appium/Selenium/Playwright, device farms, and OCR/image services.
-
See Engines for driver, element source, and vision model implementations.
-
Execution Layer:
- Coordinates test execution, result collection, and event publishing.
- Provides runners and event infrastructure that drive API-layer keywords over time.
- See Execution and Event System.
For a deeper, component-level (C4 Level 3) view and the concrete design patterns used in Optics, see: - Component Architecture: Component Architecture for Optics, SessionManager, builders, factories, and interfaces. - Strategies and Engines: Strategies and Engines for self-healing strategies and engine implementations. - Architecture Decisions: Architecture Decisions for the rationale behind major architectural choices.
Data Flow¶
Session Creation Flow¶
sequenceDiagram
participant User
participant Optics
participant SessionMgr
participant Session
participant Builder
participant Factory
User->>Optics: setup(config)
Optics->>SessionMgr: create_session()
SessionMgr->>Session: new Session()
Session->>Builder: new OpticsBuilder()
Builder->>Factory: get_driver(config)
Factory->>Factory: discover & instantiate
Factory-->>Builder: driver instance
Builder-->>Session: configured builder
Session-->>SessionMgr: session created
SessionMgr-->>Optics: session_id Element Location Flow¶
sequenceDiagram
participant ActionKW
participant StrategyMgr
participant Strategy
participant ElementSource
participant VisionModel
ActionKW->>StrategyMgr: locate(element)
StrategyMgr->>Strategy: try XPathStrategy
Strategy->>ElementSource: locate(element)
ElementSource-->>Strategy: result or None
alt XPath fails
StrategyMgr->>Strategy: try TextDetectionStrategy
Strategy->>ElementSource: capture()
ElementSource-->>Strategy: screenshot
Strategy->>VisionModel: detect_text(screenshot)
VisionModel-->>Strategy: text locations
end
Strategy-->>StrategyMgr: coordinates
StrategyMgr-->>ActionKW: element location Cross-Cutting Concerns¶
These concerns cut across multiple layers and containers, and collectively shape the framework's runtime behavior and quality attributes.
Self-Healing Mechanism¶
The framework implements self-healing through multiple layers:
- Strategy Fallback: If one location strategy fails, try the next
- Driver Fallback: If one driver fails, try the next configured driver
- Element Source Fallback: If one element source fails, try the next
- Vision Model Fallback: If one OCR/image model fails, try the next
This multi-layered approach ensures maximum test resilience.
Configuration Structure¶
Configuration is hierarchical and supports: - Multiple drivers with priority ordering - Multiple element sources with fallback chains - Multiple vision models for redundancy - Per-component configuration options
Extension Points¶
The framework is designed for easy extension:
- New Drivers: Implement
DriverInterfaceinengines/drivers/ - New Element Sources: Implement
ElementSourceInterfaceinengines/elementsources/ - New Vision Models: Implement
ImageInterfaceorTextInterfaceinengines/vision_models/ - New Strategies: Extend
LocatorStrategyincommon/strategies.py - New Keywords: Add methods to API classes with
@keyworddecorator
Next Steps¶
From here you can move from the C4 Level 1 and Level 2 views in this document into more detailed Level 3 component and system documentation.
Containers and Layers (Level 2 → Level 3)¶
- Component Architecture - Deep dive into individual components
- Engines - Driver, element source, and vision model implementations
- Strategies - Strategy pattern and self-healing mechanisms
- Execution - Test execution flow and event system
- Event System - EventSDK and event tracking
- Error Handling - Error codes and error handling system
Interface Entry Points¶
- REST API Layer - REST API architecture and integration
- CLI Layer - Command-line interface architecture
- Library Layer - Python library and programmatic interface
Cross-Cutting Systems and Decisions¶
- Logging Architecture - Logging system and configuration
- Architecture Decisions - ADRs and design rationale
Extension and Customization¶
- Extending the Framework - Creating custom extensions
Related Documentation¶
- REST API Usage - Complete REST API endpoint reference
- API Reference - Python API documentation
- Configuration - Configuration guide
- Quick Start - Getting started guide
- Usage Guide - Usage examples and patterns