FlowTask Documentation
Welcome to FlowTask, a workflow automation framework built in Python.
What is FlowTask?
FlowTask is a workflow automation framework for organizing and running complex data processing tasks, which you define in a simple YAML configuration format. It supports various data sources, processing components, and output formats.
Key Features
- YAML-based Configuration: Define complex workflows using simple YAML files
- Modular Components: Extensive library of pre-built components for data processing
- Task Organization: Organize tasks by programs and maintain clean directory structures
- Multiple Storage Backends: Support for filesystem, database, and S3 storage
- HTTP API: RESTful API for task management and execution
- Scheduler Integration: Built-in task scheduling capabilities
- Hook System: Trigger tasks based on webhooks, file changes, or other events
Quick Example
name: Company Profile
description: Company Profile from LeadIQ and ZoomInfo
steps:
  - OpenWithPandas:
      mime: "text/csv"
      trim: true
      filename: failed_stores.csv
      directory: "/home/ubuntu/symbits/marketing/companies"
  - CompanyScraper:
      use_proxies: true
      paid_proxy: true
      column_name: "Company Name"
      concurrently: false
      scrappers:
        - leadiq
        - rocketreach
        - siccode
        - explorium
  - PandasToFile:
      filename: /home/ubuntu/symbits/marketing/companies/rest-companies-{today}.xlsx
      mime: application/vnd.ms-excel
      masks:
        today:
          - today
          - mask: "%m%d%Y"
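The `{today}` placeholder in the output filename is filled in from the `masks` entry. The sketch below shows roughly what that substitution could look like, assuming the mask string is a standard `strftime` format and that placeholders are replaced by plain string substitution. The `expand_masks` helper is hypothetical and is not part of FlowTask's API; FlowTask's own mask handling may differ.

```python
from datetime import date


def expand_masks(template: str, masks: dict, on: date) -> str:
    """Replace each {name} placeholder with `on` formatted by that name's mask.

    Hypothetical helper for illustration only; not a FlowTask function.
    """
    result = template
    for name, fmt in masks.items():
        result = result.replace("{" + name + "}", on.strftime(fmt))
    return result


# Stand-ins for the values used in the PandasToFile step (directory omitted).
template = "rest-companies-{today}.xlsx"
masks = {"today": "%m%d%Y"}  # month, day, four-digit year, no separators

filename = expand_masks(template, masks, date(2024, 3, 9))
print(filename)  # rest-companies-03092024.xlsx
```

With the `%m%d%Y` mask, a run on 9 March 2024 would therefore produce a file whose name ends in `03092024.xlsx`.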
Getting Started
- Installation: Install FlowTask and its dependencies
- Quick Start: Create your first task
- Configuration: Learn about configuration options
Navigation
- Components: Browse all available FlowTask components
- Interfaces: Learn about FlowTask interfaces and base classes
- API Reference: Complete API documentation
- Examples: Real-world usage examples
Architecture
FlowTask follows a modular architecture where:
- Tasks are defined in YAML files organized by programs
- Components are reusable processing units that can be chained together
- Interfaces provide common functionality and contracts
- Storage backends handle task persistence and retrieval
- Schedulers manage task execution timing
- Hooks enable event-driven task execution
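To make the idea of chained components concrete, here is a minimal sketch of how a list of steps could be run in order, with each component receiving the previous component's output. Every name in it (`Component`, `LoadRows`, `TrimValues`, `REGISTRY`, `run_task`) is hypothetical and chosen for illustration; none of this is FlowTask's actual class hierarchy or API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class Component(ABC):
    """Hypothetical base class: one reusable processing unit."""

    def __init__(self, **options: Any) -> None:
        self.options = options

    @abstractmethod
    def run(self, data: Any) -> Any:
        """Take the previous step's output and return this step's output."""


class LoadRows(Component):
    def run(self, data: Any) -> List[Dict[str, str]]:
        # A source step: ignores its input and emits rows from its options.
        return [dict(row) for row in self.options["rows"]]


class TrimValues(Component):
    def run(self, data: List[Dict[str, str]]) -> List[Dict[str, str]]:
        # A transform step: strips surrounding whitespace from every value.
        return [{key: value.strip() for key, value in row.items()} for row in data]


# Maps the component name used in a task definition to its class.
REGISTRY = {"LoadRows": LoadRows, "TrimValues": TrimValues}


def run_task(steps: List[Dict[str, Dict[str, Any]]]) -> Any:
    """Run each single-key step in order, piping output to the next step."""
    data = None
    for step in steps:
        name, options = next(iter(step.items()))
        data = REGISTRY[name](**options).run(data)
    return data


# Mirrors the shape of a parsed `steps:` list from a YAML task file.
task_steps = [
    {"LoadRows": {"rows": [{"Company Name": "  Acme  "}]}},
    {"TrimValues": {}},
]

result = run_task(task_steps)
print(result)  # [{'Company Name': 'Acme'}]
```

The registry lookup is one simple way to map a step name in a task file to a class. The sketch leaves out storage backends, schedulers, and hooks, which would sit around this loop and decide where the task definition comes from and when it runs.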
Get started by exploring the Components to see what FlowTask can do for your workflows!