Metadata-Version: 2.1
Name: flowtask
Version: 5.7.18
Summary: Framework for running Tasks from the CLI and an API for orchestration. Component-based Task builder/runner for non-programmers.
Home-page: https://github.com/phenobarbital/flowtask
Author: Jesus Lara
Author-email: "Jesus Lara G." <jesuslarag@gmail.com>
License: Apache-2.0
Project-URL: Source, https://github.com/phenobarbital/flowtask
Project-URL: Funding, https://paypal.me/phenobarbital
Project-URL: Say Thanks!, https://saythanks.io/to/phenobarbital
Keywords: DataIntegration,Task,Orchestration,Task-Runner,Pipelines,Data-Pipelines
Platform: *nix
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python
Classifier: Typing :: Typed
Classifier: Environment :: Web Environment
Classifier: Framework :: AsyncIO
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.9.16
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: borax==3.5.0
Requires-Dist: PyDrive==1.3.1
Requires-Dist: chardet==5.2.0
Requires-Dist: aiohttp-jinja2==1.6
Requires-Dist: asyncssh[bcrypt,fido2,libnacl,pkcs11,pyopenssl]==2.18.0
Requires-Dist: pyxlsb==1.0.10
Requires-Dist: python-calamine==0.2.3
Requires-Dist: pyecharts==2.0.7
Requires-Dist: selenium==4.28.1
Requires-Dist: snapshot-selenium==0.0.2
Requires-Dist: selenium-wire==5.1.0
Requires-Dist: httpx[http2,socks]>=0.26.0
Requires-Dist: h2==4.2.0
Requires-Dist: backoff==2.2.1
Requires-Dist: blinker==1.7.0
Requires-Dist: webdriver-manager==4.0.2
Requires-Dist: aioimaplib==1.1.0
Requires-Dist: adal==1.2.7
Requires-Dist: xlrd==2.0.1
Requires-Dist: zeep==4.2.1
Requires-Dist: nltk==3.9.1
Requires-Dist: jdcal==1.4.1
Requires-Dist: html5lib==1.1
Requires-Dist: shapely==2.0.6
Requires-Dist: tzwhere==3.0.3
Requires-Dist: tabulate==0.9.0
Requires-Dist: python-magic==0.4.27
Requires-Dist: pytomlpp==1.0.13
Requires-Dist: psutil==6.0.0
Requires-Dist: networkx==3.4.1
Requires-Dist: gitpython==3.1.43
Requires-Dist: watchdog==4.0.2
Requires-Dist: hachiko==0.4.0
Requires-Dist: paramiko==3.4.0
Requires-Dist: jira==3.8.0
Requires-Dist: datatable==1.1.0
Requires-Dist: async-notify[all]>=1.4.0
Requires-Dist: querysource>=3.15.42
Requires-Dist: asyncdb[all]>=2.11.7
Requires-Dist: caio==0.9.11
Requires-Dist: Wand==0.6.13
Requires-Dist: pylibdmtx==0.1.10
Requires-Dist: aiofile>=3.8.8
Requires-Dist: proxylists>=0.14.0
Requires-Dist: aioftp==0.23.1
Requires-Dist: py7zr==0.22.0
Requires-Dist: rarfile==4.2
Requires-Dist: python-pptx==1.0.2
Requires-Dist: gspread==6.1.4
Requires-Dist: oauth2client>=4.1.3
Requires-Dist: google-auth>=2.35.0
Requires-Dist: google-cloud-language>=2.15.0
Requires-Dist: googlemaps==4.10.0
Requires-Dist: fastavro==1.9.7
Requires-Dist: pgpy==0.6.0
Requires-Dist: dropbox==12.0.2
Requires-Dist: dask-expr==1.1.13
Requires-Dist: jsonpath-ng==1.7.0
Requires-Dist: ortools==9.11.4210
Requires-Dist: gmqtt==0.7.0
Requires-Dist: selectorlib==0.16.0
Requires-Dist: playwright==1.49.1
Requires-Dist: praw==7.8.1
Requires-Dist: prawcore==2.4.0
Requires-Dist: osmnx==2.0.1
Requires-Dist: pyrosm==0.6.2
Requires-Dist: osmium==4.0.2
Requires-Dist: python-Levenshtein==0.26.1
Requires-Dist: undetected-chromedriver==3.5.5
Requires-Dist: duckduckgo-search==7.5.0
Requires-Dist: numba>=0.59.0
Requires-Dist: APScheduler<3.11.0,>=3.10.4
Requires-Dist: google-cloud-bigquery>=3.30.0
Requires-Dist: imagehash==4.3.1
Requires-Dist: pgvector==0.3.6
Provides-Extra: ai
Requires-Dist: ai-parrot[agents,anthropic,google,groq,milvus,openai,vector]>=0.4.7; extra == "ai"
Requires-Dist: timm==1.0.15; extra == "ai"
Requires-Dist: torchvision==0.21.0; extra == "ai"
Requires-Dist: ultralytics==8.3.111; extra == "ai"
Provides-Extra: codereview
Requires-Dist: ai-parrot[google,groq,milvus]>=0.4.7; extra == "codereview"
Provides-Extra: loaders
Requires-Dist: mammoth==1.8.0; extra == "loaders"
Requires-Dist: markdownify==0.13.1; extra == "loaders"
Requires-Dist: python-docx==1.1.2; extra == "loaders"
Requires-Dist: pymupdf==1.25.1; extra == "loaders"
Requires-Dist: pymupdf4llm==0.0.17; extra == "loaders"
Requires-Dist: pdf4llm==0.0.9; extra == "loaders"
Requires-Dist: langchain-huggingface==0.1.2; extra == "loaders"
Provides-Extra: milvus
Requires-Dist: langchain-milvus>=0.1.6; extra == "milvus"
Requires-Dist: pymilvus==2.4.8; extra == "milvus"
Requires-Dist: milvus==2.3.5; extra == "milvus"
Provides-Extra: vertexai
Requires-Dist: langchain-google-genai==2.0.1; extra == "vertexai"
Requires-Dist: langchain-google-vertexai==2.0.5; extra == "vertexai"
Requires-Dist: vertexai==1.70.0; extra == "vertexai"

# FlowTask DataIntegration #

FlowTask DataIntegration is a plugin-based, component-driven task execution framework for creating complex Tasks.

FlowTask runs Tasks defined in JSON, YAML or TOML files. Each Task is a combination of Components,
and the components in a Task run sequentially or depend on one another, like a DAG.
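
The DAG-like ordering described above can be sketched with the standard library. This is only an illustration of dependency ordering, not FlowTask's own scheduler: the `task_definition` layout, the step names and the `depends` key are hypothetical and do not reflect the real task file schema.

```python
from graphlib import TopologicalSorter  # standard library since Python 3.9

# Hypothetical task definition: the keys and step names are illustrative only.
task_definition = {
    "name": "example",
    "steps": [
        {"name": "download", "depends": []},
        {"name": "open_table", "depends": []},
        {"name": "merge", "depends": ["download", "open_table"]},
        {"name": "export", "depends": ["merge"]},
    ],
}


def execution_order(definition):
    """Return step names in an order that respects every declared dependency."""
    graph = {step["name"]: set(step["depends"]) for step in definition["steps"]}
    # static_order() raises graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(task_definition)
```

Steps without dependencies (`download`, `open_table`) may appear in either order; only the relative order of dependent steps is guaranteed.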

You can create a Task by combining Commands, Shell scripts and other specific Components (such as TableInput, which opens a table using a datasource, or DownloadFromIMAP, which downloads a file from an IMAP folder). Any Python callable can be a Component inside a Task, or you can extend UserComponent to build your own components.

Every Task can be run from the CLI, programmatically, via a RESTful API (using our aiohttp-based Handler), triggered by WebHooks, or dispatched to an external Worker using our built-in Scheduler.

## Quickstart ##

```console
pip install flowtask
```

Tasks can be organized into a directory structure like this:

tasks /
    ├── programs /
      ├── test /
           ├── tasks /

The main reason for this structure is to keep tasks organized by tenant/program, instead of filling a single directory with many task files.
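
As a rough sketch of how the layout above maps a program and task name to a file, the helper below builds the expected path. The function name `task_path`, the `<task>.<extension>` file naming and the default `yaml` extension are assumptions made for illustration; FlowTask's actual lookup may differ.

```python
from pathlib import Path


def task_path(base_dir, program, task, extension="yaml"):
    """Build the path of a task file inside the tasks/programs/<program>/tasks layout.

    Hypothetical helper: the file naming convention is an assumption.
    """
    return Path(base_dir) / "programs" / program / "tasks" / f"{task}.{extension}"


example = task_path("tasks", "test", "example")
```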

FlowTask supports "TaskStorage": a Task Storage is the main repository for tasks. The main Task Storage is a directory on any filesystem path (optionally, you can synchronize that path using git), but Tasks can also be saved in a database or an S3 bucket.

## Dependencies ##

 * aiohttp (Asyncio Web Framework and Server) (required by navigator)
 * AsyncDB
 * QuerySource
 * Navigator-api
 * (Optional) Qworker (for distributing asyncio Tasks on distributed workers).

## Features ##

* Component-based Task execution framework, with components covering many common actions (downloading files, creating pandas dataframes from files, mapping dataframe columns to a JSON dictionary, etc.)
* Built-in API for execution of Tasks.

### How do I run a Task? ###

You can run a Task from the CLI:
```console
task --program=test --task=example
```

On the CLI, you can pass an ENV (environment) variable to change the environment file used during task execution.
```console
ENV=dev task --program=test --task=example
```

or programmatically:
```python
from flowtask import Task
import asyncio

task = Task(program='test', task='example')
results = asyncio.run(task.run())
# alternatively, call the task object directly:
results = asyncio.run(task())
```
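
If a script needs to drive the CLI form shown earlier rather than the Python API, one possible approach is to build the command and environment explicitly. The helper names below are hypothetical; only the `task` command, its `--program` and `--task` flags and the `ENV` variable come from the examples above.

```python
import os
import subprocess


def build_task_command(program, task):
    """Build the argument list for the `task` CLI shown above."""
    return ["task", f"--program={program}", f"--task={task}"]


def build_environment(env_name=None, base=None):
    """Copy the base environment and set ENV only when a name is given."""
    environ = dict(os.environ if base is None else base)
    if env_name is not None:
        environ["ENV"] = env_name
    return environ


def run_task_cli(program, task, env_name=None):
    """Run the CLI in a subprocess (hypothetical wrapper, requires flowtask installed)."""
    return subprocess.run(
        build_task_command(program, task),
        env=build_environment(env_name),
        check=False,  # inspect returncode instead of raising on failure
    )
```

Calling `run_task_cli("test", "example", env_name="dev")` would be equivalent to the `ENV=dev task --program=test --task=example` invocation.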

### Requirements ###

* Python >= 3.9.16
* asyncio (included in the Python standard library)
* aiohttp >= 3.6.2

### Contribution guidelines ###

Please have a look at the Contribution Guide

* Writing tests
* Code review
* Other guidelines

### Who do I talk to? ###

* Repo owner or admin
* Other community or team contact

### License ###

FlowTask is licensed under the Apache 2.0 License. See the LICENSE file for more details.
