Filelist

flowtask.components.FileList

FileList

FileList(loop=None, job=None, stat=None, **kwargs)

Bases: IteratorBase

FileList with optional parallelization support.

Overview

This component iterates over a specified directory and returns the files that match a provided glob pattern, or an individually named file. It supports asynchronous processing and offers options for managing empty results and detailed error handling.

.. table:: Properties
   :widths: auto

   +------------------+----------+--------------------------------------------------------------------------+
   | Name             | Required | Description                                                              |
   +==================+==========+==========================================================================+
   | directory (str)  | Yes      | Path to the directory containing the files to be listed.                 |
   +------------------+----------+--------------------------------------------------------------------------+
   | pattern (str)    | No       | Optional glob pattern for filtering files (overrides individual          |
   |                  |          | files if provided).                                                      |
   +------------------+----------+--------------------------------------------------------------------------+
   | filename (str)   | No       | Name of the files.                                                       |
   +------------------+----------+--------------------------------------------------------------------------+
   | iterate (bool)   | No       | Flag indicating whether to iterate through the files and process them    |
   |                  |          | sequentially (defaults to True).                                         |
   +------------------+----------+--------------------------------------------------------------------------+
   | generator (bool) | No       | Flag controlling the output format: True returns a generator, False      |
   |                  |          | (default) returns a list.                                                |
   +------------------+----------+--------------------------------------------------------------------------+
   | file (dict)      | No       | A dictionary containing two values, "pattern" and "value".               |
   |                  |          | "pattern" contains the path of the file on the server. If it contains    |
   |                  |          | the mask "{value}", then "value" is used to set the value of that mask.  |
   +------------------+----------+--------------------------------------------------------------------------+
   | parallelize      | No       | If True, the iterator will process rows in parallel. Default is False.   |
   +------------------+----------+--------------------------------------------------------------------------+
   | num_threads      | No       | Number of threads to use if parallelize is True. Default is 10.          |
   +------------------+----------+--------------------------------------------------------------------------+
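The ``file`` mask and the ``parallelize`` / ``num_threads`` options can be pictured with a small standalone sketch. This is not FileList's actual implementation: the helper names ``resolve_file_mask`` and ``process_files`` are hypothetical, and using a thread pool is an assumption based only on the ``num_threads`` description.

```python
from concurrent.futures import ThreadPoolExecutor


def resolve_file_mask(file):
    """Fill the "{value}" mask in file["pattern"] with file["value"], if the mask is present."""
    pattern = file["pattern"]
    if "{value}" in pattern:
        # Plain string replacement, so other braces in the path are left untouched.
        pattern = pattern.replace("{value}", str(file["value"]))
    return pattern


def process_files(paths, func, parallelize=False, num_threads=10):
    """Apply func to every path, optionally on a thread pool (assumed mechanism)."""
    if not parallelize:
        return [func(p) for p in paths]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map returns results in input order, even if tasks finish out of order.
        return list(pool.map(func, paths))
```

For example, ``resolve_file_mask({"pattern": "sales_{value}.csv", "value": "2024"})`` yields ``"sales_2024.csv"``, while a pattern without the mask is returned unchanged.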

Returns the list of files in a directory.

Example:

```yaml
FileList:
  directory: /home/ubuntu/symbits/bayardad/files/job_advertising/bulk/
  pattern: '*.csv'
  iterate: true
```
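As a rough picture of what a configuration like the one above asks for, the following standalone sketch lists the files in a directory that match a glob pattern and returns either a list or a generator. It uses only ``pathlib`` and is an illustration, not FileList's internals: ``iter_files`` and ``list_files`` are hypothetical names, and sorting the results is an assumption made for predictable output.

```python
from pathlib import Path


def iter_files(directory, pattern="*"):
    """Yield regular files in `directory` whose names match the glob `pattern`."""
    # Sorting is an assumption added here so the order is deterministic.
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            yield path


def list_files(directory, pattern="*", generator=False):
    """Return a generator when generator=True, otherwise a list (mirrors the `generator` flag)."""
    files = iter_files(directory, pattern)
    return files if generator else list(files)
```

Calling ``list_files(directory, "*.csv")`` returns a list of ``Path`` objects, while ``generator=True`` defers the listing until the result is iterated.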

start async

start(**kwargs)

Checks whether the directory exists.
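A minimal sketch of such a check, assuming ``pathlib`` and a standard ``FileNotFoundError``. The real ``start`` is a coroutine and may raise a component-specific error instead; ``check_directory`` is a hypothetical helper name.

```python
from pathlib import Path


def check_directory(directory):
    """Return the directory as a Path, or raise if it does not exist."""
    path = Path(directory)
    if not path.is_dir():
        # Exception type is an assumption; the component may use its own error class.
        raise FileNotFoundError(f"Directory not found: {path}")
    return path
```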