Filelist
flowtask.components.FileList
FileList
Bases: IteratorBase
FileList with optional parallelization support.
Overview
This component iterates through a specified directory and returns the files that match a glob pattern or a set of individually named files. It supports asynchronous processing and offers options for handling empty results and for detailed error reporting.
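As a rough illustration of the listing behavior described above, the sketch below collects files from a directory either by glob pattern or by explicit names. It is not the flowtask implementation: `list_files` and its parameters are hypothetical names, and the rule that a pattern takes precedence over named files is taken from the property descriptions for this component.

```python
from pathlib import Path
from typing import List, Optional


def list_files(directory: str, pattern: Optional[str] = None,
               filenames: Optional[List[str]] = None) -> List[Path]:
    """Return files in `directory` matching a glob pattern or a list of names.

    Hypothetical helper (not part of flowtask): a pattern, when given,
    takes precedence over individually named files.
    """
    base = Path(directory)
    if pattern:
        # Glob match, regular files only, sorted for a stable order.
        return sorted(p for p in base.glob(pattern) if p.is_file())
    # Otherwise keep only the explicitly named files that actually exist.
    return [base / name for name in (filenames or []) if (base / name).is_file()]
```

Sorting the glob results is a choice made here for reproducibility; the component itself may return files in a different order.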
.. table:: Properties
   :widths: auto

   +---------------------+----------+------------------------------------------------------------------------+
   | Name                | Required | Description                                                            |
   +=====================+==========+========================================================================+
   | directory (str)     | Yes      | Path to the directory containing the files to be listed.               |
   +---------------------+----------+------------------------------------------------------------------------+
   | pattern (str)       | No       | Optional glob pattern for filtering files (overrides individual files  |
   |                     |          | if provided).                                                          |
   +---------------------+----------+------------------------------------------------------------------------+
   | filename (str)      | No       | Name of the file or files to list.                                     |
   +---------------------+----------+------------------------------------------------------------------------+
   | iterate (bool)      | No       | Whether to iterate through the files and process them sequentially.    |
   |                     |          | Defaults to True.                                                      |
   +---------------------+----------+------------------------------------------------------------------------+
   | generator (bool)    | No       | Controls the output format: True returns a generator, False (default)  |
   |                     |          | returns a list.                                                        |
   +---------------------+----------+------------------------------------------------------------------------+
   | file (dict)         | No       | A dictionary with two keys, "pattern" and "value". "pattern" holds the |
   |                     |          | path of the file on the server; if it contains the mask "{value}",     |
   |                     |          | then "value" supplies the value for that mask.                         |
   +---------------------+----------+------------------------------------------------------------------------+
   | parallelize         | No       | If True, the iterator will process rows in parallel. Default is False. |
   +---------------------+----------+------------------------------------------------------------------------+
   | num_threads         | No       | Number of threads to use if parallelize is True. Default is 10.        |
   +---------------------+----------+------------------------------------------------------------------------+
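
The `file` and `generator` properties can be pictured with a small sketch. This assumes the `{value}` mask is filled by plain string substitution, which the property description suggests but does not confirm. `expand_file_mask` and `as_output` are hypothetical helpers, not flowtask functions.

```python
from typing import Dict, Iterable, Iterator, List, Union


def expand_file_mask(file: Dict[str, str]) -> str:
    """Fill the "{value}" mask in file["pattern"] with file["value"], if present.

    Assumption: plain string substitution; the real component may differ.
    """
    pattern = file["pattern"]
    if "{value}" in pattern:
        return pattern.replace("{value}", str(file.get("value", "")))
    return pattern


def as_output(paths: Iterable[str], generator: bool = False) -> Union[Iterator[str], List[str]]:
    """Return a lazy generator when generator=True, otherwise a list."""
    if generator:
        return (p for p in paths)
    return list(paths)
```

For instance, `expand_file_mask({"pattern": "sales_{value}.csv", "value": "2024-01"})` returns `"sales_2024-01.csv"`; the file name here is made up for illustration.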
Returns the list of files in a directory.
Example:
```yaml
FileList:
  directory: /home/ubuntu/symbits/bayardad/files/job_advertising/bulk/
  pattern: '*.csv'
  iterate: true
```
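To give a feel for what a configuration like this asks for, the sketch below lists the files matching a pattern and applies a handler to each, optionally in a thread pool as the `parallelize` and `num_threads` properties describe. It is a standalone approximation using only the Python standard library. `process_files` and `handler` are hypothetical names, and flowtask's own scheduling may work differently.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List, TypeVar

T = TypeVar("T")


def process_files(directory: str, pattern: str, handler: Callable[[Path], T],
                  parallelize: bool = False, num_threads: int = 10) -> List[T]:
    """Apply `handler` to each matching file, sequentially or in a thread pool."""
    files = sorted(p for p in Path(directory).glob(pattern) if p.is_file())
    if not parallelize:
        # Sequential path: one file at a time, in sorted order.
        return [handler(p) for p in files]
    # Executor.map preserves input order, so results line up with `files`.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(handler, files))
```

Threads suit I/O-bound handlers such as reading CSV files; CPU-bound work would usually call for processes instead.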