flowtask.components.PandasIterator

PandasIterator

PandasIterator(loop=None, job=None, stat=None, **kwargs)

Bases: IteratorBase

Overview

    This component converts the incoming data to a pandas DataFrame, then iterates over it and processes each row.

.. table:: Properties
   :widths: auto

   +--------------+----------+------------------------------------------------------------------------+
   | Name         | Required | Summary                                                                |
   +==============+==========+========================================================================+
   | columns      |   Yes    | Names of the columns to extract.                                       |
   +--------------+----------+------------------------------------------------------------------------+
   | vars         |   Yes    | Organizes the names of the columns by id.                              |
   +--------------+----------+------------------------------------------------------------------------+
   | parallelize  |   No     | If True, the iterator will process rows in parallel. Default is False. |
   +--------------+----------+------------------------------------------------------------------------+
   | num_threads  |   No     | Number of threads to use if parallelize is True. Default is 10.        |
   +--------------+----------+------------------------------------------------------------------------+
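
The `parallelize` and `num_threads` properties suggest a thread-pool model for per-row work. The following is a minimal sketch of that idea and not the component's actual implementation: `process_row`, `iterate`, and the sample rows are hypothetical, and plain dictionaries stand in for DataFrame rows so the snippet needs only the standard library.

```python
from concurrent.futures import ThreadPoolExecutor


def process_row(row):
    # Hypothetical per-row job: return a copy of the row marked as processed.
    return {**row, "done": True}


def iterate(rows, parallelize=False, num_threads=10):
    """Apply process_row to every row, optionally through a thread pool."""
    if not parallelize:
        return [process_row(row) for row in rows]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map yields results in input order, whatever the completion order.
        return list(pool.map(process_row, rows))


rows = [
    {"formid": 10, "orgid": 1},
    {"formid": 11, "orgid": 1},
    {"formid": 20, "orgid": 2},
]

sequential = iterate(rows)
parallel = iterate(rows, parallelize=True, num_threads=2)
```

Because `Executor.map` preserves input order, both calls produce the same list. Threads mainly pay off when the per-row job is I/O-bound, such as a request or query per row.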

Returns
-------
This component returns the processed pandas DataFrame after iterating
through the rows and applying the specified jobs.


Example:

```yaml
PandasIterator:
  columns:
  - formid
  - orgid
  vars:
    form: '{orgid}/{formid}'
```
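
In the example above, `vars` appears to define a variable `form` as a template over the listed columns. The source does not spell out the substitution rule, so the sketch below assumes the placeholders are filled with each row's values in the manner of Python's `str.format`. The helper `expand_vars` and the sample rows are hypothetical, and dictionaries stand in for DataFrame rows.

```python
columns = ["formid", "orgid"]
templates = {"form": "{orgid}/{formid}"}


def expand_vars(row, columns, templates):
    """Fill each template with the values of the selected columns (assumed behavior)."""
    # Keep only the configured columns, mirroring the `columns` property.
    values = {name: row[name] for name in columns}
    return {key: template.format(**values) for key, template in templates.items()}


rows = [
    {"formid": 2662, "orgid": 71, "extra": "ignored"},
    {"formid": 2663, "orgid": 71, "extra": "ignored"},
]

expanded = [expand_vars(row, columns, templates) for row in rows]
```

Under this assumption, the first row would yield `{"form": "71/2662"}`, which could then be passed to the job created for that row.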

createJob

createJob(target, params, row)

Create the Job Component.

run async

run()

Async Run Method.

start async

start(**kwargs)

Obtain the pandas DataFrame.