FilterRows
flowtask.components.FilterRows.FilterRows
Bases: FlowComponent
Overview
The FilterRows class is a component that removes or cleans rows in a pandas DataFrame according to specified criteria.
It supports several cleaning and filtering operations and can save rejected rows to a file.
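The idea of keeping filtered rows while setting rejected rows aside can be sketched in plain pandas. This is only an illustration under stated assumptions: the helper `split_rejected`, the sample columns, and the "empty value" rejection rule are hypothetical and are not taken from the FilterRows implementation.

```python
import io

import pandas as pd


def split_rejected(df: pd.DataFrame, column: str):
    """Hypothetical helper: split rows on whether `column` is empty."""
    # A row is rejected when the value is missing or only whitespace.
    rejected_mask = df[column].isna() | (df[column].astype(str).str.strip() == "")
    return df[~rejected_mask], df[rejected_mask]


df = pd.DataFrame(
    {
        "payroll_id": ["1", "2", "3"],
        "updated": ["2024-01-01", None, ""],
    }
)
kept, rejected = split_rejected(df, "updated")

# Write the rejected rows somewhere for later inspection.
# An in-memory buffer stands in for a real file path here.
buffer = io.StringIO()
rejected.to_csv(buffer, index=False)
```

In a real pipeline the buffer would be replaced by a file path; how FilterRows itself names or formats that file is not specified here.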
.. table:: Properties
   :widths: auto

   +-------------------+----------+----------------------------------------------------------------------------------------+
   | Name              | Required | Description                                                                            |
   +===================+==========+========================================================================================+
   | fields            | Yes      | A dictionary defining the fields and corresponding filtering conditions to be applied. |
   +-------------------+----------+----------------------------------------------------------------------------------------+
   | filter_conditions | Yes      | A dictionary defining the filter conditions for transformations.                       |
   +-------------------+----------+----------------------------------------------------------------------------------------+
   | _applied          | No       | A list to store the applied filters.                                                   |
   +-------------------+----------+----------------------------------------------------------------------------------------+
   | multi             | No       | Whether multiple DataFrame transformations are supported; defaults to False.           |
   +-------------------+----------+----------------------------------------------------------------------------------------+

Return
The methods in this class manage the filtering of rows in a Pandas DataFrame, including initialization, execution,
and result handling.
Example:
```yaml
FilterRows:
  filter_conditions:
    clean_empty:
      columns:
        - updated
    drop_columns:
      columns:
        - legal_street_address_1
        - legal_street_address_2
        - work_location_address_1
        - work_location_address_2
        - birth_date
    suppress:
      columns:
        - payroll_id
        - reports_to_payroll_id
      pattern: (\.0)
  drop_empty: true
```
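To make the example configuration more concrete, the following sketch shows roughly equivalent pandas operations on a small, made-up DataFrame. The meaning given to each condition is an assumption inferred from its name (`clean_empty` drops rows with empty values in the listed columns, `drop_columns` removes columns, `suppress` strips text matching `pattern`, `drop_empty` drops fully empty rows); the actual FilterRows behavior may differ.

```python
import pandas as pd

# Made-up sample data using a few column names from the YAML example.
df = pd.DataFrame(
    {
        "payroll_id": ["1001.0", "1002.0", "1003.0"],
        "reports_to_payroll_id": ["2001.0", "2001.0", "2002.0"],
        "birth_date": ["1990-01-01", "1985-06-15", "1979-12-31"],
        "updated": ["2024-01-01", None, "2024-02-01"],
    }
)

# clean_empty (assumed meaning): drop rows where "updated" is missing.
df = df.dropna(subset=["updated"])

# drop_columns (assumed meaning): remove the listed columns if present.
df = df.drop(columns=["birth_date"], errors="ignore")

# suppress (assumed meaning): remove text matching the regex pattern.
for col in ["payroll_id", "reports_to_payroll_id"]:
    cleaned = df[col].astype(str).str.replace(r"(\.0)", "", regex=True)
    df = df.assign(**{col: cleaned})

# drop_empty (assumed meaning): drop rows in which every value is missing.
df = df.dropna(how="all")
```

Note that the pattern `(\.0)` is unanchored, so it removes every `.0` occurrence in a value, not only a trailing one.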