tExplode

flowtask.components.tExplode

tExplode

tExplode(loop=None, job=None, stat=None, **kwargs)

Bases: FlowComponent

Overview

    tExplode is a component that transforms a DataFrame by exploding a column of lists or dictionaries into
    multiple rows. It can optionally drop the original column after exploding and expand nested dictionaries
    into separate columns.
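
The row-expansion step described above can be sketched with plain pandas. This is an illustrative sketch only, not the component's actual implementation: the sample data and the use of `DataFrame.explode` are assumptions made for the example.

```python
import pandas as pd

# Hypothetical input: each row carries a list of review dictionaries.
df = pd.DataFrame(
    {
        "id": [1, 2],
        "name": ["Alpha", "Beta"],
        "reviews": [
            [{"stars": 5, "text": "great"}, {"stars": 3, "text": "ok"}],
            [{"stars": 4, "text": "good"}],
        ],
    }
)

# One output row per list element; values in the other columns are repeated.
exploded = df.explode("reviews", ignore_index=True)
print(exploded)
```

Here the two input rows become three output rows, and `id` and `name` are carried along with each exploded element.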

.. table:: Properties
    :widths: auto

    +-------------------+----------+--------------------------------------------------------------------------------------------+
    | Name              | Required | Summary                                                                                    |
    +===================+==========+============================================================================================+
    | column            | Yes      | The name of the column to explode into multiple rows.                                      |
    +-------------------+----------+--------------------------------------------------------------------------------------------+
    | drop_original     | No       | Boolean indicating if the original column should be dropped after exploding.               |
    +-------------------+----------+--------------------------------------------------------------------------------------------+
    | explode_dataset   | No       | Boolean specifying if nested dictionaries in the column should be expanded as new columns. |
    +-------------------+----------+--------------------------------------------------------------------------------------------+
    | advanced_mode     | No       | Boolean enabling enhanced features: preserve empty lists, propagate parent columns.        |
    +-------------------+----------+--------------------------------------------------------------------------------------------+
    | propagate_columns | No       | List of column names to propagate from parent to child rows (only in advanced_mode).       |
    +-------------------+----------+--------------------------------------------------------------------------------------------+

Returns

    This component returns a DataFrame with the specified column exploded into multiple rows. If `explode_dataset` is
    set to True and the column contains dictionaries, these are expanded into new columns. Metrics on the row count
    after explosion are recorded, and any errors encountered during processing are logged and raised as exceptions.
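
The effect of `explode_dataset` on a column of dictionaries can be approximated as follows. This is a sketch under stated assumptions: the sample data, the use of `pd.json_normalize`, and the `rows_after` variable are illustrative and are not taken from the component's source.

```python
import pandas as pd

# Hypothetical, already-exploded data: one dictionary per row.
df = pd.DataFrame(
    {
        "id": [1, 1, 2],
        "reviews": [
            {"stars": 5, "text": "great"},
            {"stars": 3, "text": "ok"},
            {"stars": 4, "text": "good"},
        ],
    }
)

# Turn each dictionary key into its own column.
expanded = pd.json_normalize(df["reviews"].tolist())

# Replace the dictionary column with the expanded columns
# (comparable to combining explode_dataset with drop_original).
result = pd.concat([df.drop(columns=["reviews"]), expanded], axis=1)

# A stand-in for the row-count metric mentioned above.
rows_after = len(result)
print(result)
```

The concatenation relies on both frames sharing the same default integer index, which holds here because the input was built with a fresh index.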


Example:

```yaml
tExplode:
  column: reviews
  drop_original: false
  advanced_mode: true
  propagate_columns: ["id", "name"]
```