flowtask.components.OpenWithBase

OpenWithBase

OpenWithBase(loop=None, job=None, stat=None, **kwargs)

Bases: FlowComponent

Overview

    Abstract component for opening files into DataFrames.
    Supports file types such as CSV, Excel, and XML.

.. table:: Properties
    :widths: auto

    +--------------+----------+-------------------------------------------------------------------+
    | Name         | Required | Summary                                                           |
    +==============+==========+===================================================================+
    | directory    |   No     | The directory where the files are located.                        |
    +--------------+----------+-------------------------------------------------------------------+
    | filename     |   No     | The name of the file to be opened. Supports glob patterns.        |
    +--------------+----------+-------------------------------------------------------------------+
    | file         |   No     | A dictionary containing the file patterns to be used.             |
    +--------------+----------+-------------------------------------------------------------------+
    | mime         |   No     | The MIME type of the file. Default is "text/csv".                 |
    +--------------+----------+-------------------------------------------------------------------+
    | separator    |   No     | The delimiter to be used in CSV files. Default is ",".            |
    +--------------+----------+-------------------------------------------------------------------+
    | encoding     |   No     | The encoding of the file.                                         |
    +--------------+----------+-------------------------------------------------------------------+
    | datatypes    |   No     | Specifies the datatypes to be used for columns.                   |
    +--------------+----------+-------------------------------------------------------------------+
    | parse_dates  |   No     | Specifies columns to be parsed as dates.                          |
    +--------------+----------+-------------------------------------------------------------------+
    | filter_nan   |   No     | If True, filters out NaN values. Default is True.                 |
    +--------------+----------+-------------------------------------------------------------------+
    | na_values    |   No     | List of strings to recognize as NaN. Default is ["NULL", "TBD"].  |
    +--------------+----------+-------------------------------------------------------------------+
    | clean_nat    |   No     | If True, cleans Not-A-Time (NaT) values.                          |
    +--------------+----------+-------------------------------------------------------------------+
    | no_multi     |   No     | If True, disables multi-threading.                                |
    +--------------+----------+-------------------------------------------------------------------+
    | flavor       |   No     | Specifies the database flavor to be used for column information.  |
    +--------------+----------+-------------------------------------------------------------------+
    | force_map    |   No     | If True, forces the use of a mapping file.                        |
    +--------------+----------+-------------------------------------------------------------------+
    | skipcols     |   No     | List of columns to be skipped.                                    |
    +--------------+----------+-------------------------------------------------------------------+
    | pd_args      |   No     | Additional arguments to be passed to pandas read functions.       |
    +--------------+----------+-------------------------------------------------------------------+
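
Several of these properties resemble keyword arguments of ``pandas.read_csv``. The sketch below shows one way a concrete subclass might translate them. The helper ``build_read_csv_kwargs`` and the mapping itself are assumptions made for illustration, not Flowtask's actual implementation; only the property names and the documented defaults come from the table.

```python
from typing import Any


def build_read_csv_kwargs(props: dict[str, Any]) -> dict[str, Any]:
    """Translate component-style properties into pandas.read_csv keyword arguments.

    Hypothetical helper: the mapping is an assumption, not Flowtask's own code.
    """
    kwargs: dict[str, Any] = {
        # Defaults taken from the properties table.
        "sep": props.get("separator", ","),
        "na_values": props.get("na_values", ["NULL", "TBD"]),
    }
    if "encoding" in props:
        kwargs["encoding"] = props["encoding"]
    if "datatypes" in props:
        kwargs["dtype"] = props["datatypes"]
    if "parse_dates" in props:
        kwargs["parse_dates"] = props["parse_dates"]

    skip = set(props.get("skipcols", []))
    if skip:
        # read_csv accepts a callable for usecols; keep every column not skipped.
        kwargs["usecols"] = lambda name: name not in skip

    # Apply pd_args last so explicit pandas arguments override the mapping above.
    kwargs.update(props.get("pd_args", {}))
    return kwargs


# Example property set (the values are illustrative).
props = {
    "separator": "|",
    "encoding": "utf-8",
    "datatypes": {"store_id": "str"},
    "parse_dates": ["created_at"],
    "skipcols": ["comments"],
    "pd_args": {"skiprows": 1},
}
kwargs = build_read_csv_kwargs(props)
```

The resulting dictionary could then be passed on as ``pandas.read_csv(path, **kwargs)``. Applying ``pd_args`` last is a design choice of this sketch, so that caller-supplied pandas arguments take precedence.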

Returns

This component opens files and prepares them for further processing. Because the class is abstract, the actual
return type depends on the concrete subclass; typically it is a list of filenames or the loaded file data.
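
As an illustration of the "list of filenames" case, the sketch below resolves a ``directory`` and a glob-style ``filename`` into a sorted list of paths using only the standard library. The function ``resolve_files`` is hypothetical and is not part of Flowtask; how the component itself expands patterns may differ.

```python
import tempfile
from pathlib import Path


def resolve_files(directory: str, filename: str) -> list[Path]:
    """Return the files in `directory` matching `filename`, which may be a glob pattern."""
    return sorted(p for p in Path(directory).glob(filename) if p.is_file())


# Demonstration in a throwaway directory holding two CSV files and one text file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.csv", "a.csv", "notes.txt"):
        Path(tmp, name).write_text("id,value\n1,2\n")
    names = [p.name for p in resolve_files(tmp, "*.csv")]
```

Sorting the matches keeps the processing order deterministic, which helps when several files are later concatenated.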