Ungzip
flowtask.components.UnGzip
Bases: CompressSupport, FileCopy
Overview
The UnGzip component decompresses Gzip (.gz) files and compressed tarballs (e.g., .tar.gz, .tar.bz2, .tar.xz).
It extracts the specified archive into a target directory and can optionally delete the source file
after extraction.

.. table:: Properties
   :widths: auto

   +----------------+----------+--------------------------------------------------------------------------+
   | Name           | Required | Summary                                                                  |
   +================+==========+==========================================================================+
   | filename       | Yes      | The path to the Gzip file to uncompress.                                 |
   +----------------+----------+--------------------------------------------------------------------------+
   | directory      | Yes      | The target directory where files will be extracted.                      |
   +----------------+----------+--------------------------------------------------------------------------+
   | delete_source  | No       | Boolean indicating if the source file should be deleted post-extraction. |
   +----------------+----------+--------------------------------------------------------------------------+
   | extract        | No       | Dictionary specifying filenames to extract and/or output directory.      |
   +----------------+----------+--------------------------------------------------------------------------+

Returns
This component extracts files from a specified Gzip or tarball archive into the designated directory
and returns a list of paths to the extracted files. It tracks metrics for the output directory and the source
Gzip file. If configured, the original compressed file is deleted after extraction. Errors encountered during
extraction or directory creation are logged and raised as exceptions.
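The behaviour described above can be approximated with the Python standard library. The following is a minimal sketch, not the component's actual implementation: the function name `ungzip` and its signature are hypothetical, and it assumes a plain `.gz` file is written out under its name minus the `.gz` suffix.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def ungzip(source: Path, directory: Path, delete_source: bool = False) -> list[Path]:
    """Hypothetical helper: extract `source` into `directory` and return the extracted paths."""
    # Create the target directory if it does not exist yet.
    directory.mkdir(parents=True, exist_ok=True)
    if tarfile.is_tarfile(source):
        # Tarballs (.tar.gz, .tar.bz2, .tar.xz): "r:*" auto-detects the compression.
        # Note: archives from untrusted sources should be validated before extraction.
        with tarfile.open(source, "r:*") as archive:
            members = [m for m in archive.getmembers() if m.isfile()]
            archive.extractall(directory, members=members)
            extracted = [directory / m.name for m in members]
    else:
        # Plain .gz: decompress the single stream into a file named without ".gz".
        target = directory / source.with_suffix("").name
        with gzip.open(source, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        extracted = [target]
    if delete_source:
        # Remove the original archive only after extraction succeeded.
        source.unlink()
    return extracted
```

Any exception raised by `tarfile`, `gzip`, or the filesystem propagates to the caller in this sketch, which loosely mirrors the "logged and raised" behaviour described above; logging and metrics are left out.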
Example:
```yaml
UnGzip:
  source:
    directory: /home/ubuntu/symbits/mso/files/commissions_statements/pr/
    filename: STATEMENT_STATEMENT-*.CSV.gz
  destination:
    directory: /home/ubuntu/symbits/mso/files/commissions_statements/pr/
  delete_source: true
```
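The `filename` in this example contains a `*` wildcard. How the component resolves it is not stated here. As an illustration only, and assuming ordinary glob semantics, matching such a pattern against a directory could look like this (`resolve_sources` is a hypothetical helper, not part of the component):

```python
from pathlib import Path


def resolve_sources(directory: str, pattern: str) -> list[Path]:
    """Hypothetical helper: list the files in `directory` whose names match `pattern`."""
    # Path.glob expands shell-style wildcards; sorting makes the order deterministic.
    return sorted(Path(directory).glob(pattern))
```

Under that assumption, each matching file would then be extracted in turn.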