Extracthtml

flowtask.components.ExtractHTML

ExtractHTML

ExtractHTML(loop=None, job=None, stat=None, **kwargs)

Bases: FlowComponent, PandasDataframe

Overview:
Extracts content from HTML using XPath expressions or BeautifulSoup (BS) CSS selectors.



Example:

```yaml
ExtractHTML:
  custom_parser: trustpilot_reviews
  as_dataframe: true
```
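
The overview names two selection styles, XPath and BeautifulSoup CSS selectors. The sketch below shows how they differ on a small inline document. It calls `lxml` and `bs4` directly rather than ExtractHTML itself, and the sample markup and variable names are assumptions made for illustration only.

```python
from bs4 import BeautifulSoup
from lxml import html as lxml_html

# Hypothetical sample markup, not taken from any real page.
SAMPLE = """
<html><body>
  <div class="review"><span class="author">Ana</span><p>Great service</p></div>
  <div class="review"><span class="author">Luis</span><p>Slow delivery</p></div>
</body></html>
"""

# XPath: parse with lxml and query the element tree with a path expression.
tree = lxml_html.fromstring(SAMPLE)
authors_xpath = tree.xpath('//div[@class="review"]/span[@class="author"]/text()')

# CSS selectors: parse with BeautifulSoup and query with select().
soup = BeautifulSoup(SAMPLE, "html.parser")
authors_css = [
    tag.get_text(strip=True) for tag in soup.select("div.review > span.author")
]

print(authors_xpath)
print(authors_css)
```

Both queries target the same elements; which style is more convenient usually depends on how the page is structured.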

get_soup

get_soup(content, parser='html.parser')

Returns a BeautifulSoup object built from content, using the given parser (html.parser by default).
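
A minimal standalone sketch of what a helper with this signature might do, assuming it simply hands the content and the parser name to the `BeautifulSoup` constructor. The real method is bound to the component and may do more; the function body and the sample markup here are assumptions.

```python
from bs4 import BeautifulSoup


def get_soup(content, parser="html.parser"):
    """Standalone sketch: parse markup into a BeautifulSoup object.

    Assumption: mirrors the documented signature only; the component's
    actual implementation is not shown in this page.
    """
    return BeautifulSoup(content, parser)


# Hypothetical input used only to exercise the sketch.
soup = get_soup("<ul><li>one</li><li>two</li></ul>")
items = [li.get_text() for li in soup.find_all("li")]
print(items)
```

`html.parser` ships with the Python standard library. Other parser names accepted by BeautifulSoup, such as `lxml` or `html5lib`, require the corresponding package to be installed.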