Docx
flowtask.components.LangchainLoader.loaders.docx
MSWordLoader
MSWordLoader(tokenizer=None, text_splitter=None, summarizer=None, markdown_splitter=None, source_type='file', doctype='document', device=None, cuda_number=0, llm=None, **kwargs)
Bases: AbstractLoader
Load Microsoft Word (.docx) files as LangChain Documents.
extract_text
Extract text from a docx file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `Path` | The source of the data. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| | `str` | The extracted text. |
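
To make the `Path` in, `str` out contract of `extract_text` concrete, here is a minimal standard-library sketch of pulling text out of a .docx file. It is not the loader's actual implementation, which this page does not show. The function name `extract_text_sketch` is hypothetical, and the sketch assumes only that a .docx file is a ZIP archive whose body is stored in `word/document.xml`.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_text_sketch(path: Path) -> str:
    """Hypothetical stand-in for extract_text (not the flowtask code)."""
    # A .docx file is a ZIP archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{W_NS}p"):
        # A paragraph (w:p) is split into runs; join the text of its w:t nodes.
        paragraphs.append(
            "".join(node.text or "" for node in para.iter(f"{W_NS}t"))
        )
    # One line of output per paragraph, in document order.
    return "\n".join(paragraphs)
```

Calling `extract_text_sketch(Path("report.docx"))` (a placeholder file name) returns the document body as a single string with one line per paragraph. Headers, footers, and footnotes live in other parts of the archive and are ignored by this sketch.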