Python | Working with the .docx module

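A Word document contains formatted text wrapped in three levels of objects: Run objects at the lowest level, Paragraph objects in the middle, and the Document object at the top. Because of this structure, we cannot work with these documents in a plain text editor. We can, however, manipulate them in Python with the python-docx module.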


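1. The first step is to install the third-party module python-docx. You can install it with pip ("pip install python-docx") or download it from the project's GitHub repository.

2. After installation, import "docx", NOT "python-docx".

3. Use the "docx.Document" class to start working with a Word document.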
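The difference between the install name and the import name can be checked with a short sketch like the one below. The helper name describe_python_docx is hypothetical, and the code uses only the standard library, so it runs whether or not the package is installed.

```python
from importlib import metadata, util


def describe_python_docx():
    """Report whether the 'docx' module is importable and which version is installed."""
    # The import name is 'docx' ...
    if util.find_spec("docx") is None:
        return "module 'docx' is not importable; try: pip install python-docx"
    # ... while the installed distribution is named 'python-docx'
    try:
        version = metadata.version("python-docx")
    except metadata.PackageNotFoundError:
        version = "unknown"
    return f"import name 'docx', distribution 'python-docx', version {version}"


print(describe_python_docx())
```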

Code #1:

# import docx, NOT python-docx
import docx
# create an instance of a Word document
doc = docx.Document()
# add a heading of level 0 (largest heading)
doc.add_heading('Heading for the document', 0)
# add a paragraph and store
# the object in a variable
doc_para = doc.add_paragraph('Your paragraph goes here, ')
# add runs, i.e. pieces of text with their own
# style such as bold, italic, underline, etc.
doc_para.add_run('hey there, bold here').bold = True
doc_para.add_run(', and ')
doc_para.add_run('these words are italic').italic = True
# add a page break to start a new page
doc.add_page_break()
# add a heading of level 2
doc.add_heading('Heading level 2', 2)
# pictures can also be added to our Word document
# (the width argument is optional)
doc.add_picture('path_to_picture')
# now save the document to a location
doc.save('path_to_document')


Output:

[Images: the generated document]

Note the page break that starts the second page.

Code #2: To open an existing Word document, create a Document instance and pass it the path to the document.

# import the Document class
# from the docx module
from docx import Document
# create an instance of the
# Word document we want to open
doc = Document('path_to_the_document')
# print the list of paragraphs in the document
print('List of paragraph objects:->>>')
print(doc.paragraphs)
# print the list of runs
# in a specified paragraph
print('List of runs objects in 1st paragraph:->>>')
print(doc.paragraphs[0].runs)
# print the text in a paragraph
print('Text in the 1st paragraph:->>>')
print(doc.paragraphs[0].text)
# print the complete document, one paragraph per line
print('The whole content of the document:->>>')
for para in doc.paragraphs:
    print(para.text)


Output:

List of paragraph objects:->>>
[<docx.text.paragraph.Paragraph object at 0x7f45b22dc128>,
<docx.text.paragraph.Paragraph object at 0x7f45b22dc5c0>,
<docx.text.paragraph.Paragraph object at 0x7f45b22dc0b8>,
<docx.text.paragraph.Paragraph object at 0x7f45b22dc198>,
<docx.text.paragraph.Paragraph object at 0x7f45b22dc0f0>]

List of runs objects in 1st paragraph:->>>
[<docx.text.run.Run object at 0x7f45b22dc198>]

Text in the 1st paragraph:->>>
Heading for the document

The whole content of the document:->>>

Heading for the document
Your paragraph goes here, hey there, bold here, and these words are italic


Heading level 2

Reference: https://python-docx.readthedocs.io/en/latest/#user-guide
