USING IRONPDF FOR PYTHON

How to Edit a PDF File in Python

Updated September 28, 2024

Introduction

Iron Software's IronPDF for Python library simplifies PDF editing tasks in Python. It covers inserting signatures, adding HTML footers, embedding watermarks, including annotations, and editing existing PDF files. The library aims to keep your code readable, supports creating PDFs programmatically, makes debugging straightforward, and deploys across all compatible platforms and environments.

This tutorial walks through these features with Python code examples and explanations. By the end of this guide, you'll have a solid understanding of how to use IronPDF for Python for your PDF editing needs.

How to Edit PDF Files in Python

  1. Install the Python PDF Library using pip.
  2. Apply the License Key for the Python PDF Library.
  3. Load the PDF Document for editing.
  4. Edit the PDF document using different options such as Split, Copy Pages, and other PDF Operations.
  5. Save the modified file using the SaveAs function.
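
The five steps above can be sketched as one small function. This is a minimal sketch under stated assumptions, not the library's canonical workflow: the function name `remove_last_page` and its parameters are hypothetical, `License.LicenseKey` is assumed to be the property that accepts the license key, and importing `License` and `PdfDocument` by name (instead of `from ironpdf import *`) is also an assumption. The import sits inside the function so that defining it does not require IronPDF to be installed.

```python
# Step 1 (run once in a shell, outside Python): pip install ironpdf


def remove_last_page(input_path, output_path, license_key):
    """Load a PDF, drop its last page, and save the result (hypothetical helper)."""
    # Imported lazily so this module can be loaded before IronPDF is installed.
    from ironpdf import License, PdfDocument

    # Step 2: apply the license key (assumed property name).
    License.LicenseKey = license_key

    # Step 3: load the PDF document for editing.
    pdf = PdfDocument.FromFile(input_path)

    # Step 4: edit the document; here, remove the last page (zero-based index).
    pdf.RemovePage(pdf.PageCount - 1)

    # Step 5: save the modified file.
    pdf.SaveAs(output_path)
    return pdf
```

A call might look like `remove_last_page("report.pdf", "Report-Minus-1.pdf", "YOUR-LICENSE-KEY")`, where the license key string is a placeholder.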

Edit Document Structure

Manipulate Pages

IronPDF simplifies the process of adding pages at specific positions, extracting specific pages or a range of pages, and removing pages from any PDF. It handles all the complex processes for you, making it easy to perform these tasks efficiently.

Add Pages

You can add pages to PDF documents by specifying the page content, size, and position. After making the desired changes, you can save the output PDF file using the SaveAs function.

from ironpdf import *

# Enable debug logging and set a log path
Logger.EnableDebugging = True
Logger.LogFilePath = "Custom.log"
Logger.LoggingMode = Logger.LoggingModes.All

pdf = PdfDocument("C:\\Users\\Administrator\\Downloads\\Documents\\sample.pdf")
renderer = ChromePdfRenderer()
coverPagePdf = renderer.RenderHtmlAsPdf("<h1>Cover Page</h1><hr>")
pdf.PrependPdf(coverPagePdf)
pdf.SaveAs("report_with_cover.pdf")

Copy Pages

You can copy pages from one PDF document to another existing PDF file by specifying the page number and destination. Additionally, you have the option to create a new PDF file from the copied PDF pages. It is also possible to select one page or multiple pages from a single PDF file for copying.

from ironpdf import *

pdf = PdfDocument("C:\\Users\\Administrator\\Downloads\\Documents\\sample.pdf")
# Copy pages 3 to 5 (zero-based indexes 2 to 4) and save them as a new document.
pdf.CopyPages(2, 4).SaveAs("report_highlight.pdf")
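
Because the examples in this article address pages by zero-based index (CopyPage(0) is the first page), it is easy to be off by one when starting from the page numbers a reader sees. The helper below is a minimal sketch of that conversion; `pages_to_indexes` is a hypothetical name, not part of IronPDF.

```python
def pages_to_indexes(first_page, last_page):
    """Convert a 1-based, inclusive page range to 0-based, inclusive indexes."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return first_page - 1, last_page - 1


start, end = pages_to_indexes(5, 7)
print(start, end)  # 4 6
```

The resulting pair can then be passed on, for example as `pdf.CopyPages(start, end)`.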

Delete Pages

You can delete pages from the input PDF file by specifying the page number.

from ironpdf import *

pdf = PdfDocument("report.pdf")
pdf.RemovePage(pdf.PageCount-1)
pdf.SaveAs("Report-Minus-1.pdf")
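
When several pages have to go, the order of removal matters if pages are removed one at a time: assuming each removal shifts the later pages down by one (as with a Python list), removing from the highest index first keeps the remaining indexes valid. The sketch below illustrates the idea with a plain list standing in for the document; `removal_order` is a hypothetical helper.

```python
def removal_order(indexes):
    """Return unique zero-based page indexes sorted from highest to lowest."""
    return sorted(set(indexes), reverse=True)


# A plain list stands in for the pages of a document.
pages = ["cover", "intro", "body", "appendix", "index"]
for i in removal_order([1, 3, 3]):
    del pages[i]  # with IronPDF this would be pdf.RemovePage(i)

print(pages)  # ['cover', 'body', 'index']
```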

Merge and Split PDFs

IronPDF's user-friendly API makes it easy to combine multiple PDFs into one or break down an existing PDF into separate files.

Join Multiple Existing PDFs into a Single PDF Document

You can join multiple PDF documents into a single document by passing the input documents to PdfDocument.Merge and saving the result.

from ironpdf import *

html_a = """<p> [PDF_A] </p>
            <p> [PDF_A] 1st Page </p>
            <div style='page-break-after: always;'></div>
            <p> [PDF_A] 2nd Page</p>"""

html_b = """<p> [PDF_B] </p>
            <p> [PDF_B] 1st Page </p>
            <div style='page-break-after: always;'></div>
            <p> [PDF_B] 2nd Page</p>"""

renderer = ChromePdfRenderer()

pdfdoc_a = renderer.RenderHtmlAsPdf(html_a)
pdfdoc_b = renderer.RenderHtmlAsPdf(html_b)
merged = PdfDocument.Merge(pdfdoc_a, pdfdoc_b)

merged.SaveAs("Merged.pdf")
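
The example above merges exactly two documents. For a longer list, one option is to fold the list pairwise with a two-argument merge function. This is a sketch only: `merge_all` is a hypothetical helper, and lists of page labels stand in for PDF documents so the snippet runs without IronPDF.

```python
from functools import reduce


def merge_all(documents, merge_two):
    """Fold a non-empty sequence of documents into one using a two-argument merge."""
    documents = list(documents)
    if not documents:
        raise ValueError("need at least one document")
    return reduce(merge_two, documents)


# Lists of page labels stand in for PDF documents here.
merged = merge_all([["A1", "A2"], ["B1"], ["C1"]], lambda a, b: a + b)
print(merged)  # ['A1', 'A2', 'B1', 'C1']
```

With IronPDF, the same helper could in principle be called as `merge_all([pdfdoc_a, pdfdoc_b], PdfDocument.Merge)`, assuming the two-argument Merge shown above can be passed around as an ordinary callable.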

Splitting a PDF and Extracting Pages

You can split a PDF document into multiple documents, or extract specific pages, by copying the pages you need (addressed by zero-based index) into new documents and saving each one.

from ironpdf import *

html = """<p> Hello Iron </p>
          <p> This is 1st Page </p>
          <div style='page-break-after: always;'></div>
          <p> This is 2nd Page</p>
          <div style='page-break-after: always;'></div>
          <p> This is 3rd Page</p>"""

renderer = ChromePdfRenderer()
pdf = renderer.RenderHtmlAsPdf(html)

# take the first page
page1doc = pdf.CopyPage(0)
page1doc.SaveAs("Split1.pdf")

# take pages 2 & 3
page23doc = pdf.CopyPages(1, 2)
page23doc.SaveAs("Split2.pdf")
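
To split a longer document into equal-sized chunks, the index pairs can be computed up front. The helper below is a minimal sketch; `split_ranges` is a hypothetical name, and it assumes the zero-based, inclusive ranges used by CopyPages in the example above.

```python
def split_ranges(page_count, pages_per_file):
    """Return (start, end) zero-based inclusive index pairs covering all pages."""
    if page_count < 0 or pages_per_file < 1:
        raise ValueError("page_count must be >= 0 and pages_per_file >= 1")
    return [
        (start, min(start + pages_per_file, page_count) - 1)
        for start in range(0, page_count, pages_per_file)
    ]


print(split_ranges(3, 2))  # [(0, 1), (2, 2)]
```

Each pair could then drive one copy, for example `pdf.CopyPages(start, end).SaveAs("Split1.pdf")` for the first pair.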

Edit Document Properties

Add and Use PDF Metadata

You can add and use PDF metadata with IronPDF for Python. This can be beneficial for adding copyright information, tracking changes, or simply making your PDF documents more searchable.

PDF metadata is a collection of data stored in a PDF document. This data can include the title, author, subject, keywords, creation date, and modification date of the PDF document. Additionally, it can include custom data that you add as per your requirements.

from ironpdf import *

# Open an encrypted file (alternatively, create a new PDF from HTML)
pdf = PdfDocument.FromFile("encrypted.pdf", "password")

# Edit file metadata
pdf.MetaData.Author = "Satoshi Nakamoto"
pdf.MetaData.Keywords = "SEO, Friendly"
pdf.MetaData.ModifiedDate = Now()
pdf.SaveAs("MetaData-Updated.pdf")
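
If the same metadata has to be applied to many files, the assignments can be wrapped in a small function. This is a sketch: `apply_metadata` is a hypothetical helper that only touches the Author and Keywords properties used above, and a stand-in object replaces the real PdfDocument so the snippet runs without IronPDF.

```python
from types import SimpleNamespace


def apply_metadata(pdf, author, keywords):
    """Set Author and Keywords on pdf.MetaData; keywords is a list of strings."""
    pdf.MetaData.Author = author
    pdf.MetaData.Keywords = ", ".join(keywords)
    return pdf


# Stand-in object so the sketch runs without IronPDF installed.
fake_pdf = SimpleNamespace(MetaData=SimpleNamespace(Author="", Keywords=""))
apply_metadata(fake_pdf, "Satoshi Nakamoto", ["SEO", "Friendly"])
print(fake_pdf.MetaData.Keywords)  # SEO, Friendly
```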

Digital Signatures

IronPDF allows you to digitally sign new or existing PDF files using .pfx and .p12 X509Certificate2 digital certificates. When a PDF is signed using this method, any modifications to the document would require validation with the certificate, ensuring the document's integrity.

You can find more guidance on generating a signing certificate for free with Adobe Reader on Adobe's website.

In addition to cryptographic signing, IronPDF also supports the use of a handwritten signature image or a company stamp image as an alternative way of signing the document.

from ironpdf import *

# Cryptographically sign an existing PDF in 1 line of code!
PdfSignature(r".\certificates\IronSoftware.p12", "123456").SignPdfFile("any.pdf")

##### Advanced example for more control #####

# Step 1. Create a PDF.
renderer = ChromePdfRenderer()
doc = renderer.RenderHtmlAsPdf("<h1>Testing 2048 bit digital security</h1>")

# Step 2. Create a signature.
# You may create a .pfx or .p12 PDF signing certificate using Adobe Acrobat Reader.
# Read https://helpx.adobe.com/acrobat/using/digital-ids.html
signature = PdfSignature(r"certificates\IronSoftware.pfx", "123456")

# Step 3. Optional signing options.
signature.SigningContact = "support@ironsoftware.com"
signature.SigningLocation = "Chicago, USA"
signature.SigningReason = "To show how to sign a PDF"

# Step 4. Sign the PDF with the PdfSignature. Multiple signing certificates may be used.
doc.Sign(signature)

# Step 5. The PDF is not signed until saved to file, stream or byte array.
doc.SaveAs("signed.pdf")

PDF Attachments

IronPDF lets you add attachments to your PDF documents and remove them again as needed, so you can embed extra files in a PDF and take them out later.

from ironpdf import *

# Instantiate the Renderer and create PdfDocument from HTML
renderer = ChromePdfRenderer()
my_pdf = renderer.RenderHtmlFileAsPdf("my-content.html")

# Open PDF document to be attached
pdf = PdfDocument.FromFile("new_sample.pdf")

# Here we can add an attachment with a name and a byte []
attachment1 = my_pdf.Attachments.AddAttachment("attachment_1", pdf.BinaryData)

# And here is an example of removing an attachment
my_pdf.Attachments.RemoveAttachment(attachment1)

my_pdf.SaveAs("my-content.pdf")

Compress PDFs

IronPDF can compress PDFs to reduce their file size. One method is to decrease the size of the images embedded in the PDF document using the CompressImages method.

Regarding image quality, with JPEG images, 100% quality gives you nearly no loss in the quality of the image, while 1% yields a very poor quality output. In general, an image quality of 90% or more is considered high quality. A medium-quality image lies between 80% and 90%, and a low-quality image ranges from 70% to 80%. If you go below 70%, the image quality significantly deteriorates, but this can help drastically reduce the overall file size of the PDF document.

It is recommended to try different quality percentages to find the right balance of quality and file size that suits your needs. Keep in mind that the noticeable loss in quality after reduction can vary depending on the type of image you're dealing with, as some images may lose clarity more than others.

from ironpdf import *

pdf = PdfDocument("document.pdf")

# Quality parameter can be 1-100, where 100 is 100% of original quality
pdf.CompressImages(60)
pdf.SaveAs("document_compressed.pdf")

# Second optional parameter can scale down the image resolution according to its visible size in the PDF document. Note that this may cause distortion with some image configurations
pdf.CompressImages(90, True)
pdf.SaveAs("document_scaled_compressed.pdf")
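
The quality bands described above can be written down as a small lookup, which is handy when logging or validating the value before passing it to CompressImages. This is a sketch of the article's rule of thumb only; `quality_band` and its labels are hypothetical and not part of IronPDF.

```python
def quality_band(quality):
    """Classify a JPEG quality percentage using the bands described above."""
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    if quality >= 90:
        return "high"
    if quality >= 80:
        return "medium"
    if quality >= 70:
        return "low"
    return "significantly degraded"


for q in (95, 85, 75, 60):
    print(q, quality_band(q))
```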

Editing PDF Content

Add Headers and Footers

Adding headers and footers to your PDF documents is straightforward with IronPDF. The software provides two distinct types of HeaderFooters: TextHeaderFooter and HtmlHeaderFooter. TextHeaderFooter is ideal for headers and footers that contain only text and might need to incorporate merge fields such as "{page} of {total-pages}". On the other hand, HtmlHeaderFooter is a more advanced option that can handle any HTML content and format it neatly, making it suitable for more complex headers and footers.

With IronPDF for Python, you can use HtmlHeaderFooter to build headers or footers for your PDF document from HTML. You design the header or footer in HTML, and IronPDF for Python renders it to fit your PDF.

from ironpdf import *
import os

# Instantiate Renderer
renderer = ChromePdfRenderer()

# Build a footer using html to style the text
# mergeable fields are
# {page} {total-pages} {url} {date} {time} {html-title} & {pdf-title}
renderer.RenderingOptions.HtmlFooter = HtmlHeaderFooter()
renderer.RenderingOptions.HtmlFooter.MaxHeight = 15  # millimeters
renderer.RenderingOptions.HtmlFooter.HtmlFragment = "<center><i>{page} of {total-pages}</i></center>"
renderer.RenderingOptions.HtmlFooter.DrawDividerLine = True

# Use sufficient MarginBottom to ensure that the HtmlFooter does not overlap with the main PDF page content.
renderer.RenderingOptions.MarginBottom = 25  # mm

# Build a header using an image asset
# Note the use of BaseUrl to set a relative path to the assets
renderer.RenderingOptions.HtmlHeader = HtmlHeaderFooter()
renderer.RenderingOptions.HtmlHeader.MaxHeight = 20  # millimeters
renderer.RenderingOptions.HtmlHeader.HtmlFragment = "<img src='iron.png'>"
renderer.RenderingOptions.HtmlHeader.BaseUrl = os.path.abspath("C:/Users/lyty1/OneDrive/Documents/IronPdfPythonNew")

# Use sufficient MarginTop to ensure that the HtmlHeader does not overlap with the main PDF page content.
renderer.RenderingOptions.MarginTop = 25  # mm

For text-only headers and footers, the TextHeaderFooter equivalent looks like this:

from ironpdf import *

# Initiate PDF Renderer
renderer = ChromePdfRenderer()

# Add a header to every page easily
renderer.RenderingOptions.FirstPageNumber = 1  # use 2 if a cover page will be appended
renderer.RenderingOptions.TextHeader.DrawDividerLine = True
renderer.RenderingOptions.TextHeader.CenterText = "{url}"
renderer.RenderingOptions.TextHeader.Font = FontTypes.Helvetica
renderer.RenderingOptions.TextHeader.FontSize = 12
renderer.RenderingOptions.MarginTop = 25  # create 25mm space for header

# Add a footer too
renderer.RenderingOptions.TextFooter.DrawDividerLine = True
renderer.RenderingOptions.TextFooter.Font = FontTypes.Arial
renderer.RenderingOptions.TextFooter.FontSize = 10
renderer.RenderingOptions.TextFooter.LeftText = "{date} {time}"
renderer.RenderingOptions.TextFooter.RightText = "{page} of {total-pages}"
renderer.RenderingOptions.MarginBottom = 25  # create 25mm space for footer

# Mergeable fields are
# {page} {total-pages} {url} {date} {time} {html-title} & {pdf-title}
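
To make the merge fields concrete, the snippet below imitates what a placeholder such as "{page} of {total-pages}" expands to on a given page. It is an illustration of the idea only, not IronPDF's own substitution code; `expand_merge_fields` is a hypothetical function.

```python
def expand_merge_fields(template, values):
    """Replace each {name} placeholder in template with str(values[name])."""
    result = template
    for name, value in values.items():
        result = result.replace("{" + name + "}", str(value))
    return result


footer = expand_merge_fields("{page} of {total-pages}", {"page": 3, "total-pages": 10})
print(footer)  # 3 of 10
```

Plain string replacement is used here instead of str.format because field names such as total-pages contain a hyphen, which str.format does not accept as a keyword.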

Outlines and Bookmarks

An outline, also known as a "bookmark", is a tool that helps you quickly go to important pages in a PDF document. If you're using Adobe Acrobat Reader, you can see these bookmarks (which can be organized in a hierarchy) in the app's left sidebar.

The IronPDF for Python library makes it easy to work with bookmarks. It automatically imports any existing bookmarks from PDF documents, and you can add more bookmarks, edit them, or arrange them in groups.

from ironpdf import *

# Create a new PDF or edit an existing document.
pdf = PdfDocument.FromFile("existing.pdf")

# Add bookmark
pdf.Bookmarks.AddBookMarkAtEnd("Author's Note", 2)
pdf.Bookmarks.AddBookMarkAtEnd("Table of Contents", 3)

# Store new bookmark in a variable to add nested bookmarks to
summaryBookmark = pdf.Bookmarks.AddBookMarkAtEnd("Summary", 17)

# Add a sub-bookmark within the summary
conclusionBookmark = summaryBookmark.Children.AddBookMarkAtStart("Conclusion", 18)

# Add another bookmark to end of highest-level bookmark list
pdf.Bookmarks.AddBookMarkAtEnd("References", 20)

pdf.SaveAs("existing.pdf")
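
For larger outlines, the bookmark tree can be described as data and added recursively. This is a sketch under assumptions: `add_outline` is a hypothetical helper, and it assumes that a bookmark's Children collection offers AddBookMarkAtEnd in the same way as pdf.Bookmarks (the example above only shows AddBookMarkAtStart on Children).

```python
def add_outline(bookmarks, outline):
    """Add nested bookmarks from a list of (title, page_index, children) tuples."""
    for title, page_index, children in outline:
        bookmark = bookmarks.AddBookMarkAtEnd(title, page_index)
        if children:
            # Recurse into the new bookmark's child collection.
            add_outline(bookmark.Children, children)


outline = [
    ("Author's Note", 2, []),
    ("Table of Contents", 3, []),
    ("Summary", 17, [("Conclusion", 18, [])]),
    ("References", 20, []),
]
# With IronPDF: add_outline(pdf.Bookmarks, outline)
```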

Add and Edit Annotations

You can add annotations to PDF documents and edit existing ones with IronPDF for Python. Annotations can be used to highlight text, add comments, or create links.

from ironpdf import *

# Load an existing PDF or create a new one
pdf = PdfDocument("existing.pdf")

# Create a TextAnnotation object
annotation = TextAnnotation()
annotation.Title = "This is the major title"
annotation.Subject = "This is a subtitle"
annotation.Contents = "This is the long 'sticky note' comment content..."
annotation.Icon = TextAnnotation.AnnotationIcon.Help
annotation.Opacity = 0.9
annotation.Printable = False
annotation.Hidden = False
annotation.OpenByDefault = True
annotation.ReadOnly = False
annotation.Rotateable = True

# Add the annotation to a specific page and location within the PDF
pdf.AddTextAnnotation(annotation, 1, 150, 250)

# Save the PDF
pdf.SaveAs("existing.pdf")

Add Backgrounds and Foregrounds

IronPDF for Python lets you add backgrounds and foregrounds to PDF documents. This can be useful for adding watermarks, creating custom templates, or simply making your PDF documents look more visually appealing. You can use images, colors, or gradients as backgrounds or foregrounds.

from ironpdf import *

# Instantiate Renderer
renderer = ChromePdfRenderer()

# Render a PDF from a URL
pdf = renderer.RenderUrlAsPdf("https://www.nuget.org/packages/IronPdf")

# Add a PDF as background
pdf.AddBackgroundPdf("MyBackground.pdf")

# Add a PDF as foreground overlay to the first page
pdf.AddForegroundOverlayPdfToPage(0, "MyForeground.pdf", 0)

# Save the merged PDF
pdf.SaveAs("Complete.pdf")

Stamping and Watermarking

IronPDF for Python allows you to stamp and watermark PDF documents. This can be useful for adding copyright information, preventing unauthorized copying, or simply making your PDF documents look more professional. You can stamp PDF documents with text, images, or watermarks. You can also control the size, position, and opacity of stamps and watermarks.

Apply Stamp onto a PDF

You can apply a stamp to a PDF document with IronPDF for Python. This can be useful for adding a logo, a signature, or other identifying information to a PDF document. You can choose the stamp type, position, and size. You can also set the stamp opacity.

from ironpdf import *

# Create an HtmlStamper object to stamp an image onto a PDF
stamper = HtmlStamper("<img src='https://ironpdf.com/img/products/ironpdf-logo-text-dotnet.svg'/>")
stamper.HorizontalAlignment = HorizontalAlignment.Center
stamper.VerticalAlignment = VerticalAlignment.Bottom
stamper.IsStampBehindContent = False
stamper.Opacity = 30

# Load an existing PDF document and apply the stamp to it
pdf = PdfDocument.FromFile("Sample.pdf")
pdf.ApplyStamp(stamper).SaveAs("stampedimage.pdf")

Add a Watermark to a PDF

IronPDF for Python lets you add a watermark to a PDF document. This can be useful for preventing unauthorized copying or simply making your PDF documents look more professional. You can choose the watermark text, font, size, and color. You can also set the watermark opacity.

from ironpdf import *

# Instantiate the Renderer and create PdfDocument from URL
renderer = ChromePdfRenderer()
pdf = renderer.RenderUrlAsPdf("https://www.nuget.org/packages/IronPdf")

# Apply watermark
pdf.ApplyWatermark("<h2 style='color:red'>SAMPLE</h2>", 30, VerticalAlignment.Middle, HorizontalAlignment.Center)

# Save your new PDF
pdf.SaveAs("Watermarked.pdf")
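
Since the watermark is passed as an HTML string, text that comes from user input or a database should be escaped before it is embedded. The helper below is a minimal sketch of that step; `watermark_html` is a hypothetical function, and the color argument is assumed to be a trusted CSS color value.

```python
from html import escape


def watermark_html(text, color="red"):
    """Build an <h2> watermark fragment with the text HTML-escaped."""
    return "<h2 style='color:{}'>{}</h2>".format(color, escape(text))


print(watermark_html("SAMPLE"))  # <h2 style='color:red'>SAMPLE</h2>
print(watermark_html("R&D DRAFT"))  # <h2 style='color:red'>R&amp;D DRAFT</h2>
```

The returned string can be used as the first argument of ApplyWatermark in the example above.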

Using Forms in PDFs

You can create and edit forms in PDF documents with IronPDF for Python. This can be useful for collecting data from users or simply making your PDF documents more interactive. You can add form fields such as text boxes, checkboxes, and radio buttons. You can also collect form data from users.

Create and Edit Forms

from ironpdf import *

# Step 1.  Creating a PDF with editable forms from HTML using form and input tags
# Radio Button and Checkbox can also be implemented with input type 'radio' and 'checkbox'
form_html = """
<html>
    <body>
        <h2>Editable PDF Form</h2>
        <form>
            First name: <br> <input type='text' name='firstname' value=''> <br>
            Last name: <br> <input type='text' name='lastname' value=''> <br>
            <br>
            <p>Please specify your gender:</p>
            <input type='radio' id='female' name='gender' value='Female'>
            <label for='female'>Female</label> <br>
            <br>
            <input type='radio' id='male' name='gender' value='Male'>
            <label for='male'>Male</label> <br>
            <br>
            <input type='radio' id='non-binary/other' name='gender' value='Non-Binary / Other'>
            <label for='non-binary/other'>Non-Binary / Other</label>
            <br>

            <p>Please select all medical conditions that apply:</p>
            <input type='checkbox' id='condition1' name='Hypertension' value='Hypertension'>
            <label for='condition1'> Hypertension</label><br>
            <input type='checkbox' id='condition2' name='Heart Disease' value='Heart Disease'>
            <label for='condition2'> Heart Disease</label><br>
            <input type='checkbox' id='condition3' name='Stroke' value='Stroke'>
            <label for='condition3'> Stroke</label><br>
            <input type='checkbox' id='condition4' name='Diabetes' value='Diabetes'>
            <label for='condition4'> Diabetes</label><br>
            <input type='checkbox' id='condition5' name='Kidney Disease' value='Kidney Disease'>
            <label for='condition5'> Kidney Disease</label><br>
        </form>
    </body>
</html>
"""

# Instantiate Renderer
renderer = ChromePdfRenderer()
renderer.RenderingOptions.CreatePdfFormsFromHtml = True
renderer.RenderHtmlAsPdf(form_html).SaveAs("BasicForm.pdf")

# Step 2. Reading and Writing PDF form values.
form_document = PdfDocument.FromFile("BasicForm.pdf")

# Set and read the value of the "firstname" field
first_name_field = form_document.Form.GetFieldByName("firstname")
first_name_field.Value = "Minnie"
print("FirstNameField value: {}".format(first_name_field.Value))

# Set and read the value of the "lastname" field
last_name_field = form_document.Form.GetFieldByName("lastname")
last_name_field.Value = "Mouse"
print("LastNameField value: {}".format(last_name_field.Value))

form_document.SaveAs("FilledForm.pdf")
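
When a form has many fields, setting them one by one gets repetitive. The sketch below drives the same GetFieldByName and Value calls from a dictionary; `fill_form` is a hypothetical helper, and it assumes every name in the dictionary exists as a field in the document.

```python
def fill_form(form_document, values):
    """Set each named form field to its value and return the values read back."""
    read_back = {}
    for name, value in values.items():
        field = form_document.Form.GetFieldByName(name)
        field.Value = value
        read_back[name] = field.Value
    return read_back


# With IronPDF:
# fill_form(form_document, {"firstname": "Minnie", "lastname": "Mouse"})
# form_document.SaveAs("FilledForm.pdf")
```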

Conclusion

IronPDF for Python is a powerful Python PDF library that enables you to create, manipulate, and edit PDF documents from Python. With this library, manipulating PDF documents has become remarkably easy. It offers a wide range of features, including the ability to edit document structure, manipulate pages, merge and split PDFs, edit document properties, and add and use PDF metadata.

IronPDF for Python is user-friendly and can be seamlessly integrated into any Python project. It serves as a valuable tool for anyone who needs to work with PDF documents in Python. The IronPDF for Python license starts from $749. You can find more information on the IronPDF website.
