using IronPdf; // Disable local disk access or cross-origin requests Installation.EnableWebSecurity = true; // Instantiate Renderer var renderer = new ChromePdfRenderer(); // Create a PDF from a HTML string using C# var pdf = renderer.RenderHtmlAsPdf("<h1>Hello World</h1>"); // Export to a file or Stream pdf.SaveAs("output.pdf"); // Advanced Example with HTML Assets // Load external html assets: Images, CSS and JavaScript. // An optional BasePath 'C:\site\assets\' is set as the file location to load assets from var myAdvancedPdf = renderer.RenderHtmlAsPdf("<img src='icons/iron.png'>", @"C:\site\assets\"); myAdvancedPdf.SaveAs("html-with-assets.pdf");

产品比较

itext7从PDF中提取文本与IronPDF（代码示例教程）

Name: IronPDF
Brand: Iron Software
Availability: InStock
Rating: 4.87 (307 reviews)

奇佩戈-卡琳达

2023年二月2日

您的企业在每年的 PDF 安全和合规订阅上花费过多。考虑 IronSecureDoc，提供一站式付款的解决方案来管理 SaaS 服务，如数字签名、遮蔽、加密和保护。探索 IronSecureDoc 文档

在本教程中，我们将学习如何使用两个不同的工具在C#中从PDF（便携式文档格式）文档中读取数据，并附有示例。

网上有很多可以从 PDF 中提取文本和图像的解析器库/阅读器。我们将利用两个迄今为止最有用、最好的相关服务图书馆从 PDF 文件中提取信息。我们还将对这两个图书馆进行比较，看看哪一个更好。

我们将比较[iText 7](https://itextpdf.com/products/itext-7/itext-7-core" target="_blank" rel="nofollow noopener noreferrer)和IronPDF。在继续介绍之前，我们先介绍一下这两个图书馆。

iText 7

iText 7 库是 iTextSharp 的最新版本。它既可用于 .NET 应用程序，也可用于 Java 应用程序。它配备了文档引擎（如Adobe Acrobat Reader）、高低级编程能力、事件监听器和PDF编辑功能。 iText 7 可以无差错地创建、编辑和增强 PDF 文档页面。其他功能包括添加密码、创建编码策略以及将权限选项保存到 PDF 文档中。它还用于添加或更改内容或画布图像，附加 PDF 元素[字典等]，创建水印和书签，更改字体大小以及签署敏感数据。

通过 iText 7，我们可以在 .NET 中为网络、移动、桌面、内核或云应用程序构建定制的 PDF 处理应用程序。

IronPDF

IronPDF 是 Iron Software 开发的一个库，可帮助 C# 和 Java 软件工程师创建、编辑和提取 PDF 内容。它通常用于从 HTML、网页或图像生成 PDF。它用于读取 PDF 文件并提取其文本。其他功能包括添加页眉/页脚、签名、附件、密码和安全问题。它具有多线程和异步功能，可全面优化性能。

IronPDF 支持跨平台，兼容 .NET 5、.NET 6 和 .NET 7、.NET Core、Standard 和 Framework。它还兼容 Windows、macOS、Linux、Docker、Azure 和 AWS。

现在，让我们来看看它们的演示。

使用 iText 7 从 PDF 文件中提取文本

我们将使用以下 PDF 文件来提取 PDF 中的文本。

IronPDF

使用 iText 7 编写以下提取文本的源代码。

//assign PDF location to a string and create new StringBuilder...
string pdfPath = @"D:/TestDocument.pdf";
 var pageText = new StringBuilder();
//read PDF using new PdfDocument and new PdfReader...
 using (PdfDocument document = new PdfDocument(new PdfReader(pdfPath)))
    {
      var pageNumbers = document.GetNumberOfPages();
       for (int page = 1; page <= pageNumbers; page++)
        {
//new LocationTextExtractionStrategy creates a new text extraction renderer
    LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy();
     PdfCanvasProcessor parser = new PdfCanvasProcessor(strategy);
     parser.ProcessPageContent(document.GetFirstPage());
     pageText.Append(strategy.GetResultantText());
         }
            Console.WriteLine(pageText.ToString());
     }

//assign PDF location to a string and create new StringBuilder...
string pdfPath = @"D:/TestDocument.pdf";
 var pageText = new StringBuilder();
//read PDF using new PdfDocument and new PdfReader...
 using (PdfDocument document = new PdfDocument(new PdfReader(pdfPath)))
    {
      var pageNumbers = document.GetNumberOfPages();
       for (int page = 1; page <= pageNumbers; page++)
        {
//new LocationTextExtractionStrategy creates a new text extraction renderer
    LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy();
     PdfCanvasProcessor parser = new PdfCanvasProcessor(strategy);
     parser.ProcessPageContent(document.GetFirstPage());
     pageText.Append(strategy.GetResultantText());
         }
            Console.WriteLine(pageText.ToString());
     }

'assign PDF location to a string and create new StringBuilder...
Dim pdfPath As String = "D:/TestDocument.pdf"
 Dim pageText = New StringBuilder()
'read PDF using new PdfDocument and new PdfReader...
 Using document As New PdfDocument(New PdfReader(pdfPath))
	  Dim pageNumbers = document.GetNumberOfPages()
	   For page As Integer = 1 To pageNumbers
'new LocationTextExtractionStrategy creates a new text extraction renderer
	Dim strategy As New LocationTextExtractionStrategy()
	 Dim parser As New PdfCanvasProcessor(strategy)
	 parser.ProcessPageContent(document.GetFirstPage())
	 pageText.Append(strategy.GetResultantText())
	   Next page
			Console.WriteLine(pageText.ToString())
 End Using

$vbLabelText $csharpLabel

提取文本输出

现在，让我们使用 IronPDF 从 PDF 中提取文本。

使用 IronPDF 从 PDF 文档中提取文本

以下源代码演示了使用 IronPDF 从 PDF 中提取文本的示例。

var pdf = PdfDocument.FromFile(@"D:/TestDocument.pdf");
string text = pdf.ExtractAllText();
Console.WriteLine(text);

var pdf = PdfDocument.FromFile(@"D:/TestDocument.pdf");
string text = pdf.ExtractAllText();
Console.WriteLine(text);

Dim pdf = PdfDocument.FromFile("D:/TestDocument.pdf")
Dim text As String = pdf.ExtractAllText()
Console.WriteLine(text)

$vbLabelText $csharpLabel

使用 IronPDF 提取文本

比较

使用 IronPDF，从 PDF 中提取文本只需两行。而在 iText 7 中，我们只需编写大约 10 行代码就能完成同样的任务。

IronPDF 开箱即提供便捷的文本提取方法；但 iText 7 要求我们编写自己的逻辑来完成同样的任务。

IronPDF 在性能和代码可读性方面都很高效。

就准确性而言，这两个库是平等的，都能提供 100% 的准确输出。

结论

iText 7 仅可用于[商业用途](https://itextpdf.com/how-buy" target="_blank" rel="nofollow noopener noreferrer)。 IronPDF 免费用于开发，并且还为商业用途提供免费试用。

要对IronPDF和iText 7进行更深入的比较，请阅读这篇关于IronPDF和iText 7的博客文章。

奇佩戈-卡琳达

立即与工程团队聊天

软件工程师

Chipego 拥有出色的倾听技巧，这帮助他理解客户问题并提供智能解决方案。他在 2023 年加入 Iron Software 团队，此前他获得了信息技术学士学位。IronPDF 和 IronOCR 是 Chipego 主要专注的两个产品，但他对所有产品的了解每天都在增长，因为他不断找到支持客户的新方法。他喜欢 Iron Software 的合作氛围，公司各地的团队成员贡献他们丰富的经验，以提供有效的创新解决方案。当 Chipego 离开办公桌时，你经常可以发现他在看书或踢足球。

< 前一页
带有 IronPDF 的产品比较

下一步 >
IronPDF和PDFium.NET比较