iTextSharp: HTML to PDF?

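I would like to know whether iTextSharp can convert HTML to PDF. Everything I will be converting is just plain text, but unfortunately there is almost no documentation for iTextSharp, so I can't tell whether it is a viable solution for me.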

If it can't do this, can someone point me to a good, free .NET library that can take a simple plain-text HTML file and convert it to PDF?

TIA.

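I ran into the same problem a few weeks ago, and this is what I found. This method quickly dumps HTML into a PDF. The output will most likely need some formatting adjustments.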

    using System.IO;
    using iTextSharp.text;
    using iTextSharp.text.pdf;
    using iTextSharp.text.html.simpleparser;

    private MemoryStream createPDF(string html)
    {
        MemoryStream msOutput = new MemoryStream();
        TextReader reader = new StringReader(html);

        // Step 1: create the document object (A4 page, 30pt margins)
        Document document = new Document(PageSize.A4, 30, 30, 30, 30);

        // Step 2: create a writer that listens to the document
        // and directs the PDF output to the memory stream
        PdfWriter writer = PdfWriter.GetInstance(document, msOutput);
        // Keep the stream open after document.Close() so the caller can read it
        writer.CloseStream = false;

        // Step 3: create a worker to parse the HTML into the document
        HTMLWorker worker = new HTMLWorker(document);

        // Step 4: open the document and start the worker on it
        document.Open();
        worker.StartDocument();

        // Step 5: parse the HTML into the document
        worker.Parse(reader);

        // Step 6: close the worker and the document
        worker.EndDocument();
        worker.Close();
        document.Close();

        // Rewind so the caller reads the PDF from the beginning
        msOutput.Position = 0;
        return msOutput;
    }

After some digging, I found a good way to do what I needed with iTextSharp.

Here is some sample code, in case it helps someone else in the future:

    protected void Page_Load(object sender, EventArgs e)
    {
        Document document = new Document();
        try
        {
            PdfWriter.GetInstance(document, new FileStream("c:\\my.pdf", FileMode.Create));
            document.Open();

            // Download the HTML to convert
            WebClient wc = new WebClient();
            string htmlText = wc.DownloadString("http://localhost:59500/my.html");
            Response.Write(htmlText);

            // Parse the HTML into iTextSharp elements and add each one to the document
            List<IElement> htmlarraylist = HTMLWorker.ParseToList(new StringReader(htmlText), null);
            for (int k = 0; k < htmlarraylist.Count; k++)
            {
                document.Add((IElement)htmlarraylist[k]);
            }

            document.Close();
        }
        catch
        {
            // Note: this swallows every error silently; log or rethrow in real code
        }
    }

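Below is what I was able to get working with version 5.4.2 (installed from NuGet) to return a PDF response from an ASP.NET MVC controller. If needed, it can be changed to write to a FileStream instead of a MemoryStream.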

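I'm posting it here because it is a complete example of current iTextSharp usage for HTML -> PDF conversion (ignoring images, which I haven't looked into because my use case doesn't need them).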

It uses iTextSharp's XMLWorkerHelper, so the incoming HTML must be valid XHTML; depending on your input, you may need to do some fixing up first.

    using iTextSharp.text.pdf;
    using iTextSharp.tool.xml;
    using System.IO;
    using System.Web.Mvc;

    namespace Sample.Web.Controllers
    {
        public class PdfConverterController : Controller
        {
            [ValidateInput(false)]
            [HttpPost]
            public ActionResult HtmlToPdf(string html)
            {
                // Wrap the incoming fragment in a minimal XHTML document
                html = @"<?xml version=""1.0"" encoding=""UTF-8""?>
                    <!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Strict//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"">
                    <html xmlns=""http://www.w3.org/1999/xhtml"" xml:lang=""en"" lang=""en"">
                    <head>
                    <title>Minimal XHTML 1.0 Document with W3C DTD</title>
                    </head>
                    <body>
                    " + html + "</body></html>";

                var bytes = System.Text.Encoding.UTF8.GetBytes(html);

                using (var input = new MemoryStream(bytes))
                {
                    var output = new MemoryStream(); // this MemoryStream is closed by FileStreamResult

                    var document = new iTextSharp.text.Document(iTextSharp.text.PageSize.LETTER, 50, 50, 50, 50);
                    var writer = PdfWriter.GetInstance(document, output);
                    writer.CloseStream = false;
                    document.Open();

                    var xmlWorker = XMLWorkerHelper.GetInstance();
                    xmlWorker.ParseXHtml(writer, document, input, null);
                    document.Close();
                    output.Position = 0;

                    return new FileStreamResult(output, "application/pdf");
                }
            }
        }
    }

If I had the reputation, I would upvote mightymada's answer. I have just implemented an ASP.NET HTML-to-PDF solution using Pechkin, and the results are wonderful.

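There is a NuGet package for Pechkin, but as the poster above mentions in their blog (http://codeutil.wordpress.com/2013/09/16/convert-html-to-pdf/, and I hope they don't mind me reposting it), there is a memory leak that has been fixed in this branch: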

https://github.com/tuespetre/Pechkin

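The blog above gives specific instructions on how to include this package (it is a 32-bit DLL and requires .NET 4). Here is my code. The incoming HTML is actually assembled with HTML Agility Pack (I am automating invoice generation):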

    public static byte[] PechkinPdf(string html)
    {
        // Transform the HTML into PDF
        var pechkin = Factory.Create(new GlobalConfig());
        var pdf = pechkin.Convert(new ObjectConfig()
                                      .SetLoadImages(true).SetZoomFactor(1.5)
                                      .SetPrintBackground(true)
                                      .SetScreenMediaType(true)
                                      .SetCreateExternalLinks(true), html);

        // Return the PDF file
        return pdf;
    }

Once again, thank you, mightymada. Your answer is fantastic.

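I prefer another library called Pechkin, because it can convert non-trivial HTML (including HTML that uses CSS classes). This is possible because the library uses the WebKit layout engine, which is also used by browsers such as Chrome and Safari.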

I described my experience with Pechkin in detail on my blog: http://codeutil.wordpress.com/2013/09/16/convert-html-to-pdf/

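The code above certainly helps with converting HTML to PDF, but it fails if the HTML contains IMG tags with relative paths. The iTextSharp library does not automatically convert relative paths to absolute ones.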

I tried the code above and added code to take care of the IMG tags.

You can find the code here for your reference: http://www.am22tech.com/html-to-pdf/
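For readers who cannot reach that link, here is a minimal sketch of the general idea, not the linked author's code: rewrite each relative `src` in an IMG tag to an absolute URL before handing the HTML to iTextSharp. The class name `ImgPathFixer`, the method name `MakeImageSourcesAbsolute`, and the regex-based approach are assumptions made for illustration; a real HTML parser would be more robust on messy markup.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of iTextSharp): resolves relative <img src="...">
// values against a base URL so the PDF converter can locate the images.
public static class ImgPathFixer
{
    public static string MakeImageSourcesAbsolute(string html, Uri baseUri)
    {
        return Regex.Replace(
            html,
            // Group 1: everything up to and including the opening quote of src
            // Group 2: the src value itself
            // Group 3: the closing quote
            @"(<img\b[^>]*?\bsrc\s*=\s*[""'])([^""']+)([""'])",
            m =>
            {
                // new Uri(base, relative) leaves already-absolute URLs unchanged
                var resolved = new Uri(baseUri, m.Groups[2].Value);
                return m.Groups[1].Value + resolved.AbsoluteUri + m.Groups[3].Value;
            },
            RegexOptions.IgnoreCase);
    }
}
```

You would call it on the HTML string before parsing, passing the URL the page was downloaded from as `baseUri`, for example `ImgPathFixer.MakeImageSourcesAbsolute(htmlText, new Uri("http://localhost:59500/my.html"))`.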

This can convert an HTML file to PDF.

The namespaces required for the conversion are:

    using iTextSharp.text;
    using iTextSharp.text.pdf;

And to perform the conversion and download the file:

    // Create a byte array that will eventually hold our final PDF
    Byte[] bytes;

    // Boilerplate iTextSharp setup here
    // Create a stream that we can write to, in this case a MemoryStream
    using (var ms = new MemoryStream())
    {
        // Create an iTextSharp Document which is an abstraction of a PDF but **NOT** a PDF
        using (var doc = new Document())
        {
            // Create a writer that's bound to our PDF abstraction and our stream
            using (var writer = PdfWriter.GetInstance(doc, ms))
            {
                // Open the document for writing
                doc.Open();

                string finalHtml = string.Empty;
                // Read your HTML from a database or file here and store it in finalHtml

                // XMLWorker reads from a TextReader and not directly from a string
                using (var srHtml = new StringReader(finalHtml))
                {
                    // Parse the HTML
                    iTextSharp.tool.xml.XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, srHtml);
                }

                doc.Close();
            }
        }

        // After all of the PDF "stuff" above is done and closed but **before** we
        // close the MemoryStream, grab all of the active bytes from the stream
        bytes = ms.ToArray();
    }

    // Clear the response
    Response.Clear();
    MemoryStream mstream = new MemoryStream(bytes);

    // Define the response content type
    Response.ContentType = "application/pdf";

    // Set the PDF file name in the header
    Response.AddHeader("content-disposition", "attachment;filename=invoice.pdf");
    Response.Buffer = true;
    mstream.WriteTo(Response.OutputStream);
    Response.End();

If you are converting HTML to PDF on the server side, you can use Rotativa:

    Install-Package Rotativa

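It is based on wkhtmltopdf, has better CSS support than iTextSharp, and integrates very easily with MVC (which is what is most commonly used), since you can simply return a view as a PDF: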

    public ActionResult GetPdf()
    {
        //...
        return new ViewAsPdf(model); // and you are done!
    }
