pdftk 图像转pdf
PDF files are one of the most common ways of sharing documents online. Whether we need to pass our clients’ documents to third-party service providers like banks or insurance companies, or just to send a CV to an employer, using a PDF document is frequently the first option.
PDF文件是在线共享文档的最常用方法之一。 无论是需要将客户的文件传递给银行或保险公司等第三方服务提供商,还是仅仅将简历发送给雇主,使用PDF文档通常都是首选。
PDF files can transfer plain/formatted text, images, hyperlinks, and even fillable forms. In this tutorial, we’re going to see how we can fill out PDF forms using PHP and a great PDF manipulation tool called
PDFtk Server
.
PDF文件可以传输纯文本/格式化的文本,图像,超链接,甚至可以填写表格。 在本教程中,我们将了解如何使用PHP和称为
PDFtk Server
的出色PDF操作工具来填写PDF表单。
To keep things simple enough, we’ll refer to
PDFtk Server
as
PDFtk
throughout the rest of the article.
为了使事情变得简单,在本文的其余部分中,我们将
PDFtk Server
称为
PDFtk
。
安装
(
Installation
)
We’ll use
Homestead Improved
for our development environment, as usual.
像往常一样,我们将
Homestead改良版
用于我们的开发环境。
Once the VM is booted up, and we’ve managed to ssh into the system with
vagrant ssh
, we can start installing
PDFtk
using
apt-get
:
虚拟机启动后,我们已经设法使用
vagrant ssh
ssh进入系统,我们可以开始使用
apt-get
安装
PDFtk
:
sudo apt-get install pdftk
To check if it works, we can run the following command:
要检查它是否有效,我们可以运行以下命令:
pdftk --version
The output should be similar to:
输出应类似于:
这个怎么运作
(
How It Works
)
PDFtk provides a wide variety of features for manipulating PDF documents, from merging and splitting pages to filling out PDF forms, or even applying watermarks. This article focuses on using PDFtk to fill out a standard PDF form using PHP.
PDFtk提供了多种功能来处理PDF文档,从合并和拆分页面到填写PDF表单,甚至应用水印。 本文重点介绍使用PDFtk来使用PHP填写标准PDF表单。
PDFtk uses
FDF
files for manipulating PDF forms, but what is an FDF file?
PDFtk使用
FDF
文件处理PDF表单,但是什么是FDF文件?
FDF or
Form Data File
is a plain-text file, which can store form data in a much simpler structure than PDF files.
FDF或
表单数据文件
是纯文本文件,可以用比PDF文件简单得多的结构存储表单数据。
Simply put, we need to generate an FDF file from
user submitted data
, and merge it with the original PDF file using PDFtk’s commands.
简而言之,我们需要根据
用户提交的数据
生成FDF文件,并使用PDFtk的命令将其与原始PDF文件合并。
FDF文件中的内容
(
What Is inside an FDF File
)
The structure of an FDF file is composed of three parts: the header, the content and the footer:
FDF文件的结构由三部分组成:标头,内容和页脚:
%FDF-1.2
1 0 obj<</FDF<< /Fields[
We don’t need to worry about this part, since it’s what we’re going to use for all the FDF files.
我们不必担心这部分,因为这是我们将用于所有FDF文件的内容。
内容
(
Content
)
Some sample FDF content is as follows:
一些示例FDF内容如下:
<< /T (first_name) /V (John)
<< /T (last_name) /V (Smith)
<< /T (occupation) /V (Teacher)>>
<< /T (age) /V (45)>>
<< /T (gender) /V (male)>>
The content section may seem confusing at first, but don’t worry, we’ll get to that shortly.
内容部分乍一看似乎令人困惑,但请放心,我们将在稍后进行介绍。
endobj
trailer
<</Root 1 0 R>>
%%EOF
This section is also the same for all of our FDF files.
对于我们所有的FDF文件,此部分也相同。
The content section contains the form data entries, each following a standard pattern. Each line represents one field in the form. They begin with the form element’s name prefixed with
/T
, which indicates the title. The second part is the element’s value prefixed with
/V
indicating the value:
内容部分包含表单数据条目,每个条目均遵循标准模式。 每行代表表格中的一个字段。 它们以表单元素的名称开头
/T
开头,该标题表示标题。 第二部分是以
/V
开头的元素值,指示该值:
<< /T(FIELD_NAME)/V(FIELD_VALUE) >>
To create an FDF file, we will need to know the field names in the PDF form. If we have access to a Mac or Windows machine, we can open the form in Adobe Acrobat Pro and see the fields’ properties.
要创建FDF文件,我们需要知道PDF表单中的字段名称。 如果可以访问Mac或Windows计算机,则可以在Adobe Acrobat Pro中打开表单,然后查看字段的属性。
Alternatively, we can use PDFtk’s
dump_data_fields
command to extract the fields information from the file:
另外,我们可以使用PDFtk的
dump_data_fields
命令从文件中提取字段信息:
pdftk path/to/the/form.pdf dump_data_fields > field_names.txt
As a result, PDFtk will save the result in the
field_names.txt
file. Below is an example of the extracted data:
结果,PDFtk会将结果保存在
field_names.txt
文件中。 以下是提取的数据的示例:
FieldType: Text
FieldName: first_name
FieldFlags: 0
FieldJustification: Left
FieldType: Text
FieldName: last_name
FieldFlags: 0
FieldJustification: Left
FieldType: Text
FieldName: occupation
FieldFlags: 0
FieldJustification: Center
FieldType: Button
FieldName: gender
FieldFlags: 0
FieldJustification: Center
There are several properties for each field in the form. We can modify these properties in Adobe Acrobat Pro. For example, we can change text alignments, font sizes or even the text color.
表单中每个字段都有几个属性。 我们可以在Adobe Acrobat Pro中修改这些属性。 例如,我们可以更改文本对齐方式,字体大小甚至文本颜色。
PDFtk和PHP
(
PDFtk and PHP
)
We can use PHP’s
exec()
function to bring PDFtk to the PHP environment. Suppose we have a simple PDF form with four text boxes and a group of two radio buttons:
我们可以使用PHP的
exec()
函数将PDFtk引入PHP环境。 假设我们有一个简单的PDF表单,其中包含四个文本框和一组两个单选按钮:
Let’s write a simple script to fill out this form:
让我们编写一个简单的脚本来填写此表单:
// Form data:
$fname = 'John';
$lname = 'Smith';
$occupation = 'Teacher';
$age = '45';
$gender = 'male';
// FDF header section
$fdf_header = <<<FDF
%FDF-1.2
%,,oe"
1 0 obj
/FDF << /Fields [
// FDF footer section
$fdf_footer = <<<FDF
endobj
trailer
<</Root 1 0 R>>
%%EOF;
// FDF content section
$fdf_content = "<</T(first_name)/V({$fname})>>";
$fdf_content .= "<</T(last_name)/V({$lname})>>";
$fdf_content .= "<</T(occupation)/V({$occupation})>>";
$fdf_content .= "<</T(age)/V({$age})>>";
$fdf_content .= "<</T(gender)/V({$gender})>>";
$content = $fdf_header . $fdf_content , $fdf_footer;
// Creating a temporary file for our FDF file.
$FDFfile = tempnam(sys_get_temp_dir(), gethostname());
file_put_contents($FDFfile, $content);
// Merging the FDF file with the raw PDF form
exec("pdftk form.pdf fill_form $FDFfile output.pdf");
// Removing the FDF file as we don't need it anymore
unlink($FDFfile);
Okay, let’s break the script down. First, we define the values that we’re going to write to the form. We can fetch these values from a database table, a JSON API response, or even hardcode them inside the script.
好的,让我们分解一下脚本。 首先,我们定义要写入表单的值。 我们可以从数据库表,JSON API响应中获取这些值,甚至可以在脚本中对其进行硬编码。
Next, we create an FDF file based on the pattern we discussed earlier. We used the PHP’s
tempnam
function to create a temporary file for storing the FDF content. The reason is that PDFtk only relies on physical files to perform the operations, especially when filling out forms.
接下来,我们根据前面讨论的模式创建一个FDF文件。 我们使用PHP的
tempnam
函数创建一个临时文件来存储FDF内容。 原因是PDFtk仅依靠物理文件来执行操作,尤其是在填写表单时。
Finally, we called PDFtk’s
fill_form
command using PHP’s
exec
function.
fill_form
merges
the FDF file with the raw PDF form. According to the script, our PDF file should be in the same directory as our PHP script.
最后,我们使用PHP的
exec
函数调用PDFtk的
fill_form
命令。
fill_form
将
FDF文件与原始PDF表单
合并
。 根据脚本,我们的PDF文件应该与我们PHP脚本位于同一目录中。
Save the PHP file above in the web root directory as
pdftk.php
. The output will be a new PDF file with all the fields filled out with our data.
将上面PHP文件保存在Web根目录中为
pdftk.php
。 输出将是一个新的PDF文件,所有字段均填充有我们的数据。
It’s as simple as that!
就这么简单!
展平输出文件
(
Flattening the Output File
)
We can also flatten the output file to prevent future modifications. This is possible by passing
flatten
as a parameter to the
fill_form
command.
我们还可以展平输出文件,以防止将来进行修改。 通过将
flatten
作为参数传递给
fill_form
命令,这是可能的。
exec("pdftk path/to/form.pdf fill_form $FDFfile output path/to/output.pdf flatten");
下载输出文件
(
Downloading the Output File
)
Instead of storing the file on the disk, we can force download the output file by sending the file’s content along with the required headers to the output buffer:
除了将文件存储在磁盘上之外,我们还可以通过将文件的内容以及所需的标头发送到输出缓冲区来强制下载输出文件:
// ...
exec("pdftk path/to/form.pdf fill_form $FDFfile output output.pdf flatten");
// Force Download the output file
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename=' . 'path/to/output.pdf' );
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
header('Content-Length: ' . filesize('output.pdf'));
readfile('output.pdf');
exit;
If we run the script in the browser, the output file will be downloaded to our machine.
如果我们在浏览器中运行脚本,则输出文件将下载到我们的计算机上。
Now that we have a basic understanding of how PDFtk works, we can start building a PHP class around it, to make our service more reusable.
现在我们对PDFtk的工作原理有了基本的了解,我们可以开始围绕它构建PHP类,以使我们的服务更可重用。
在PDFtk周围创建包装器类
(
Creating a Wrapper Class around PDFtk
)
The usage of our final product should be as simple as the following code:
我们最终产品的用法应与以下代码一样简单:
// Data to be written to the PDF form
$data = [
'first_name' => 'John',
'last_name' => 'Smith',
'occupation' => 'Teacher',
'age' => '45',
'gender' => 'male'
$pdf = new pdfForm('form.pdf', $data);
$pdf->flatten()
->save('outputs/form-filled.pdf')
->download();
We’ll create a new file in the web root directory and name it
PdfForm.php
. Let’s name the class
PdfForm
as well.
我们将在Web根目录中创建一个新文件,并将其命名为
PdfForm.php
。 我们也将类
PdfForm
为
PdfForm
。
从类属性开始
(
Starting with Class Properties
)
First of all, we need to declare some private properties for the class:
首先,我们需要为该类声明一些私有属性:
class PdfForm
* Path to raw PDF form
* @var string
private $pdfurl;
* Form data
* @var array
private $data;
* Path to filled PDF form
* @var string
private $output;
* Flag for flattening the file
* @var string
private $flatten;
// ...
构造函数
(
The Constructor
)
Let’s write the constructor:
让我们编写构造函数:
// ...
public function __construct($pdfurl, $data)
$this->pdfurl = $pdfurl;
$this->data = $data;
The constructor doesn’t do anything complicated. It assigns the PDF path and the form data to their respective properties.
构造函数没有做任何复杂的事情。 它将PDF路径和表单数据分配给它们各自的属性。
处理临时文件
(
Handling Temporary Files
)
Since PDFtk uses physical files to perform its tasks, we usually need to generate temporary files during the process. To keep the code clean and reusable, let’s write a method to create temporary files:
由于PDFtk使用物理文件执行其任务,因此我们通常需要在此过程中生成临时文件。 为了保持代码的清洁和可重用,让我们编写一个创建临时文件的方法:
// ...
private function tmpfile()
return tempnam(sys_get_temp_dir(), gethostname());
This method creates a file using PHP’s
tempnum
function. We passed two parameters to the function, the first being the path to the
tmp
directory fetched via the
sys_get_temp_dir
function, and the second parameter being a prefix for the filename, just to make sure the filename would be as unique as possible across different hosts. This will prefix the file name with our hostname. Finally, the method returns the file path to the caller.
此方法使用PHP的
tempnum
函数创建文件。 我们向函数传递了两个参数,第一个参数是通过
sys_get_temp_dir
函数获取的
tmp
目录的路径,第二个参数是文件名的前缀,只是为了确保文件名在不同主机之间尽可能地唯一。 这将在文件名前面加上我们的主机名。 最后,该方法将文件路径返回给调用方。
As discussed earlier, to create an FDF file, we need to know the name of the form elements in advance. This is possible either by opening the form in
Adobe Acrobat Pro
or by using PDFtk’s
dump_data_fields
command.
如前所述,要创建FDF文件,我们需要事先知道表单元素的名称。 这可以通过在
Adobe Acrobat Pro中
打开表格或使用PDFtk的
dump_data_fields
命令来实现。
To make things easier for the developer, let’s write a method which prints out the fields’ information to the screen. Although we won’t use this method in the PDF generation process, it can be useful when we’re not aware of the field names. Another use case would be to parse the fields’ meta data, to make the writing process more dynamic.
为了使开发人员更轻松,让我们编写一个将字段信息打印到屏幕上的方法。 尽管我们不会在PDF生成过程中使用此方法,但是当我们不知道字段名称时,它可能会很有用。 另一个用例是解析字段的元数据,以使写入过程更加动态。
// ...
public function fields($pretty = false)
$tmp = $this->tmpfile();
exec("pdftk {$this->pdfurl} dump_data_fields > {$tmp}");
$con = file_get_contents($tmp);
unlink($tmp);
return $pretty == true ? nl2br($con) : $con;
The above method runs PDFtk’s
dump_data_fields
command, writes the output to a file and returns its content.
上面的方法运行PDFtk的
dump_data_fields
命令,将输出写入文件并返回其内容。
We also set an optional argument for beautifying the output. As a result we’ll be able to get a human friendly output by passing
true
to the method. If we need to parse the output or run a regular expression against it, we should call it without arguments.
我们还设置了一个可选参数来美化输出。 结果,通过将
true
传递给该方法,我们将能够获得人性化的输出。 如果需要解析输出或对输出运行正则表达式,则应不带参数调用它。
创建FDF文件
(
Creating the FDF File
)
In the next step, we will write a method for generating the FDF file:
在下一步中,我们将编写一种生成FDF文件的方法:
// ...
public function makeFdf($data)
$fdf = '%FDF-1.2
1 0 obj<</FDF<< /Fields[';
foreach ($data as $key => $value) {
$fdf .= '<</T(' . $key . ')/V(' . $value . ')>>';
$fdf .= "] >> >>
endobj
trailer
<</Root 1 0 R>>
%%EOF";
$fdf_file = $this->tmpfile();
file_put_contents($fdf_file, $fdf);
return $fdf_file;
The
makeFdf()
method iterates over the
$data
array items to generate the entries based on the FDF standard pattern. Finally, it puts the content in a temporary file using the
file_put_contents
function, and returns the file path to the caller.
makeFdf()
方法遍历
$data
数组项以基于FDF标准模式生成条目。 最后,它使用
file_put_contents
函数将内容放入临时文件中,并将文件路径返回给调用方。
展平文件
(
Flattening the File
)
Let’s write a method to set the
$flatten
attribute to
flatten
. This value is used by the
generate()
method:
让我们写来设置的方法
$flatten
属性
flatten
。 此值由
generate()
方法使用:
// ...
public function flatten()
$this->flatten = ' flatten';
return $this;
Now that we’re able to create an FDF file, we can fill the form using the
fill_form
command:
现在我们已经可以创建FDF文件了,我们可以使用
fill_form
命令填写表单:
// ...
private function generate()
$fdf = $this->makeFdf($this->data);
$this->output = $this->tmpfile();
exec("pdftk {$this->pdfurl} fill_form {$fdf} output {$this->output}{$this->flatten}");
unlink($fdf);
generate()
calls the makeFdf()
method to generate the FDF file, then it runs the fill_form
command to merge it with the raw PDF form. Finally, it will save the output to a temporary file which is created with the tempfile()
method.
generate()
调用makeFdf()
方法生成FDF文件,然后运行fill_form
命令将其与原始PDF表单合并。 最后,它将输出保存到使用tempfile()
方法创建的临时文件中。
保存文件 (Saving the File)
When the file is generated, we might want to save or download it, or do both at the same time.
生成文件后,我们可能要保存或下载它,或者同时执行这两个操作。
First, let’s create the save method:
首先,让我们创建保存方法:
// ...
public function save($path = null)
if (is_null($path)) {
return $this;
if (!$this->output) {
$this->generate();
$dest = pathinfo($path, PATHINFO_DIRNAME);
if (!file_exists($dest)) {
mkdir($dest, 0775, true);
copy($this->output, $path);
unlink($this->output);
$this->output = $path;
return $this;
The method first checks if there’s any path given for the destination. If the destination path is null, it just returns without saving the file, otherwise it will proceed to the next part.
该方法首先检查是否为目标指定了任何路径。 如果目标路径为null,则仅返回而不保存文件,否则将继续进行下一部分。
Next, it checks if the file has been already generated; if not, it will call the generate()
method to generate it.
接下来,它检查文件是否已经生成; 如果没有,它将调用generate()
方法生成它。
After making sure the output file is generated, it checks if the destination path exists on the disk. If the path doesn’t exist, it will create the directories and set the proper permissions.
确保生成输出文件后,它将检查磁盘上是否存在目标路径。 如果路径不存在,它将创建目录并设置适当的权限。
In the end, it copies the file (from the tmp
directory) to a permanent location, and updates the value of $this->output
to the permanent path.
最后,它将文件(从tmp
目录)复制到永久位置,并将$this->output
的值更新为永久路径。
强制下载文件 (Force Download the File)
To force download the file, we need to send the file’s content along with the required headers to the output buffer.
要强制下载文件,我们需要将文件的内容以及所需的标头发送到输出缓冲区。
// ...
public function download()
if (!$this->output) {
$this->generate();
$filepath = $this->output;
if (file_exists($filepath)) {
header('Content-Description: File Transfer');
header('Content-Type: application/pdf');
header('Content-Disposition: attachment; filename=' . uniqid(gethostname()) . '.pdf');
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
header('Content-Length: ' . filesize($filepath));
readfile($filepath);
exit;
In this method, first we need to check if the file has been generated, because we might need to download the file without saving it. After making sure that everything is set, we can send the file’s content to the output buffer using PHP’s readfile()
function.
在这种方法中,首先我们需要检查文件是否已生成,因为我们可能需要下载文件而不保存文件。 确保一切都设置好之后,我们可以使用PHP的readfile()
函数将文件的内容发送到输出缓冲区。
Our PdfForm class is ready to use now. The full code is on GitHub.
我们的PdfForm类现在可以使用了。 完整代码在GitHub上 。
将课堂付诸实践 (Putting the Class into Action)
require 'PdfForm.php';
$data = [
'first_name' => 'John',
'last_name' => 'Smith',
'occupation' => 'Teacher',
'age' => '45',
'gender' => 'male'
$pdf = new PdfForm('form.pdf', $data);
$pdf->flatten()
->save('output.pdf')
->download();
创建一个FDF文件 (Creating an FDF File)
If we just need to create an FDF file without filling out a form, we can use the makeFdf()
method.
如果只需要创建FDF文件而不填写表单,则可以使用makeFdf()
方法。
require 'PdfForm.php';
$data = [
'first_name' => 'John',
'last_name' => 'Smith',
'occupation' => 'Teacher',
'age' => '45',
'gender' => 'male'
$pdf = new PdfForm('form.pdf', $data);
$fdf = $pdf->makeFdf();
The return value of makeFdf()
is the path to the generated FDF file in the tmp
directory. We can either get the contents of the file or save it to a permanent location.
makeFdf()
的返回值是tmp
目录中生成的FDF文件的路径。 我们可以获取文件的内容或将其保存到永久位置。
If we just need to see which fields and field types exist in the form, we can call the fields()
method:
如果仅需要查看表单中存在哪些字段和字段类型,则可以调用fields()
方法:
require 'PdfForm.php';
$fields = new PdfForm('form.pdf')->fields();
echo $fields;
If there’s no need to parse the output, we can pass
true
to the
fields()
method, to get a human readable output:
如果不需要解析输出,则可以将
true
传递给
fields()
方法,以获取易于理解的输出:
require 'PdfForm.php';
$pdf = new PdfForm('pdf-test.pdf')->fields(true);
echo $pdf;
结语
(
Wrapping Up
)
We installed
PDFtk
and learned some of its useful commands like
dump_data_fields
and
fill_form
. Then, we created a basic class around it, to show how we can bring PDFtk’s power to our PHP applications.
我们安装了
PDFtk
并了解了一些有用的命令,例如
dump_data_fields
和
fill_form
。 然后,我们围绕它创建了一个基本类,以说明如何将PDFtk的功能引入我们PHP应用程序。
Please note that this implementation is basic and we tried to keep things as bare bones as possible. We can go further and put the FDF creation feature in a separate class, which would give us more room when working with FDF files. For example, we could apply chained filters to each form data entry like uppercase, lowercase or even format a date, just to name a few. We could also implement
download()
and
save()
methods for the FDF class.
请注意,此实现是基本的,我们尝试使事物尽可能裸露。 我们可以进一步将FDF创建功能放在单独的类中,这将在处理FDF文件时为我们提供更多空间。 例如,我们可以将链式过滤器应用于每个表单数据条目,例如大写,小写,甚至格式化日期,仅举几例。 我们还可以为FDF类实现
download()
和
save()
方法。
Questions? Comments? Leave them below and we’ll do our best to reply in a timely manner!
有什么问题吗 注释? 将它们放在下面,我们将尽力及时答复!
翻译自:
https://www.sitepoint.com/filling-pdf-forms-pdftk-php/
pdftk 图像转pdf
pdftk 图像转pdfPDF files are one of the most common ways of sharing documents online. Whether we need to pass our clients’ documents to third-party service providers like banks or insurance companies, or...
PHP
Excel[1]已经废弃,现在用
Php
Spreadsheet[2],官方文档。
Php
Spreadsheet是一个用纯
PHP
写的类库,可以读写操作不同的电子表格文件格式,像Excel和LibreOffice。
支持的文档格式:
例:将 test1.
pdf
,test2.
pdf
,test3.
pdf
合并为 一个文件 out.
pdf
pdf
tk
test1.
pdf
test2.
pdf
test3.
pdf
cat output out.
pdf
pdf
tk
A=test1.
pdf
B=test2.
pdf
C=test3.
pdf
cat A B C output out.
pdf
加密
pdf
加访问密码...
当涉及到
Java
Script 高级程序设计的
PDF
设计时,有几个关键的方面需要考虑:
1.
PDF
文档的结构和布局:在设计
PDF
文档时,需要考虑文档的结构和布局。这意味着需要确定页面的大小、方向和边距,以及文本、
图像
和其他元素的位置和大小。
2.
Java
Script 的使用:
Java
Script 可以用于许多不同的用途,包括验证
表单
数据、处理用户输入、生成动态内容等。对于
PDF
设计来说,最常见的用途是通过
Java
Script 创建交互式
表单
,例如添加字段、控制
表单
行为、验证数据等。
3.
PDF
的生成和输出:生成
PDF
文档需要使用适当的软件和工具。在选择工具时,需要考虑功能、易用性和成本等因素。一些流行的
PDF
生成工具包括 Adobe Acrobat、
PDF
tk
和 iText。
总体来说,设计一个高级的
Java
Script
PDF
应用需要深入了解
PDF
的结构和
Java
Script 的用法,并使用合适的工具来生成和输出
PDF
文档。