I. PDF Overview
PDF (Portable Document Format) is a structured document format. It was first published (version 1.0) in 1993 by Adobe, a United States developer of typesetting and image processing software. In the same year, Adobe launched the corresponding supporting software product series, Adobe Acrobat 1.0. Adobe subsequently revised and upgraded the format: it released version 1.1 in 1994 and launched the supporting products Adobe Acrobat 2.0 and 2.1. PDF version 1.2 followed on November 27, 1996, and Adobe Acrobat was upgraded to version 3.0 to match. By the end of 1997, the International Organization for Standardization had begun to consider accepting PDF as an international standard.
1. Comparison between PDF and PS
The PS language (PostScript language) is a de facto printing industry standard, also owned by Adobe. It can describe finely designed layouts and occupies the dominant position in the current printing field. PDF evolved from PS, and the two have almost the same page description capabilities and similar methods of description. PDF uses the same imaging model as PS to represent text and graphics. As in the PS language, PDF page description instructions draw the page by painting selected areas. The painted areas can be character outlines, areas bounded by lines and curves, or bitmaps. The paint can be any color, and any graphic on the page can be clipped to another shape. The page is initially empty, successive instructions draw different graphics onto it, and each new graphic is opaque, so it overwrites whatever it covers.
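The opaque painting behavior described above can be sketched on a small character grid. This is only an illustration of the idea: the grid size, the `fill_rect` helper, and the rectangular clip are assumptions made for this sketch and are not PDF or PostScript operators.

```python
# Illustrative painter's-model canvas; all names here are hypothetical.
WIDTH, HEIGHT = 8, 4

def blank_page():
    # The page starts empty: every cell is unpainted ('.').
    return [["." for _ in range(WIDTH)] for _ in range(HEIGHT)]

def fill_rect(page, x0, y0, x1, y1, color, clip=None):
    # Paint an opaque rectangle over columns x0..x1-1 and rows y0..y1-1.
    # If a clip rectangle (cx0, cy0, cx1, cy1) is given, cells outside it are left alone.
    for y in range(y0, y1):
        for x in range(x0, x1):
            if clip is not None:
                cx0, cy0, cx1, cy1 = clip
                if not (cx0 <= x < cx1 and cy0 <= y < cy1):
                    continue
            page[y][x] = color  # new paint replaces whatever was there

def render(page):
    return ["".join(row) for row in page]

page = blank_page()
fill_rect(page, 0, 0, 5, 3, "R")                      # first shape
fill_rect(page, 3, 1, 8, 4, "B")                      # later shape hides the overlap
fill_rect(page, 0, 0, 8, 4, "G", clip=(0, 3, 2, 4))   # full-page paint, clipped
print("\n".join(render(page)))
```

The second rectangle replaces the first wherever they overlap, and the third call paints only inside its clip region even though it nominally covers the whole page.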
However, PDF is still quite different from PS. This is mainly reflected in the following aspects: (1) PDF files can contain interactive objects such as hyperlinks and interactive forms, but PS files cannot. (2) PDF is a file structure, whereas PS is a programming language; PDF can therefore be processed more efficiently than PS. (3) PDF's strict structural definition allows an application to access any object in the file at random, while PS can only be accessed sequentially as a whole. For example, to reach page 100 of a PS file, you must first interpret the preceding 99 pages, whereas every page of a PDF can be reached equally quickly. (4) PDF also contains font description information such as font metrics, so that if a font is missing, it can be emulated (rather than simply substituted) to keep the document's appearance consistent.
2. Features of PDF
The characteristics of PDF can be summarized as follows: (1) Transmissibility. A PDF file can be written in either 7-bit ASCII or binary form, so it can be transmitted correctly in various network environments. (2) Interactivity. PDF contains interactive objects such as interactive forms and hyperlinks. (3) Multimedia. PDF supports sound and animation. (4) Random access. PDF supports random access to page content, which speeds up operations on pages. (5) Incremental update. Modifications can be appended to a file, which makes minor changes easy and efficient. (6) Compression. PDF supports a variety of compression encodings, so the file structure is more compact. (7) Font independence. A PDF file can carry its own font description information, so the document is still displayed correctly if the user's system lacks the required font. (8) Platform independence. PDF files are independent of software and hardware platforms, which makes them well suited to exchanging information over a network without garbled characters. (9) Security control. PDF files support several levels of security control, which is very important for protecting the copyright of electronic publications. Different levels of security can be set according to the requirements of each publication.
II. PDF Structure Principles
1. PDF file structure
The PDF file structure (i.e., the physical structure) consists of four parts: the file header, the file body, the cross-reference table, and the file trailer at the end of the file. See Figure 1.
The header states the version number of the PDF specification to which the file conforms. It appears on the first line of the PDF file.
The file body consists of a series of PDF indirect objects.
The cross-reference table is an address index of the indirect objects, which makes random access to them possible.
The trailer gives the address of the cross-reference table, identifies the file's root object (Catalog), and also stores encryption and other security information.
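As a minimal sketch of how these four parts fit together, the following Python builds a tiny PDF in memory and then reads it back from the end, as a reader application would. The helper names (`build_minimal_pdf`, `parse_pdf`, `read_object`) are hypothetical, and the parser assumes a single classic cross-reference section with "\n" line endings; real files can use other line endings, several cross-reference sections, or cross-reference streams.

```python
import re

def build_minimal_pdf():
    # Assemble a tiny illustrative PDF, recording each object's byte offset.
    header = b"%PDF-1.2\n"
    objects = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n",
    ]
    body, offsets = b"", []
    for obj in objects:
        offsets.append(len(header) + len(body))
        body += obj
    xref_pos = len(header) + len(body)
    # Each entry is 20 bytes: 10-digit offset, 5-digit generation, flag, end of line.
    xref = b"xref\n0 3\n" + b"0000000000 65535 f \n"
    for off in offsets:
        xref += b"%010d 00000 n \n" % off
    trailer = b"trailer\n<< /Size 3 /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % xref_pos
    return header + body + xref + trailer

def parse_pdf(data):
    # 1. Header: the first line carries the specification version.
    version = re.match(rb"%PDF-(\d\.\d)", data).group(1).decode()
    # 2. End of file: 'startxref' is followed by the offset of the cross-reference table.
    xref_pos = int(re.search(rb"startxref\s+(\d+)\s+%%EOF", data).group(1))
    # 3. Cross-reference table: a subsection header, then one entry per object.
    lines = data[xref_pos:].split(b"\n")
    first, count = map(int, lines[1].split())
    table = {}
    for i in range(count):
        offset, _gen, flag = lines[2 + i].split()
        if flag == b"n":  # 'n' marks an object in use, 'f' a free entry
            table[first + i] = int(offset)
    # 4. Trailer dictionary: /Root points at the Catalog object.
    root = int(re.search(rb"/Root\s+(\d+)\s+\d+\s+R", data[xref_pos:]).group(1))
    return version, xref_pos, table, root

def read_object(data, table, number):
    # Random access: jump straight to the recorded offset, no scan from the start.
    start = table[number]
    end = data.index(b"endobj", start) + len(b"endobj")
    return data[start:end]

pdf = build_minimal_pdf()
version, xref_pos, table, root = parse_pdf(pdf)
print(version, table, root)
print(read_object(pdf, table, root).decode())
```

Because the cross-reference table stores a byte offset for every object, `read_object` costs the same for any object number, which is the random access property that PS lacks.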
2. PDF document structure
The document structure of a PDF is the logical organization of the file's contents. It reflects the hierarchical relationship between the indirect objects in the file body. The PDF document structure is a tree, as shown in Figure 2. The root node of the tree is also the root object of the PDF file. Below the root node there are four sub-trees: the page tree (Pages Tree), the bookmark tree (Outline Tree), the article thread tree (ArticleThreads), and the name tree (NamedDestination).
In the page tree, all page objects are leaf nodes, and each page inherits attribute values from its parent nodes as the defaults for its own corresponding attributes. The bookmark tree organizes the bookmarks hierarchically. Each bookmark links a bookmark title to a specific location on a page, so the user can navigate the document by bookmark title. The article thread tree organizes the article threads, and the beads under each thread, in a tree structure. The name tree establishes a correspondence between strings (names) and page regions. Each leaf node in the tree stores strings and their corresponding page regions, while the non-leaf nodes serve only as an index that lets an application reach the leaf nodes quickly. The purpose of the name tree is to let other objects in the PDF file refer to a page region by a string name.
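Two of these mechanisms, attribute inheritance in the page tree and indexed lookup in the name tree, can be sketched with plain Python dictionaries standing in for PDF dictionary objects. The functions `inherited` and `lookup_name` and all sample data are hypothetical. As a simplifying assumption, leaf entries are modeled as (name, value) pairs and each child node carries a two-element range of the names it covers.

```python
def inherited(node, key):
    # Walk from a page up through its Parent chain until the attribute is found.
    while node is not None:
        if key in node:
            return node[key]
        node = node.get("Parent")
    return None

def lookup_name(node, name):
    # Leaf nodes hold the actual (name, value) pairs.
    if "Names" in node:
        return dict(node["Names"]).get(name)
    # Non-leaf nodes are only an index: descend into the child whose range covers the name.
    for kid in node["Kids"]:
        low, high = kid["Limits"]
        if low <= name <= high:
            return lookup_name(kid, name)
    return None

# Page tree: pages are leaves; missing attributes default to an ancestor's value.
root_pages = {"Type": "Pages", "MediaBox": [0, 0, 612, 792], "Rotate": 0}
chapter = {"Type": "Pages", "Parent": root_pages, "Rotate": 90}
page_a = {"Type": "Page", "Parent": chapter}
page_b = {"Type": "Page", "Parent": root_pages, "MediaBox": [0, 0, 595, 842]}

# Name tree: strings mapped to page regions (here reduced to a page number).
leaf_1 = {"Limits": ["Appendix", "Chapter1"],
          "Names": [("Appendix", {"page": 40}), ("Chapter1", {"page": 3})]}
leaf_2 = {"Limits": ["Chapter2", "Index"],
          "Names": [("Chapter2", {"page": 12}), ("Index", {"page": 55})]}
name_root = {"Kids": [leaf_1, leaf_2]}

print(inherited(page_a, "MediaBox"), inherited(page_a, "Rotate"))
print(lookup_name(name_root, "Chapter2"))
```

Here `page_a` defines neither attribute itself, so it takes the rotation from its immediate parent and the page size from the root of the page tree, while `page_b` overrides the page size with its own value.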
(to be continued)