Class COSDocument

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable, COSObjectable

    public class COSDocument
    extends COSBase
    implements java.io.Closeable
    This is the in-memory representation of the PDF document. You must call close() on this object when you are done using it.
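    The close() requirement fits Java's try-with-resources statement, because the class implements java.io.Closeable. The following is a minimal sketch using only the JDK; ScratchBackedDocument is a hypothetical stand-in, not a PDFBox class, and only models the closed flag.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

final class CloseSketch {
    /** Hypothetical stand-in for a document that owns temporary storage; not a PDFBox class. */
    static final class ScratchBackedDocument implements Closeable {
        private boolean closed;

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() throws IOException {
            // A real document would release its stream cache and temporary files here.
            closed = true;
        }
    }

    /** Uses a stand-in document inside try-with-resources and reports whether it ended up closed. */
    static boolean closedAfterUse() {
        ScratchBackedDocument doc = new ScratchBackedDocument();
        try (ScratchBackedDocument open = doc) {
            // ... work with the document while it is open ...
            if (open.isClosed()) {
                throw new IllegalStateException("document closed too early");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // close() has been called automatically at the end of the try block.
        return doc.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("closed after use: " + closedAfterUse());
    }
}
```

    With PDFBox on the classpath, the same shape would presumably be try (COSDocument doc = new COSDocument()) { ... }, based on the public no-argument constructor and the Closeable interface listed on this page.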
    • Field Detail

      • LOG

        private static final org.apache.commons.logging.Log LOG
        Log instance.
      • version

        private float version
      • objectPool

        private final java.util.Map<COSObjectKey,COSObject> objectPool
        Maps ObjectKeys to a COSObject. Note that references to these objects are also stored in COSDictionary objects that map a name to a specific object.
      • xrefTable

        private final java.util.Map<COSObjectKey,java.lang.Long> xrefTable
        Maps object and generation id to object byte offsets.
      • streams

        private final java.util.List<COSStream> streams
        List containing all streams that are created when creating a new PDF.
      • trailer

        private COSDictionary trailer
        Document trailer dictionary.
      • isDecrypted

        private boolean isDecrypted
        Signal that document is already decrypted.
      • startXref

        private long startXref
      • closed

        private boolean closed
      • isXRefStream

        private boolean isXRefStream
      • hasHybridXRef

        private boolean hasHybridXRef
      • highestXRefObjectNumber

        private long highestXRefObjectNumber
        Used for incremental saving, to prevent XRef object numbers from being reused.
    • Constructor Detail

      • COSDocument

        public COSDocument()
        Constructor. Uses main memory to buffer PDF streams.
      • COSDocument

        public COSDocument(ICOSParser parser)
        Constructor. Uses main memory to buffer PDF streams.
        Parameters:
        parser - Parser to be used to parse the document on demand
      • COSDocument

        public COSDocument(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction)
        Constructor that will use the provided function to create a stream cache for the storage of the PDF streams.
        Parameters:
        streamCacheCreateFunction - a function to create an instance of a stream cache
      • COSDocument

        public COSDocument(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction,
                           ICOSParser parser)
        Constructor that will use the provided function to create a stream cache for the storage of the PDF streams.
        Parameters:
        streamCacheCreateFunction - a function to create an instance of a stream cache
        parser - Parser to be used to parse the document on demand
    • Method Detail

      • createCOSStream

        public COSStream createCOSStream()
        Creates a new COSStream using the current configuration for scratch files.
        Returns:
        the new COSStream
      • createCOSStream

        public COSStream createCOSStream(COSDictionary dictionary,
                                         long startPosition,
                                         long streamLength)
                                  throws java.io.IOException
        Creates a new COSStream using the current configuration for scratch files. Not for public use. Only COSParser should call this method.
        Parameters:
        dictionary - the corresponding dictionary
        startPosition - the start position within the source
        streamLength - the stream length
        Returns:
        the new COSStream
        Throws:
        java.io.IOException - if the random access view can't be read
      • getLinearizedDictionary

        public COSDictionary getLinearizedDictionary()
        Get the dictionary containing the linearization information if the PDF is linearized.
        Returns:
        the dictionary containing the linearization information
      • getObjectsByType

        public java.util.List<COSObject> getObjectsByType(COSName type)
        This will get all dictionary objects by type.
        Parameters:
        type - The type of the object.
        Returns:
        This will return all objects with the specified type.
      • getObjectsByType

        public java.util.List<COSObject> getObjectsByType(COSName type1,
                                                          COSName type2)
        This will get all dictionary objects by type.
        Parameters:
        type1 - The first possible type of the object, mandatory.
        type2 - The second possible type of the object, usually an abbreviation, optional.
        Returns:
        This will return all objects with the specified type(s).
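        The two-type lookup can be pictured as a filter over the Type entry of each dictionary. The sketch below uses only the JDK and models dictionaries as plain maps; the type names "Example" and "Ex" and the helper samplePool() are hypothetical, and the real method works on COSObject and COSName instances instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TypeFilterSketch {
    /**
     * Returns the ids of all dictionaries whose "Type" entry equals type1 or,
     * when type2 is non-null, type2. Dictionaries are modelled as plain maps.
     */
    static List<Integer> objectsByType(Map<Integer, Map<String, String>> pool,
                                       String type1, String type2) {
        List<Integer> matches = new ArrayList<>();
        for (Map.Entry<Integer, Map<String, String>> entry : pool.entrySet()) {
            String type = entry.getValue().get("Type");
            if (type != null && (type.equals(type1) || type.equals(type2))) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    /** Hypothetical sample data: four objects, one of them without a Type entry. */
    static Map<Integer, Map<String, String>> samplePool() {
        Map<Integer, Map<String, String>> pool = new TreeMap<>();
        pool.put(1, dictionary("Type", "Example"));
        pool.put(2, dictionary("Type", "Other"));
        pool.put(3, dictionary("Type", "Ex"));
        pool.put(4, new HashMap<>());
        return pool;
    }

    private static Map<String, String> dictionary(String key, String value) {
        Map<String, String> result = new HashMap<>();
        result.put(key, value);
        return result;
    }

    public static void main(String[] args) {
        // Matches both the full name and its abbreviation.
        System.out.println(objectsByType(samplePool(), "Example", "Ex"));
    }
}
```

        Passing null for the second type reduces the filter to the single-type case in this sketch.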
      • setVersion

        public void setVersion(float versionValue)
        This will set the header version of this PDF document.
        Parameters:
        versionValue - The version of the PDF document.
      • getVersion

        public float getVersion()
        This will get the version extracted from the header of this PDF document.
        Returns:
        The header version.
      • setDecrypted

        public void setDecrypted()
        Signals that the document is decrypted completely.
      • isDecrypted

        public boolean isDecrypted()
        Indicates whether an encrypted PDF has already been decrypted after parsing.
        Returns:
        true if the PDF is decrypted.
      • isEncrypted

        public boolean isEncrypted()
        This will tell if this is an encrypted document.
        Returns:
        true if this document is encrypted.
      • getEncryptionDictionary

        public COSDictionary getEncryptionDictionary()
        This will get the encryption dictionary if the document is encrypted or null if the document is not encrypted.
        Returns:
        The encryption dictionary.
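        Because the getter returns null for an unencrypted document, callers need a null check before reading entries. The sketch below uses only the JDK and models the trailer as a map of names to dictionaries. It assumes the encryption dictionary is stored under the trailer's Encrypt entry, as in the PDF file format; the class and method names are hypothetical and do not describe how PDFBox implements the lookup.

```java
import java.util.HashMap;
import java.util.Map;

final class EncryptionLookupSketch {
    /** Returns the dictionary stored under "Encrypt", or null if there is none. */
    static Map<String, String> encryptionDictionary(Map<String, Map<String, String>> trailer) {
        return trailer == null ? null : trailer.get("Encrypt");
    }

    /** In this sketch a document counts as encrypted when an encryption dictionary exists. */
    static boolean isEncrypted(Map<String, Map<String, String>> trailer) {
        return encryptionDictionary(trailer) != null;
    }

    /** Shows the null check a caller needs before reading entries. */
    static String describe(Map<String, Map<String, String>> trailer) {
        Map<String, String> encryption = encryptionDictionary(trailer);
        if (encryption == null) {
            return "not encrypted";
        }
        return "encrypted, filter=" + encryption.get("Filter");
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> trailer = new HashMap<>();
        System.out.println(describe(trailer));

        Map<String, String> encryption = new HashMap<>();
        encryption.put("Filter", "Standard");
        trailer.put("Encrypt", encryption);
        System.out.println(describe(trailer));
    }
}
```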
      • setEncryptionDictionary

        public void setEncryptionDictionary(COSDictionary encDictionary)
        This will set the encryption dictionary. This should only be called when encrypting the document.
        Parameters:
        encDictionary - The encryption dictionary.
      • getDocumentID

        public COSArray getDocumentID()
        This will get the document ID.
        Returns:
        The document id.
      • setDocumentID

        public void setDocumentID(COSArray id)
        This will set the document ID.
        Parameters:
        id - The document id.
      • getTrailer

        public COSDictionary getTrailer()
        This will get the document trailer.
        Returns:
        the document trailer dict
      • setTrailer

        public void setTrailer(COSDictionary newTrailer)
        This will set the document trailer. (Source note: MIT added; maybe this should not be supported, as the trailer is a persistence construct.)
        Parameters:
        newTrailer - the document trailer dictionary
      • getHighestXRefObjectNumber

        public long getHighestXRefObjectNumber()
        Internal PDFBox use only. Get the object number of the highest XRef stream. This is needed to avoid reusing such a number in incremental saving.
        Returns:
        The object number of the highest XRef stream, or 0 if there was no XRef stream.
      • setHighestXRefObjectNumber

        public void setHighestXRefObjectNumber(long highestXRefObjectNumber)
        Internal PDFBox use only. Sets the object number of the highest XRef stream. This is needed to avoid reusing such a number in incremental saving.
        Parameters:
        highestXRefObjectNumber - The object number of the highest XRef stream.
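        To illustrate why this number matters during incremental saving, the sketch below picks the next free object number as one more than the largest number already in use, counting the recorded XRef stream number as used. It uses only the JDK; nextFreeObjectNumber is a hypothetical helper, and the actual allocation logic inside PDFBox may differ.

```java
import java.util.Arrays;
import java.util.Collection;

final class ObjectNumberSketch {
    /**
     * Returns an object number that collides neither with the known objects
     * nor with the highest XRef stream object number (0 means "no XRef stream").
     */
    static long nextFreeObjectNumber(Collection<Long> usedObjectNumbers,
                                     long highestXRefObjectNumber) {
        long highest = highestXRefObjectNumber;
        for (long number : usedObjectNumbers) {
            highest = Math.max(highest, number);
        }
        return highest + 1;
    }

    public static void main(String[] args) {
        // Objects 1 to 3 are known, but an XRef stream already used number 7.
        System.out.println(nextFreeObjectNumber(Arrays.asList(1L, 2L, 3L), 7L));
    }
}
```

        Without the second argument, the sketch would hand out 4 in the example and could later reuse the XRef stream's number.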
      • accept

        public void accept(ICOSVisitor visitor)
                    throws java.io.IOException
        Visitor pattern double dispatch method.
        Specified by:
        accept in class COSBase
        Parameters:
        visitor - The object to notify when visiting this object.
        Throws:
        java.io.IOException - If an error occurs while visiting this object.
      • close

        public void close()
                   throws java.io.IOException
        This will close all storage and delete the temporary files.
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Throws:
        java.io.IOException - If there is an error closing resources.
      • isClosed

        public boolean isClosed()
        Returns true if this document has been closed.
        Returns:
        true if the document is already closed, false otherwise
      • getObjectFromPool

        public COSObject getObjectFromPool(COSObjectKey key)
        This will get an object from the pool.
        Parameters:
        key - The object key.
        Returns:
        The object in the pool or a new one if it has not been parsed yet.
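        The "existing object or a new one" behavior is a get-or-create lookup. The sketch below shows that pattern with java.util.Map.computeIfAbsent, using only the JDK. PooledObject and the string key are hypothetical simplifications of COSObject and COSObjectKey, and the sketch does not claim to mirror the PDFBox implementation.

```java
import java.util.HashMap;
import java.util.Map;

final class ObjectPoolSketch {
    /** Hypothetical placeholder for an indirect object that may not have been parsed yet. */
    static final class PooledObject {
        final long number;
        final int generation;

        PooledObject(long number, int generation) {
            this.number = number;
            this.generation = generation;
        }
    }

    private final Map<String, PooledObject> pool = new HashMap<>();

    /** Returns the pooled object for the key, creating a placeholder on first access. */
    PooledObject getObjectFromPool(long number, int generation) {
        String key = number + " " + generation;
        return pool.computeIfAbsent(key, k -> new PooledObject(number, generation));
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        ObjectPoolSketch pool = new ObjectPoolSketch();
        PooledObject first = pool.getObjectFromPool(12, 0);
        // The second lookup returns the same instance instead of creating another one.
        System.out.println(first == pool.getObjectFromPool(12, 0));
    }
}
```

        Returning the same instance for repeated lookups means every reference to an indirect object shares one placeholder, which can then be filled in once when it is parsed.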
      • addXRefTable

        public void addXRefTable(java.util.Map<COSObjectKey,java.lang.Long> xrefTableValues)
        Populate XRef HashMap with given values. Each entry maps ObjectKeys to byte offsets in the file.
        Parameters:
        xrefTableValues - xref table entries to be added
      • getXrefTable

        public java.util.Map<COSObjectKey,java.lang.Long> getXrefTable()
        Returns the xrefTable which is a mapping of ObjectKeys to byte offsets in the file.
        Returns:
        mapping of ObjectKeys to byte offsets
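        The xref table is a map from an (object number, generation) pair to a byte offset. The sketch below models that with only the JDK. Key is a hypothetical stand-in for COSObjectKey, and the put-all merge (later entries replace earlier ones for the same key) is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class XrefTableSketch {
    /** Hypothetical stand-in for an object key: object number plus generation. */
    static final class Key {
        final long number;
        final int generation;

        Key(long number, int generation) {
            this.number = number;
            this.generation = generation;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key that = (Key) other;
            return number == that.number && generation == that.generation;
        }

        @Override
        public int hashCode() {
            return Objects.hash(number, generation);
        }
    }

    private final Map<Key, Long> xrefTable = new HashMap<>();

    /** Merges the given entries into the table. */
    void addXRefTable(Map<Key, Long> values) {
        xrefTable.putAll(values);
    }

    /** Returns the byte offset for the object, or null if the table has no entry. */
    Long offsetOf(long number, int generation) {
        return xrefTable.get(new Key(number, generation));
    }

    public static void main(String[] args) {
        XrefTableSketch table = new XrefTableSketch();
        Map<Key, Long> entries = new HashMap<>();
        entries.put(new Key(1, 0), 15L);
        entries.put(new Key(2, 0), 230L);
        table.addXRefTable(entries);
        System.out.println(table.offsetOf(2, 0));
    }
}
```

        The key needs value-based equals and hashCode so that a freshly built key finds the stored entry.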
      • setStartXref

        public void setStartXref(long startXrefValue)
        This method sets the startxref value of the document. This will only be needed for incremental updates.
        Parameters:
        startXrefValue - the value for startXref
      • getStartXref

        public long getStartXref()
        Returns the startXref position of the parsed document. This will only be needed for incremental updates.
        Returns:
        a long with the old position of the startxref
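        For orientation, the startxref value is the byte offset written after the startxref keyword near the end of a PDF file. The sketch below extracts it from raw bytes using only the JDK. findStartXref is a hypothetical helper, far stricter and simpler than a real parser, and is not how PDFBox reads the value.

```java
import java.nio.charset.StandardCharsets;

final class StartXrefSketch {
    private static final String KEYWORD = "startxref";

    /** Returns the offset after the last "startxref" keyword, or -1 if absent or malformed. */
    static long findStartXref(byte[] pdf) {
        // ISO-8859-1 maps every byte to one char, so string indices equal byte positions.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        int keyword = text.lastIndexOf(KEYWORD);
        if (keyword < 0) {
            return -1;
        }
        int pos = keyword + KEYWORD.length();
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        int start = pos;
        while (pos < text.length() && text.charAt(pos) >= '0' && text.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            return -1; // keyword present but no digits follow it
        }
        return Long.parseLong(text.substring(start, pos));
    }

    public static void main(String[] args) {
        byte[] tail = "trailer\n<< /Size 4 >>\nstartxref\n116\n%%EOF\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(findStartXref(tail));
    }
}
```

        An incremental update appends a new cross-reference section that points back to this old offset, which is presumably why the value is kept on the document.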
      • isXRefStream

        public boolean isXRefStream()
        Determines if the trailer is an XRef stream or not.
        Returns:
        true if the trailer is an XRef stream
      • setIsXRefStream

        public void setIsXRefStream(boolean isXRefStreamValue)
        Sets isXRefStream to the given value. You need to take care that the version of your PDF is 1.5 or higher.
        Parameters:
        isXRefStreamValue - the new value for isXRefStream
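        The version requirement can be checked before enabling XRef streams. The sketch below is a JDK-only guard; supportsXRefStreams is a hypothetical helper, and the float parameter mirrors the type used by getVersion() on this page.

```java
final class XRefStreamGuardSketch {
    /** Cross-reference streams require a PDF version of 1.5 or higher. */
    static boolean supportsXRefStreams(float version) {
        return version >= 1.5f;
    }

    public static void main(String[] args) {
        System.out.println(supportsXRefStreams(1.4f));
        System.out.println(supportsXRefStreams(1.7f));
    }
}
```

        A caller could use such a check on the value of getVersion() and raise the version with setVersion before calling setIsXRefStream(true).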
      • hasHybridXRef

        public boolean hasHybridXRef()
        Determines if the PDF has hybrid cross references, both plain tables and streams.
        Returns:
        true if the PDF has hybrid cross references
      • setHasHybridXRef

        public void setHasHybridXRef()
        Marks the PDF as a document using hybrid cross references.