I am working with digitally signed PDFs using Apache PDFBox, and I am trying to design a revision history mechanism for documents where signatures may be removed in later revisions.
Context
A PDF can go through multiple incremental updates (revisions), for example:
Revision 1 → Document is created
Revision 2 → Signed by User A
Revision 3 → Signature field is removed (or signature invalidated)
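To make the revision structure concrete: as I understand it, each incremental update is appended to the end of the file and ends with its own %%EOF marker, so revision boundaries can be approximated by scanning for that marker. The following is a minimal sketch using only the JDK. The class and method names are hypothetical (they are not part of PDFBox), and a naive byte scan like this could be misled by a literal %%EOF inside a stream.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class RevisionBoundaries {
    private static final byte[] EOF_MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    /** Returns, for each %%EOF marker found, the offset just past it and one optional trailing EOL. */
    static List<Integer> revisionEnds(byte[] pdf) {
        List<Integer> ends = new ArrayList<>();
        for (int i = 0; i + EOF_MARKER.length <= pdf.length; i++) {
            if (matchesAt(pdf, i)) {
                int end = i + EOF_MARKER.length;
                // Include an optional trailing CR, LF, or CRLF in the revision.
                if (end < pdf.length && pdf[end] == '\r') end++;
                if (end < pdf.length && pdf[end] == '\n') end++;
                ends.add(end);
                i = end - 1; // resume scanning right after this marker
            }
        }
        return ends;
    }

    private static boolean matchesAt(byte[] data, int offset) {
        for (int j = 0; j < EOF_MARKER.length; j++) {
            if (data[offset + j] != EOF_MARKER[j]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Fake content standing in for a two-revision file; not a valid PDF.
        byte[] fake = "%PDF-1.7\nrev1\n%%EOF\nrev2\n%%EOF\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(revisionEnds(fake)); // prints [20, 31]
    }
}
```

Each returned offset marks the end of one revision, so `Arrays.copyOf(pdf, end)` would give the bytes of the document as it stood at that revision. My assumption is that each such prefix could then be loaded as a separate document (`Loader.loadPDF(byte[])` in PDFBox 3.x, `PDDocument.load(byte[])` in 2.x).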
Problem
After a signature is removed in a later revision:
There is no built-in way in the PDF specification to determine who removed the signature
In some cases, the removal may leave no visible trace at all (especially if it's the last revision)
The PDF only reflects the current state plus its incremental updates, not explicit actions such as who deleted what
Questions
Best Practice (Core Question): What is the recommended approach for maintaining a revision history or audit trail when signatures are removed from a PDF?
Auditability: Since the PDF itself does not track who removed a signature:
Is it standard practice to maintain this information externally (e.g., database logs)?
Or should we embed custom metadata inside the PDF (e.g., XMP, custom dictionary, or annotation)?
User-Friendly Display: If we want to show something like:
Revision 3: Signature removed by John
or
Revision 3: Signature removed by System
Is this purely an application-level construct, rather than something derived from the PDF itself?
System Actions: If a backend system removes a signature:
What is the recommended way to represent this in audit logs or revision history?
Should it be labeled explicitly (e.g., "Removed by System"), and how is this typically implemented?
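To illustrate what I mean by an application-level record, here is a minimal sketch of an audit entry kept outside the PDF (for example, as a database row). Everything here is hypothetical: the class name, the fields, and the USER/SYSTEM distinction are my own assumptions and do not come from PDFBox or the PDF specification.

```java
import java.time.Instant;

// Hypothetical application-level audit record; nothing here is derived from the PDF itself.
final class SignatureAuditEntry {
    enum ActorType { USER, SYSTEM }

    final int revision;        // revision number as the application counts it
    final String fieldName;    // name of the signature field that was removed
    final ActorType actorType; // who performed the removal: a person or a backend process
    final String actorName;    // only meaningful when actorType is USER
    final Instant timestamp;   // when the application recorded the removal

    SignatureAuditEntry(int revision, String fieldName, ActorType actorType,
                        String actorName, Instant timestamp) {
        this.revision = revision;
        this.fieldName = fieldName;
        this.actorType = actorType;
        this.actorName = actorName;
        this.timestamp = timestamp;
    }

    /** Builds the display line shown in the revision history. */
    String describe() {
        String who = (actorType == ActorType.SYSTEM) ? "System" : actorName;
        return "Revision " + revision + ": Signature removed by " + who;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(new SignatureAuditEntry(3, "Signature1", ActorType.USER, "John", now).describe());
        System.out.println(new SignatureAuditEntry(3, "Signature1", ActorType.SYSTEM, null, now).describe());
    }
}
```

With a record like this, the "Removed by System" label would be a property of the log entry rather than something read back from the document.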
Implementation with PDFBox: Using PDFBox, is there a recommended way to:
Detect that a signature field existed in a previous revision but is missing in the current one?
Compare revisions to infer removal?
Or is it necessary to track all changes outside the PDF?
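The kind of comparison I have in mind could be sketched as below, assuming the signature field names of each revision have already been collected into one set per revision. My assumption is that those names would come from loading each revision with PDFBox and calling `PDDocument.getSignatureFields()` followed by `getFullyQualifiedName()` on each field. The class and method names in the sketch are hypothetical, and the code itself uses only the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: infers removals from per-revision sets of signature field names.
final class SignatureRemovalDiff {

    /** Reports every field name present in one revision but absent from the next. */
    static List<String> removals(List<Set<String>> namesPerRevision) {
        List<String> findings = new ArrayList<>();
        for (int i = 1; i < namesPerRevision.size(); i++) {
            // Names in the previous revision that are missing from this one, in sorted order.
            Set<String> missing = new TreeSet<>(namesPerRevision.get(i - 1));
            missing.removeAll(namesPerRevision.get(i));
            for (String name : missing) {
                // Revisions are numbered from 1 for display.
                findings.add("Revision " + (i + 1) + ": signature field '" + name + "' no longer present");
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        // Revision 1: no signatures; Revision 2: signed by User A; Revision 3: field removed.
        List<Set<String>> history = List.of(Set.of(), Set.of("SigA"), Set.of());
        removals(history).forEach(System.out::println);
        // prints: Revision 3: signature field 'SigA' no longer present
    }
}
```

This only detects a field that disappears entirely. A field that is still present but whose signature value was cleared would need a separate check, and none of this says who performed the removal.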
Actual PDF of the info above can be found here
