Learn how document importing works in DataSnipper, particularly in co-authoring scenarios. This article explains document behaviors, limitations, and best practices so you can understand and troubleshoot issues with confidence.
Key Terms
- Workbook: An Excel workbook (.xlsx) where users work with DataSnipper.
- Original Document: The file before PDF conversion (e.g., .docx, .html).
- Original PDF Document: The PDF version of the original file used as the source of truth.
- Imported Document: The document shown in the DataSnipper Document Viewer (can be referenced or embedded).
- Embedded Document: A PDF document stored within the workbook via CustomXML.
- Referenced Document: A link to an external PDF file that is not saved inside the workbook.
- Co-Authoring: A Microsoft Excel feature allowing multiple users to open and edit the same workbook simultaneously with AutoSave enabled. Presence of collaborators is shown in the top-left corner of the Excel interface.
Referenced Documents
By default, when a user imports a document in DataSnipper, it is added as a reference to a file located on the user’s local machine or a shared network drive.
Scenario:
- User A imports a PDF from a shared drive.
- User B opens the same workbook. If they have access to the same file path, they will see the document.
- If User B does not have access, the document will appear missing.
Recommendation: Always import documents from shared, network-accessible locations to ensure visibility across all collaborators.
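Before sharing a workbook, it can help to confirm that every referenced path actually resolves from a collaborator's machine. The sketch below is only an illustration and does not use any DataSnipper API: the function name missing_references and the list of paths are hypothetical, and the check is a plain file-system lookup.

```python
from pathlib import Path


def missing_references(paths):
    """Return the referenced document paths that cannot be opened from this machine."""
    # A path that exists for User A (e.g. a local drive) may not exist for User B.
    return [p for p in paths if not Path(p).is_file()]
```

Running this on each collaborator's machine with the same list of paths shows which referenced documents would appear missing for that user.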
Embedded Documents (Using CustomXML) - Default Behavior
When embedding is enabled, the imported PDF is saved directly inside the Excel workbook via CustomXML. This allows other users to view the document even without access to the original file path.
Limitations of Embedded Documents in Co-Authoring
- CustomXML Is User-Specific: Each user in a co-authoring session maintains their own local CustomXML. These versions do not sync in real time.
- Save Behavior Is Non-Deterministic: Only the last user to close the workbook writes their CustomXML data into the file. All other users’ embedded data is discarded. This means the version of the embedded documents in the final file is determined by who closes the file last.
Example Problem Scenario:
- User A embeds a document. It is saved to their CustomXML.
- User B joins the co-authoring session. Since CustomXML is not shared, User B cannot see the document.
- If User B closes the workbook after User A, their version is saved, and User A’s embedded document is lost.
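The effect of the closing order can be illustrated with a small simulation. This is a simplified model of the behavior described above, not DataSnipper's or Excel's actual implementation: the Session class, the final_embedded function, and the file name invoice.pdf are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """One co-author's private set of embedded documents (hypothetical model)."""
    user: str
    embedded: set = field(default_factory=set)


def final_embedded(close_order):
    """Return the documents left in the saved file for a given closing order."""
    saved = set()
    for session in close_order:
        # Each close overwrites what was written before, so only the last one counts.
        saved = set(session.embedded)
    return saved


user_a = Session("A", {"invoice.pdf"})
user_b = Session("B")  # User B never received User A's embedded document

print(final_embedded([user_a, user_b]))  # set()
print(final_embedded([user_b, user_a]))  # {'invoice.pdf'}
```

In this model the same two sessions produce different final files depending only on who closes last, which is why the outcome is hard to predict in practice.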
Fallback Mechanism
To mitigate potential data loss, DataSnipper includes a fallback mechanism:
- If a co-author cannot see a document, DataSnipper attempts to re-import the PDF from its original location into the co-author’s CustomXML.
- This fallback works only if the co-author has access to the original file path.
Important Note: Since each user maintains a separate embedded copy, the final embedded document is the version saved by the last person to close the workbook.
Best Practice: Even when using embedding, import documents from shared, accessible locations to allow fallback re-imports and ensure consistent visibility.
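The fallback logic described in this section can be sketched roughly as follows. This is an assumption-laden illustration rather than DataSnipper's code: resolve_document, the local_embedded dictionary (standing in for one user's CustomXML), and original_path are hypothetical names.

```python
from pathlib import Path


def resolve_document(name, local_embedded, original_path):
    """Return the PDF bytes a co-author would see, or None if the document is missing.

    local_embedded maps document names to bytes and stands in for this user's
    own embedded copies; it is updated when a re-import succeeds.
    """
    if name in local_embedded:
        return local_embedded[name]  # already embedded for this user
    source = Path(original_path)
    if source.is_file():
        # Fallback: re-import from the original location into this user's copy.
        local_embedded[name] = source.read_bytes()
        return local_embedded[name]
    return None  # no embedded copy and no access to the original path
```

The last branch is the case the best practice is meant to avoid: if the original file sits on a location the co-author cannot reach, there is nothing to re-import from.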
Recommendation: Disable Embedding in Co-Authoring
Due to the limitations imposed by Microsoft Excel’s handling of CustomXML:
- Co-authoring does not support real-time synchronization of embedded content.
- Embedded documents are not reliably retained unless only one person works on the file or users coordinate the closing order precisely.
- The fallback mechanism is limited to users with access to the original file paths.
Recommended Setting: Disable embedding when working in co-authoring mode to reduce the risk of confusion and data loss.