The .lot / .lot.html universal file format
A universal single file format for sharing anything to anyone anywhere.
What's this lot for?
Have you ever needed to send someone a file with context describing what's in it? Perhaps you're at work and need to send a client a report with a spreadsheet attachment or a printable technical publication with supporting material. Maybe you just want to send someone a universally standard document they can print, but you also need it to be readable on a smartphone.
The .lot format, initially .lot.html for backward compatibility, is designed to cover all of those scenarios with a single universally compatible file format suitable for display, printing, and arbitrary attachments.
Send the lot over
The long-term goal is to make the sentence "send the lot over" easy for all involved. When everything is in a single file, you can just send the lot over to someone. It can include the document itself in a format designed for screen-first viewing, a printable variant if needed, and any supporting files. Under the hood, a .lot file is what's known as a polyglot file: it is simultaneously a valid standalone .html document, .pdf document, and .zip archive, as described in the .lot technical specification section. In general, you can think of it as just a container to hold everything you might visualize putting in a document in a way that's easy to use regardless of your OS or platform.
A documented history
Human-to-human communication has ranged from hand-delivered cuneiform complaint tablets to modern mobile missives without even the perceived need for standard punctuation, but the printed page has likely had the most reach over time. However, the world has shifted from paper-first to screen-first. Traditional desktop publishing and sending pre-formatted .pdf files primarily made for printing needn't be the continuing norm.
Some conventions have persisted despite being nigh irrelevant to most of our lives. Were you really going to print that throwaway vendor whitepaper .pdf attached to the email you received after attending a conference? Unless mandated by work requirements, the answer is likely no. Instead, it was probably a pain trying to read it on your phone. Yet many of us continue to use traditional desktop publishing paradigms and design for a print-first world that no longer exists.
Then there are data types that aren't represented well on a single piece of paper, such as large spreadsheets. Most fields work around this by sending a printable document alongside a separate spreadsheet attachment. We've lived with this arrangement because there wasn't anything better. The .lot paradigm is better and opinionated about it too, and not just for the sake of saving trees.
Shared documents serve a purpose. They tend to be a more formal way to convey information, such as a report or an important communication. Still, even formal documents benefit from the flexibility we've enjoyed from our favorite apps that reflow text depending on screen size and orientation, such as this document you're reading now if you're viewing it as a responsive webpage. From dark mode to easy access to a table of contents, the modern digital experience is naturally more flexible to user needs than printed paper. Still, some realms will need physical signature pages and legally mandated formats, so those still have to be supported too.
How can I create a .lot file?
Right now, the process is manual. To put it bluntly, it's pretty rough at the moment and needs to be automated. It involves three components:
- A standalone HTML file
- An optional PDF file
- Any additional files included in a ZIP file
The files are then combined into a single .lot or .lot.html file. One of the goals of the .lot file format is to make the above process seamless and incorporated in various applications that deal with documents. Saying "a standalone HTML file" is doing a lot of heavy lifting, but the file you're reading now was created using Obsidian along with the Obsidian Webpage HTML Export plugin. The .lot file format may eventually include specifications defining how the standalone HTML file should ideally be structured, but for now the specification assumes the HTML component will be generated in a similar standalone fashion.
The PDF portion may not be able to use object streams due to how they work, and the tooling should eventually handle that gracefully, but for now any .pdf files should be converted as described below.
Fortunately, the ZIP portion can contain any arbitrary files. This can include original desktop formats, more complex printable formats, application specific binary files, or anything else.
How can I use a .lot file?
Using a .lot file is straightforward - just give it the extension for the view you want and open it. For now, it makes sense to create the file with a .lot.html extension so it can be opened in a web browser by double-clicking on it. Renaming the file to .zip allows opening it as a .zip archive (try it with this file to retrieve the original Markdown used to generate this document). Renaming the file to .pdf allows printing it. Specifically, if you're viewing this file in a web browser, you can right-click on the page and select Save page as... to save a copy under each of those extensions.
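The renaming described above can be sketched in a few lines of Python. This is only an illustration, assuming the input is already a well-formed .lot polyglot; the helper name expand_lot and the file names in the usage note are hypothetical and not part of the proposal.

```python
import shutil
from pathlib import Path


def expand_lot(lot_path, out_dir):
    """Copy one .lot file to .html, .pdf, and .zip names (hypothetical helper).

    The bytes are never changed; only the extension differs, so each copy
    opens in whatever application the OS associates with that extension.
    """
    src = Path(lot_path)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Strip ".lot.html" or ".lot" to get a base name for the copies.
    name = src.name
    for suffix in (".lot.html", ".lot"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break

    copies = {}
    for ext in ("html", "pdf", "zip"):
        target = out / f"{name}.{ext}"
        shutil.copyfile(src, target)
        copies[ext] = target
    return copies
```

For example, expand_lot("polyg.lot.html", "views") would leave polyg.html, polyg.pdf, and polyg.zip in a views directory, all byte-for-byte identical to the original.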
In the future, the goal would be for the .lot file format to be a "first-class" well-known file format (with no disrespect intended to the LaTeX List of Tables file extension, the only other known common use of .lot files). It should be possible for various operating systems to directly open .lot files with the option to display them in each of the contexts within the .lot. For instance, right-clicking on a .lot file would offer to open it as an HTML document in a web browser, as a PDF document in the system default PDF viewer (which might also be in a web browser), or as a ZIP archive in the system archive tool.
Are .lot files safe?
A .lot file is as safe as a .zip file and no more. A .lot file is as safe as a .pdf file and no more. A .lot file is as safe as an .html file and no more. The intent is for the file to be a document consumed by other applications and not directly executable in any way, but there could always be flaws or vulnerabilities in either the standards themselves or how standard files are parsed, which are beyond the scope of what the .lot file specification can address. This admittedly somewhat dodges the question, but ultimately the answer boils down to ".lot is as safe as .html, .pdf, and .zip files are".
Whose fault is it we now have 15 standards?
This document describes a proposal initiated by Allan Cecil (known online as dwangoAC, keeper of TASBot) to create a universal file format.
You can blame him for this disaster in the making. Or panacea. Or something. Making universal standards is messy. Just ask the USB Consortium.
The .lot technical specification
This section describes technical details around how a .lot file is organized and steps to create a proposed .lot / .lot.html file manually. This section needs significant expansion and will require updates as the specification is defined. For the moment, this mostly just describes a very manual method of creating a .lot file until the process is automated.
Manual .lot file creation
This section assumes you've prepared the following prerequisites:
- Create a Document.html file (that itself isn't a polyglot already, such as exporting a document using the Obsidian Webpage HTML Export plugin to a local file)
- Create a Document.pdf file (that itself isn't a polyglot already and doesn't contain object stream data; if necessary, use qpdf --object-streams=disable to remove it)
- Create a Document.zip file (that itself isn't a polyglot already)
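Before combining anything, it can help to check whether Document.pdf still contains object streams. The following is a rough heuristic sketch, not a PDF parser: it assumes that an object stream announces itself with the name /ObjStm in an uncompressed dictionary, so a plain byte search is enough to spot one. The function names are hypothetical.

```python
from pathlib import Path


def has_object_streams(pdf_bytes):
    """Heuristic: True if the raw PDF bytes mention an object stream.

    Object streams carry /Type /ObjStm in their stream dictionary, which
    sits outside the compressed stream data, so a byte search finds it.
    A false positive is possible if the same bytes occur by chance.
    """
    return b"/ObjStm" in pdf_bytes


def pdf_file_has_object_streams(path):
    """Convenience wrapper that reads the file from disk (hypothetical)."""
    return has_object_streams(Path(path).read_bytes())
```

If pdf_file_has_object_streams("Document.pdf") returns True, the qpdf step listed in the prerequisites is the one to run before continuing.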
Once all files are prepared in the same directory, use truepolyglot to combine the .html and .pdf files as a PDF: truepolyglot pdfany --payload1file Document.html --pdffile Document.pdf pdf.html. Then open pdf.html in a text editor to move the PDF header into an HTML comment; the file will start out looking like this:
%PDF-1.7
%âãÏÓ
1 0 obj
<<
/Filter /FlateDecode
/Length 4656576
>>
stream
<!DOCTYPE html>
<html lang="en"><head>
Move the <!DOCTYPE html> declaration to the very start of the file and relocate everything in the PDF intro portion into an HTML comment block right after it so it looks like this:
<!DOCTYPE html><!--%PDF-1.7
%âãÏÓ
1 0 obj
<<
/Filter /FlateDecode
/Length 4656576
>>
stream
-->
<html lang="en"><head>
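The header edit shown above can also be expressed as a byte-level transformation. This is a minimal sketch under the assumption that the first <!DOCTYPE html> in the file is the one from Document.html; the function name mask_pdf_header is hypothetical. Like the manual edit, it leaves /Length and any offsets inside the PDF untouched.

```python
DOCTYPE = b"<!DOCTYPE html>"


def mask_pdf_header(data):
    """Move the bytes before the doctype into an HTML comment after it."""
    pos = data.find(DOCTYPE)
    if pos <= 0:
        # -1: no doctype at all; 0: nothing in front of it to mask.
        raise ValueError("expected PDF header bytes before <!DOCTYPE html>")

    pdf_intro = data[:pos]             # "%PDF-1.7 ... stream\n"
    rest = data[pos + len(DOCTYPE):]   # "\n<html ..." onward

    if b"-->" in pdf_intro:
        # The comment would end early and expose the rest of the header.
        raise ValueError("PDF intro contains '-->' and cannot be commented out")

    return DOCTYPE + b"<!--" + pdf_intro + b"-->" + rest
```

Working on bytes rather than decoded text is a deliberate choice here: the second line of the PDF header is a binary marker, and a text editor or codec that re-encodes it could silently change the file.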
In the same file, locate </body></html> - in my file, that section looks like this:
</script></div></div></div></div></body></html>
endstream
endobj
2 0 obj
<<
/Type /Pages
Replace the </body></html> portion with an opening HTML comment marker (<!--) so the comment encapsulates the remainder of the file; that line should then look something like this:
</script></div></div></div></div><!--
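The same edit can be sketched in Python. This assumes the last </body></html> in the file is the real end of the HTML document (an assumption that would fail if the PDF remainder happened to contain the same bytes); the function name open_trailing_comment is hypothetical.

```python
CLOSING = b"</body></html>"


def open_trailing_comment(data):
    """Swap the final </body></html> for '<!--' so the rest is commented out."""
    pos = data.rfind(CLOSING)
    if pos < 0:
        raise ValueError("no </body></html> found")
    # Everything after this point (PDF objects, later the ZIP data) ends up
    # inside an HTML comment that is closed at the very end of the file.
    return data[:pos] + b"<!--" + data[pos + len(CLOSING):]
```

The closing tags are not lost for good: they come back on the last line of the finished file, after the comment is closed.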
Save the resulting file as masked.html. Next, combine the new file with the zip portion using truepolyglot zipany --payload1file masked.html --zipfile Document.zip polylite.zip. Finally, add --></body></html> as a new line at the end of polylite.zip and save it as the final file, perhaps polyg.lot.html.
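The final step, plus a quick sanity check that the ZIP view survives it, might look like the sketch below. It relies on Python's zipfile module locating the end of the archive by searching backward from the end of the file, so a short line of trailing text is tolerated; other ZIP tools may behave differently. The names close_lot and zip_names are hypothetical.

```python
import io
import zipfile

CLOSER = b"\n--></body></html>\n"


def close_lot(polyglot_bytes):
    """Append the line that ends the HTML comment and closes the document."""
    return polyglot_bytes + CLOSER


def zip_names(lot_bytes):
    """List archive members to confirm the ZIP view still opens."""
    with zipfile.ZipFile(io.BytesIO(lot_bytes)) as archive:
        return archive.namelist()
```

A typical use would be to read polylite.zip as bytes, pass it through close_lot, write the result to polyg.lot.html, and then call zip_names on those bytes to see the attached files listed.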