Tuesday, October 25, 2022 at 5:09 PM
The Hypertext HTML Document Editor
World Wide Web: Phase 2
Hypertext is a new rich-text editor for creating documents using HTML instead of an 18-year-old text format or complex word processor files. It's an app created using web technologies for creating actual documents, not websites, web pages or web apps.
In the second decade of the 21st century, there is now a full-featured, powerful web browser on literally every device with a screen. These browsers - the result of countless millions of man hours - are more than capable of being used as editors for HTML. In fact, most writing is moving online, using web-based word processors like Google Docs or Microsoft Office 365, blog creation sites like WordPress, Substack or Medium, or even services like Slack or Notion.
What if you just want to write a document on your computer, using basic rich-text formatting like bold, italic, headings, etc.? The choices for documentation, a report, an academic paper or something similar aren't great. You can use a bloated desktop word processor like Microsoft Word or Pages on your Mac, built-in text editors like WordPad or TextEdit which are nearly useless, or a text editor to create a plain text file with awkward, non-standard markup for formatting.
The question is, why can't we just use HTML for documents - as it was originally designed?
Phase 2 -- Target: 6 months from start
In this important phase, we aim to allow the creation of new links and new material by readers. At this stage, authorship becomes universal.
- Sir Tim Berners-Lee, 1990
WorldWideWeb: Proposal for a HyperText Project
The problem is that HTML has become a read-only format, even though browsers have been more than capable of acting as both viewer and editor for years. There's no real reason for this beyond a lack of simple organization and standardization. HTML can now do so much that any attempt to create a consumer-focused app to edit it soon becomes unfocused and unusable. Inevitably these apps end up as full-featured "page design" tools, bogged down with so many features that they become unwieldy and go unused.
For the past decade or so, web browsers have focused on incorporating more and more interactive functionality, becoming a platform for powerful JavaScript-powered applications, and have lost sight of the web's original vision of being a platform where anyone could create content that could be both read and updated by anyone. In fact, some of the earliest browsers, like Netscape Navigator, had an editor built in. Today, creating a page in HTML is done by developers, and using a browser to create content is limited.
Project Fugu is an effort to add the functionality needed to create full-fledged, web-powered desktop applications which are indistinguishable from native apps. Hypertext takes advantage of some of these recently launched features to create the first HTML Document editor.
HTML Documents, as the name implies, are rich-text documents using web standards. Hypertext is both a web app for creating and editing these documents and a proposal for a new .htmd document format. The vision is to create a specification for a standard, safe, guaranteed-editable document using a strict subset of HTML and CSS which replaces the variety of Markdown, AsciiDoc, Office file formats, etc. currently in use. The .htmd format is self-contained, non-proprietary and includes the necessary Content Security Policy header to make sure it doesn't contain JavaScript. This keeps the format focused on just content, with security and privacy built in. By using just semantic HTML tags, the documents are guaranteed to remain readable and editable, even if all CSS styles are removed. Finally, the format will have a separate MIME type and standard .htmd extension so that browsers can modify their parser and security rules accordingly (like .mht and .xhtml today) as well as add the option for basic, built-in editing.
The vision is that by finally defining a standard format, in the future every browser will enable its users to both view and edit the document, finally fulfilling the second phase for the World Wide Web that Sir TBL proposed in 1990.
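To make the idea of a strict subset a bit more concrete, here is a minimal sketch in Python of what checking a document against such rules could look like. Everything specific in it is an assumption for illustration only: the tag allowlist, the check_htmd name and the exact policy string are hypothetical and are not taken from the .htmd proposal. It assumes the policy is carried in a meta tag with script-src 'none', which is a standard way for a standalone HTML file to tell a browser not to run scripts.

```python
from html.parser import HTMLParser

# Hypothetical allowlist; the actual .htmd tag set is not defined in this post.
ALLOWED_TAGS = {
    "html", "head", "meta", "title", "style", "body",
    "h1", "h2", "h3", "h4", "h5", "h6", "p", "br", "hr",
    "strong", "em", "b", "i", "u", "a", "blockquote", "pre", "code",
    "ul", "ol", "li", "table", "thead", "tbody", "tr", "th", "td",
    "figure", "figcaption", "img",
}


class HtmdChecker(HTMLParser):
    """Collects problems that would break the assumed .htmd rules."""

    def __init__(self):
        super().__init__()
        self.problems = []
        self.has_csp = False

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. <input disabled>), so normalize.
        attr_map = {name: (value or "") for name, value in attrs}
        if tag not in ALLOWED_TAGS:
            self.problems.append(f"disallowed tag <{tag}>")
        for name in attr_map:
            if name.startswith("on"):  # inline handlers such as onclick
                self.problems.append(f"event handler attribute {name} on <{tag}>")
        if (tag == "meta"
                and attr_map.get("http-equiv", "").lower() == "content-security-policy"
                and "script-src 'none'" in attr_map.get("content", "")):
            self.has_csp = True


def check_htmd(text):
    """Return a list of problems; an empty list means the checks passed."""
    checker = HtmdChecker()
    checker.feed(text)
    checker.close()
    problems = list(checker.problems)
    if not checker.has_csp:
        problems.append("missing Content-Security-Policy meta tag that blocks scripts")
    return problems


SAMPLE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="Content-Security-Policy" content="script-src 'none'">
<title>Notes</title>
</head>
<body>
<h1>Notes</h1>
<p>You <em>really</em> should do something.</p>
</body>
</html>"""

if __name__ == "__main__":
    print(check_htmd(SAMPLE))
    print(check_htmd(SAMPLE.replace("<h1>Notes</h1>", "<script>alert(1)</script>")))
```

The first call reports no problems, and the second flags the injected script tag. A real specification would need to go further (attribute allowlists, URL schemes in links, permitted CSS), so treat this as a sketch of the shape of the check, not a validator.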
Rich text without distractions
Rich text, such as bold and italic, among other examples, shouldn't be optional or considered extraneous to language. The fact that computing technology has gone so long ignoring these essential parts of communication is bewildering. You can send a text message from your phone including a variety of customizable emojis in various skin tones, but basic text formats used for literally hundreds of years are either impossible to enter, or lost in transmission. There's more than a subtle difference between, "You really should do something," and "You really should do something". Having to write out ideas using plain text with weird symbols such as _ this _ or * this * is truly a loss, and in the 21st century, completely inexcusable.
One of the main complaints about standard word processing apps and doc formats is the distractions and compatibility problems caused by having too many visual design options available to writers and readers. Sending a highly customized or newer .docx file to someone with a different version of Word is always a disaster. Additionally, the proprietary nature of these apps results in files that are either inaccessibly stored online, or nearly impossible to version using git, to be maintained by others or to be used in post-processing, making them unusable for documentation or other collaborative writing projects.
Using a standard set of semantic HTML and CSS as the underlying markup for documents solves these problems. The class-based styling means the document's visual design will be better organized and easier to maintain, and anyone can just turn off or remove the custom styling and still see the basic rich text underneath (bold, italic, headings, tables, bullets, etc. will still be visible). The files themselves are just text, like the other text formatting options out there. But instead of being a proprietary mess, .htmd documents use a universally supported markup language familiar to literally billions of people.
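As a rough illustration of that claim, the following Python sketch removes style blocks and class/style attributes from a fragment and re-emits what is left. The names StyleStripper and strip_styling are made up for this example and are not part of Hypertext; the point is only that the semantic tags survive once the presentation layer is gone.

```python
from html import escape
from html.parser import HTMLParser


class StyleStripper(HTMLParser):
    """Re-emits markup without <style> blocks or class/style attributes."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.in_style = False

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style = True
            return
        kept = [(n, v) for n, v in attrs if n not in ("class", "style")]
        rendered = "".join(
            f" {n}" if v is None else f' {n}="{escape(v, quote=True)}"'
            for n, v in kept
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, as a start tag.
        if tag != "style":
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_style:  # drop the CSS text inside <style>
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_styling(text):
    """Return the markup with all custom styling removed."""
    stripper = StyleStripper()
    stripper.feed(text)
    stripper.close()
    return "".join(stripper.out)


STYLED = (
    '<style>.title { color: navy; }</style>'
    '<h1 class="title">Report</h1>'
    '<p style="color: red">This is <strong class="warn">important</strong>.</p>'
)

if __name__ == "__main__":
    print(strip_styling(STYLED))
    # prints: <h1>Report</h1><p>This is <strong>important</strong>.</p>
```

The heading, paragraph and strong emphasis all remain, so a browser with no stylesheet at all would still render the structure of the document.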
Open source and happy for input
If you agree, help make this a reality. The code to Hypertext is up on GitHub with an MIT license. It's just a prototype of what is possible. The goal isn't to fork this project, but to help create a new document standard. The objective is to lock down the .htmd format as well as the capabilities of a standard HTML Document Editor so that Google, Mozilla, Apple and the folks at the W3C can just adopt it wholesale and include the functionality in future browsers.
In a few years every browser should be able to view and edit HTML Documents. Browser vendors can then start to optimize their engines so that composing documents becomes fast and reliable, and every forum, blog or web-based app where people write text can start embedding HTML Document formatting sections in their pages, maybe with a <textarea rich="true"> style tag or similar. Imagine how great it would be if anyone, from children to office workers, could create a "web page" without signing up to some online service or needing a bloated app to do it.
And just think how happy TBL will be to finally have Phase 2 completed after 30 years.
-Russ