Diffstat (limited to 'markup/spine-bespoke-output')
| -rw-r--r-- | markup/spine-bespoke-output/html/homepage.index.html | 671 | 
1 files changed, 0 insertions, 671 deletions
diff --git a/markup/spine-bespoke-output/html/homepage.index.html b/markup/spine-bespoke-output/html/homepage.index.html deleted file mode 100644 index 01108df..0000000 --- a/markup/spine-bespoke-output/html/homepage.index.html +++ /dev/null @@ -1,671 +0,0 @@ -<!DOCTYPE html> -<html> -<head> -  <meta http-equiv="Content-Type" content="text/plain; charset=UTF-8" /> -  <title>≅ SiSU project sisudoc.org</title> -  <link href="./css/html_seg.css" rel="stylesheet" /> -</head> - -<body> - -<h1>≅ - SiSU for documents - structuring, publishing in multiple -formats & search</h1> - -<h2>ℹ - A short description</h2> - -<p> - -SiSU is an object-centric, lightweight markup based, document structuring, -parser, publishing and search tool for document collections. It is command line -oriented and generates static content that is currently made searchable at an -object level through an SQL database. -Markup helps define (delineate) objects (primarily various types of text block) -which are tracked in sequence, substantive objects being numbered sequentially -by the program for object citation. - -</p> - -<h2>Δ - SiSU project source</h2> - -<p> -    <a href="./projects"> -        Δ SiSU projects repo (git) -    </a><br> -    - <a href="https://git.sisudoc.org"> -        https://git.sisudoc.org -    </a><br> -</p> - -<p> -    <a href="./projects/sisu"> -        Δ SiSU (scribe): document publishing (multiple formats + search) -    </a><br> -    - <a href="https://git.sisudoc.org/sisu"> -        https://git.sisudoc.org/sisu -    </a><br> -</p> - -<p> -    <a href="./projects/sisu-markup"> -        Δ SiSU markup samples in document pods for sisu (scribe) -    </a><br> -    - <a href="https://git.sisudoc.org/sisu-markup"> -        https://git.sisudoc.org/sisu-markup -    </a><br> -</p> - -<h2>⌘ - SiSU Spine markup sample output</h2> - -<p> -To give an idea of how this works here is a small collection of documents marked -up for and generated by the software. 
The curation of topics for a collection of -specialized related documents would benefit from a consistently applied bespoke -ontology or thesaurus.<br> The documents presented are documents that have been -released under various creative commons licences, in the public domain, or the -author's work, with the exception of one that is under GPL and the old abandoned -Debian live-manual -</p> - -<p> -    <a href="./authors.html"> -       ⌘ Authors -    </a> -    (software curated from provided document header metadata)<br> -    - <a href="./authors.html"> -        https://sisudoc.org/spine/authors.html -    </a> -</p> - -<p> -    <a href="./topics.html"> -       ⌘ Topics -    </a> -    (software curated from provided document header metadata)<br> -    - <a href="./topics.html"> -        https://sisudoc.org/spine/topics.html -    </a> -</p> - -<h2>፨ - SiSU Spine search</h2> -<p> -    <a href="./spine_search"> -       ፨ Search -    </a> -    (granular search of text objects)<br> -    - <a href="https://sisudoc.org/spine_search"> -        https://sisudoc.org/spine_search -    </a> -</p> - -<div class="p"> -    <!-- SiSU Spine Search --> -    <form action="https://sisudoc.org/spine_search" target="_top" method="POST" accept-charset="UTF-8" id="search"> -    <input type="text" name="sf" size="24" maxlength="255"> -    <input type="hidden" name="db" value="spine.search.db"> -    <input type="hidden" name="sml" value="1000"> -    <input type="hidden" name="ec" value="on"> -    <input type="hidden" name="url" value="on"> -    <button type="submit" form="search"> ㏈ ፨ </button> -    </form> -    <!-- SiSU Spine Search --> -</div> - -<h2>ℹ - SiSU description</h2> - -<p> - -SiSU is an object-centric, lightweight markup based, document structuring, -parser, publishing and search tool for document collections. It is command line -oriented and generates static content that is currently made searchable at an -object level through an SQL database. 
-Markup helps define (delineate) objects (primarily various types of text block) -which are tracked in sequence, substantive objects being numbered sequentially -by the program for object citation. - -</p> -<p> - -<b>Summary.</b> An object is a unit of text within a document the most common -being a paragraph. Objects include individual headings, paragraphs, tables, -grouped text of various types such as code blocks and within poems, verse. -Objects have properties and attributes, of particular significance are headings -and their levels which provide document structure. A heading is an object with a -heirarchical value, that conceptually contains other objects (such as paragraphs -and possibly sub-headings etc.). Objects are tracked sequentially as they relate -to each other object within a document and substantive objects are numbered -sequentially, for citation purposes. Notably footnotes are not objects in -themselves, rather belonging to the object from which they are referenced, and -following their own numbering sequence. From heading objects (linked) tables of -content may be generated, and if additional metadata is provided book type -indexes can be generated that link back to the objects to which they relate. - -</p> -<p> - -<b>Unpacking this a bit further.</b> SiSU as a concept independent of its markup -language and the parsers that have been implemented, is based on the following -ideas: - -</p> -<p> - -<b>Object-Centricity. On objects:</b> In SiSU objects are the fundamental unit -from which larger constructs within a document and the document itself is built. -Breaking the document into objects provides interesting possibilities. - -</p> -<p> - -<b>Objects are fundamental building blocks:</b> Conceptually within SiSU, -objects are the building blocks or individual units of construction of a -document. 
Objects are usually blocks of text, the most common of which is the -paragraph, other examples include: individual headings, tables, grouped text of -various types which include code blocks and verse within poems, ... and as -mentioned an object could also, for example, be an image. Objects can be -formatted and placed as needed, providing flexibility and enabling multiple -types of representation across disperate formats and text recepticle, examples -including html, epub, latex (in the past mind-maps) and sql (populated at an -object level, and thereby providing search with that degree of granularity). - -</p> -<p> - -<b>Sequential. Objects have sequence:</b> That objects have sequence, goes -largely without saying, this follows authorship, it is part of the definition of -a document and how a document is written to convey meaning. - -</p> -<p> - -<b>Object Numbers & Citation. Substantive objects are numbered for citation -purposes:</b> Most objects within a document are meant by the author to be a -substantive part of the document. All such objects are numbered sequentially and -can be referenced thereby for citation purposes. -Object numbers provide the possibility of citing/locating text precisely across -different document formats and different languages (assuming the document has -been translated). For search it also makes it possible to identify precisely -where search criteria is met within in each document in the form of an index or -to view those precise text objects before deciding which documents are of -interest. Additionally the use of objects (and that objects are numbered) frees -the possibility to represent the document in the manner considered most suitable -to a specific document format wilst retaining its structural (and citation) -integrity). - -</p> -<p> -<b>Characteristics. Objects have properties and attributes:</b> Objects have -properties (and may have attributes). 
By properties I here refer to the -fundamental type of object, be it a heading, a paragraph, table, verse etc. -Attributes extend further and may include other things that one might wish to -associate with the object (examples not necessarily currently available/ -implemented in SiSU might include, formatting whether it is indented, or -metadata e.g. the associated language, or programming language for a code block) - -</p> -<p> - -<b>Document structure. Heading objects hold documents structure:</b> Heading -objects hold documents structure through their heading level property. The types -of document of interest to SiSU have structure that is captured by the heading -level property. Headings are individual objects like any other with the -additional properties that (i) they may be regarded as containing the other -objects following them sequentially (until the next heading of a similar or -higher level), heading objects may include other headings (sub-headings), and -(ii) that they have a heirarchy, the root "heading" being the document -title.<br>A complication was intruduced to provide greater flexibility across -document output formats. Headings have two sets of levels, the level under which -substantive text occurs, this would be a chapter or segment level, and above -that in the heirarchy if needed are document section separators, book, section, -part. - -</p> -<p> - -<b>Non-objects</b> Most but not all parts of a document are treated as objects. -Notably footnotes are not objects in themselves, rather belonging to the object -from which they are referenced, and following their own numbering sequence. From -heading objects (linked) tables of content may be generated, and if additional -metadata is provided book type indexes can be generated that link back to the -objects to which they relate. - -</p> -<p> - -<b>The Document Header.</b> SiSU document have headers which contain document -metadata, at a minimum the document title and author. 
In addition the document -header may contain markup instruction (e.g. how to identify headings within the -document, in which case those headings need not be found and treated -accordingly) - -</p> -<p> - -SiSU parsers have now been implemented in different programming paradigms and -languages a couple of times, the chosen markup has been left unchanged though -the document headers have been modified. - -This is the core of sisu, beyond which there is more but largely in the form of -choices based on ... existing output formats and of implementation detail, -deciding what attributes of objects, or within objects should be supported, -extending markup to allow for the generation of book indexes from if tagging -provided. - -</p> - -<h2>ℹ - SiSU Historical Descriptions</h2> - -<p> -Here is a description that has been used for the original sisu (scribe): -</p> - -<p> -With minimal preparation of a plain-text (UTF-8) file, using sisu markup syntax -in your text editor of choice, SiSU can generate various document formats, most -of which share a common object numbering system for locating content, including -plain text, HTML, XHTML, XML, EPUB, OpenDocument text (ODF:ODT), LaTeX, PDF -files, and populate an SQL database with objects (roughly paragraph-sized -chunks) so searches may be performed and matches returned with that degree of -granularity. Think of being able to finely match text in documents, using common -object numbers, across different output formats (same object identifier for pdf, -epub or html) and across languages if you have translations of the same document -(same object identifier across languages). For search, your criteria is met by -these documents at these locations within each document (equally relevant across -different output formats and languages). To be clear (if obvious) page numbers -provide none of this functionality. 
Object numbering is particularly suitable -for "published" works (finalized texts as opposed to works that are frequently -changed or updated) for which it provides a fixed means of reference of content. -Document outputs can also share provided semantic meta-data. -</p> - -<h3>...</h3> - -<p> -SiSU is less about document layout than it is about finding a way using little -markup to construct an abstract representation of a document that makes it -possible to produce multiple representations of it which may be rather different -from each other and used for different purposes, whether layout and publishing, -scrollworthy online viewing/ reading, or content search. To be able to take -advantage from its minimal preparation starting point of some of the strengths -of rather different established ways of representing documents for different -purposes, whether for search (relational database, or indexed flat files -generated for that purpose whether of complete documents, or say of files made -up of objects), online or other electronic viewing (e.g. html, xml, epub), or -paper publication (e.g. pdf via latex)... -</p> - -<p> -The solution arrived at is to extract structural information about the document -(document sections and headings within the document, available through pattern -matching or markup) and tracking objects (which primarily are defined units of -text such as paragraphs, headings, tables, verse, etc. but also images) which -can be reconstituted as the same documents with relevant object identification -numbers so text (objects) can be referenced across different output formats and -presentations. -</p> - -<p> -SiSU generates tables of content, and through its markup the means for metadata -to be provided for the generation of book style indexes for a document (that -again due to document object numbers are the same and equally relevant across -all document formats). 
Per document classifying/organizing metadata can also be -provided for automated document curation. -</p> - -<p> -... there have also been working experiments with sisu markup source, two way -conversion/representation of sisu document markup source in mind-mapping -(software kdissert was used for its strong focus on producing documents (now -apparently called semantik)); also po4a software for translators has been used -successfuly in its regular text mode for sisu markup in translation, (which is -more an attribute of po4a than of sisu, but) which is of interest due to -sisu/spine's object citation numbering being available across translations. Open -Document Format text (odf:odt), has been an output, but much more interesting -(and requested by potential users of sisu/spine) would be the ability of a word -processor to save text/a document in sisu markup, making alternative document -processing and presentations with sisu possible. -</p> - -<p> -also worth mention, in the relatively long history of this project, there has -been work done on extracting hash representations of each object, that could -hypothetically be shared to prove the content of a document without sharing its -content, or of identifying which objects change; these hashes can also be used -as unique identifiers in a database or as identifying filenames if individual -objects are saved. -</p> - -<p> -SiSU has evolved, the current implementation focuses on one primary use-case, -books and literary writings. However the concept on which it is based has wider -application. Here is a prevously posted souvenir from my encounter with an IBM -software evaluator in London June 2004 that came about through a chance -encounter with an IBM manager at a Linux Expo, who was curious about my interest -in Gnu/Linux with my legal background... on hearing that I also wrote software, -he suggested, maybe IBM should have a look at it. I was interested, the meeting -was set up... 
with an IBM, Software Innovations evaluator<br>His response after -the meeting: -</p> - -<p> -"Ralph<br>Good to meet with you today, I was very impressed with your -software.<br><i>[colleague's name (also posted to an IBM colleague)]</i> - in -summary - Ralph has built an application that runs on linux and takes ASCII -documents and pulls them apart in to the smallest constituent parts, storing -them as XML, PDF and HTML, the HTML are hyperlinked up so the document can be -browsed in its full form. the format and text data created is stored in a -database.<br>This has potential in any place that needs the power of full text -search whilst holding the structural concepts of the document i.e. legal, -pharma, education, research.. which ones we need to figure out, ..." -</p> - -<p> -Special interest was expressed in the search implications of SiSU. To -paraphrase, the company has document management systems dealing with hundreds of -thousands of texts, these tell you which documents match your search criteria, -but cannot inform you where within a text these matches were found without -opening the documents. This is achieved through defining document objects and -making them the building block of the document, trackable document objects (that -can be placed back in the context of the document or corpus of documents if part -of a collection). SiSU's early design was to - abstract documents to their -structure, and identified objects, numbered in a citable way (as pointed out -document object hashes can be of use for the purpose). -</p> - -<h2>ℹ - SiSU Spine</h2> - -<p> -SiSU Spine is the new generator for documents prepared in sisu markup, written -in D as opposed to the original sisu which was first shared in Ruby. -</p> - -<p> -Spine code has not as yet been made publicly available. 
-</p> - -<p> -As compared with the original sisu generator sisu spine: -</p> - -<p> -- Spine uses the same document markup for the document body, but uses yaml for -document headers (which contain document metadata and configuration details), -the original sisu has a bespoke markup for headers. -</p> - -<p> -- Spine (written in D) is considerably faster at generating native output than -sisu (written in Ruby), on last test at least 60 times faster (what took 1 -minute takes 1 second; 1 hour a minute :-) (admittedly some time ago, ruby has -been getting faster, hopefully this is not over-promising). -</p> - -<p> -- Spine produces fewer document output types than sisu (html, epub, (odt, -latex) and populates sql db for search) -</p> - -<p> -- As regards non-native output, so far Spine has greater separation of what it -does and largely leaves calling the external program to the user, e.g.: latex -output is a native output in the sense that it is generated directly by spine, -but the pdfs that can be produced from these are produced through use of an -external program xelatex, which produces fine output but is a very much slower -process. -</p> - -<p> -- (where both produce the same output type) Spine generally produces -more up to date output format representations. 
-</p> - -<hr> -<p class="tiny"><i> -ralph.amissah www since 1993 ;-) -</i></p> - -<hr> -<h2>Some external links of interest</h2> - -<h3>Development</h3> -<h4>Programming</h4> -<p> -    [ <a href="https://dlang.org/"> -        D - (dlang) general purpose, multi-paradigm, fast C like programming language -    </a> ] -    [ <a href="https://code.dlang.org/"> -        dub - package registry -    </a> ] -    [ <a href="https://forum.dlang.org/group/general"> -        community discussion (mail list frontend) -    </a> ]<br> -</p> -<p> -    [ <a href="https://www.ruby-lang.org/en/"> -        Ruby -    </a> ] -    [ <a href="https://rubygems.org/"> -        Gems -    </a> ]<br> -    [ <a href="https://crystal-lang.org/"> -        Crystal -    </a> ]<br> -</p> -<h4>SQL DB</h4> -<p> -    [ <a href="https://sqlite.org/index.html"> -        Sqlite - an sql database engine -    </a> ]<br> -    [ <a href="https://www.postgresql.org/"> -        PostgreSQL -    </a> ]<br> -</p> -<h4>Markup</h4> -<p> -    [ <a href="https://www.w3.org/html/"> -        HTML -    </a> ] -    [ <a href="https://html.spec.whatwg.org/multipage/"> -        multipage current spec -    </a> ] -    [ <a href="https://dom.spec.whatwg.org/"> -        dom current spec -    </a> ]<br> -    [ <a href="https://www.w3.org/publishing/epub32/"> -        Epub -    </a> ]<br> -    [ <a href="https://www.w3.org/Style/CSS/"> -        css - cascading style sheets -    </a> ]<br> -</p> -<p> -    [ <a href="https://opendocumentformat.org/"> -        OpenDocument Format -    </a> ]<br> -</p> -<p> -    [ <a href="https://www.latex-project.org/get/"> -        LaTeX -    </a> ]<br> -</p> -<p> -    [ <a href="https://po4a.org/index.php.en"> -        po4a - maintain translations -    </a> ]<br> -</p> -<h4>Operating System Distributions</h4> -<p> -    [ <a href="https://nixos.org/"> -        NixOS - linux based operating system built on the Nix declarative, reproducible and reliable, build system -    </a> ] -    [ <a 
href="https://github.com/NixOS/nixpkgs"> -        nixpkgs (packages @ github) -    </a> ] -    [ <a href="https://search.nixos.org/packages?channel=unstable&from=0&size=100&sort=relevance&query="> -        package search -    </a> ] -    [ <a href="https://discourse.nixos.org/"> -        community discussion (discourse) -    </a> ]<br> -    Gnu [ <a href="https://guix.gnu.org/"> -        Guix -    </a> ] -    [ <a href="https://guix.gnu.org/en/packages/"> -       packages -    </a> ] -    <br> -</p> -<p> -    [ <a href="https://debian.org/"> -        Debian - the universal operating system distribution -    </a> ]<br> -    [ <a href="https://www.devuan.org/"> -        Devuan -    </a> ]<br> -</p> -<p> -    [ <a href="https://archlinux.org/"> -        Arch Linux -    </a> ] -    [ <a href="https://wiki.archlinux.org/"> -        Arch Wiki -    </a> ]<br> -</p> - -<hr> - -<h2>Extraneous (external) links of personal interest</h2> - -<h4>Workspace</h4> - -<h5>Shell</h5> -<p> -    [ <a href="https://www.zsh.org/"> -        zsh -    </a> ]<br> -    [ <a href="https://starship.rs/"> -        starship - customizable cross-shell prompt -    </a> ]<br> -</p> -<h5>Terminal</h5> -<p> -    [ <a href="https://gnunn1.github.io/tilix-web/"> -        tilix -    </a> ] -    [ <a href="https://alacritty.org/"> -        alacritty -    </a> ]<br> -</p> -<h5>Terminal Multiplexer</h5> -<p> -    [ <a href="https://github.com/tmux/tmux"> -        tmux (github) -    </a> ] -    [ <a href="https://www.gnu.org/software/screen/"> -        screen -    </a> ]<br> -</p> -<h5>Window Manager</h5> -<p> -    [ <a href="https://i3wm.org/"> -        i3wm -    </a> ] -    [ <a href="https://swaywm.org/"> -        sway -    </a> ]<br> -</p> -<h5>Text Editors</h5> -<p> -    Gnu Emacs -    [ <a href="https://github.com/hlissner/doom-emacs"> -        Doom Emacs (github) -    </a> ] -    [ <a href="https://orgmode.org/"> -        Org-Mode - your life in plain text & literate programming -    </a> ] -    [ <a 
href="https://github.com/emacs-evil/evil"> -        Evil-Mode -    </a> ]<br> -</p> -<p> -    [ <a href="https://www.vim.org/"> -        Vim -    </a> ] -    [ <a href="https://neovim.io/"> -        NeoVim -    </a> ]<br> -</p> -<h5>Source Control Manager</h5> -<p> -    [ <a href="https://git-scm.com/"> -        Git -    </a> ]<br> -</p> -<h5>Browsers</h5> -<p> -    [ <a href="https://vieb.dev/"> -        vieb -    </a> ] -    [ <a href="https://fanglingsu.github.io/vimb/"> -        vimb -    </a> ]<br> -    [ <a href="https://brave.com/"> -        brave -    </a> ]<br> -</p> - -<h3>Search</h3> -<p> -    [ <a href="https://duckduckgo.com/"> -        DuckDuckGo -    </a> ] -    [ <a href="https://yubnub.org/"> -        YubNub -    </a> ]<br> -</p> - -<h3>eMail</h3> -<p> -    [ <a href="https://www.migadu.com/"> -       Migadu -    </a> ]<br> -</p> -<p> -    [ <a href="https://notmuchmail.org/"> -       NotmuchMail -    </a> ]<br> -</p> - -<h3>Forges</h3> -<p> -    [ <a href="https://sourcehut.org/"> -        Sourcehut -    </a> ]<br> -</p> -<p> -    [ <a href="https://codeberg.org/"> -        CodeBerg -    </a> ]<br> -</p> -<p> -    [ <a href="https://github.com"> -        GitHub -    </a> ] -    [ <a href="https://gitlab.com"> -        GitLab -    </a> ]<br> -</p> - -<h3>Software Archives</h3> -<p> -    [ <a href="https://www.softwareheritage.org/"> -        Software Heritage - the universal software archive -    </a> ]<br> -</p> - -<hr> -<p class="tiny"><i> -ralph.amissah www since 1993 ;-) -</i></p> - -</body> -</html>  | 
