Ralph Amissah
20050104 SiSU is released
--------------------------

Announce
--------

Excuse the lengthy announcement, hubris and repetition.

A fairly big day for me: I have worked on SiSU for several years, though only recently with its imminent release in mind...

The focus of SiSU is simple and sparse markup requirements (used for single documents or large document sets) to produce structured, multiformat published text versions, with a common/shared citation system, and search possibilities that take advantage of this.

Little time has been spent on the installation procedure. I would appreciate feedback from anyone who installs and tests SiSU on Linux and BSD (and OSX?) platforms. I anticipate there will be problems initially related to installation and setup, which I would be grateful for feedback on, and which I will be pleased to help with.

Once past the install I would very much appreciate feedback generally, and especially from Rubyists (as the text it is designed to work with is not code or code documentation, interest will not be developer specific, and may be limited), Librarians, Document Projects, and academic writers on aspects of interest.

Additional syntax highlighters for SiSU markup would be extremely welcome; they don't need to be as complete as the vim highlighter. Emacs would obviously be nice. Of much interest would be the ruby editors, and also less geeky text editors, as it is hoped that SiSU will eventually be used by non-coders.

I expect some criticism for hubris, for some OT opinions expressed here (and elsewhere), and possibly for coding style, which has evolved over the years and which may not always have been consistently updated (also because of the lack of use of spaces; put that down to using an editor with excellent syntax highlighting, and to what I have become accustomed to as a lone coder).

This release will primarily be of interest to developers, as the install/setup is hardly documented (it assumes you have independently installed the external programs that are taken advantage of, such as Postgresql, have file permissions set, and more), and it is not tested across platforms. But if you are able to get it working it does do quite a bit. Paradoxically, though it is for documents, it is not for programming documentation, and this will reduce its value to the same developers who might currently be able to use it.

I ask much, there is no rush. (This is sadly going to be a fairly busy month for me, so my response time is going to have to be slow.)

I have enjoyed working on SiSU very much over a number of years, and am pleased with what it does and how it does it. I hope it is of use to others.

Ready or not, here it is, as it (currently) is, enjoy,

Download:
---------

sisu_0.1.0-9_2005w01-2.tgz
http://www.jus.uio.no/sisu/download/..._2005w01-2.tgz

SHA1(sisu_0.1.0-9_2005w01-2.tgz)= 14b230ba5a4c8f1c7264b38cd2d9c95a97477f3a

Well Wishes all for 2005,

Ralph Amissah

What is SiSU?
-------------

(SiSU - simple, information structuring utility/universe)

SiSU is an electronic publishing system and a (hybrid) kind of document management system (for the documents that it generates), with its own unique set of features, including amongst many others: very simple markup; writing to the file system (for Internet, Intranet, or file serving, including e.g. CD publication) and/or to a relational database; multiple output formats (html, structured XML, LaTeX and pdf, postgresql); and a citation system that is common to all output types.

SiSU is a (command line) text processing program that produces structured electronic documents from a simple marked up input file (using a markup syntax similar to smart ascii that I claim to be simpler than the most elementary html) in multiple output formats, from html and structured XML, to pdf via LaTeX, to streaming into relational databases (currently Postgresql). It writes in a structured way to the file system or to a relational database, where it retains information on the document's structure.

SiSU may be used either for individual documents or for collections/libraries of published (as in finished and not subject to continuous change) documents. The types of document it handles are primarily law (which can be quite diverse) and literature, and some social sciences (as opposed to maths, science, programming etc.). There are several samples available.

Documents are marked up in "SiSU Syntax" in your favourite editor, and SiSU, a command line driven batch processor, is run against the marked up document(s) to produce the desired output(s).

SiSU (once installed and set up) should be easy enough for anyone to use (with a bit of additional documentation). The markup syntax is simple, and the commands are easy enough with interactive help. It would benefit greatly from additional syntax highlighters. (There are sample input documents from which various outputs can be generated.)

As a proof of concept the SiSU framework is in place, and many of the modules have been used professionally for several years. There are many more modules than the ones so far released; these have been held back either because they have not been properly maintained, having fallen into disuse, or because they are not generic enough in their current implementation.

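The idea of one marked up source feeding several outputs that share a citation system can be illustrated with a small Ruby sketch. This is not SiSU code and it does not use SiSU markup: the input format (paragraphs separated by blank lines), the method names and the `o<number>` html ids are all assumptions made up for the illustration. It only shows the principle that text objects are numbered once, and that every output, and any search result, then refers to the same numbers.

```ruby
# Hypothetical stand-in for a marked up source: paragraphs separated by
# blank lines. This is NOT SiSU markup, just enough input to show numbering.
SAMPLE = <<~TEXT
  Scope of the convention

  This convention applies to contracts of sale of goods.

  It does not apply to goods bought for personal use.
TEXT

# Minimal html escaping, so the sketch needs no external library.
def escape_html(text)
  text.gsub(/[&<>"]/, '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;')
end

# Number each text object once, in document order, starting at 1.
def number_objects(source)
  source.split(/\n\s*\n/).map(&:strip).reject(&:empty?)
        .each_with_index.map { |text, i| { ocn: i + 1, text: text } }
end

# Two illustrative outputs that reuse the same object numbers.
def to_html(objects)
  objects.map { |o| %(<p id="o#{o[:ocn]}">#{escape_html(o[:text])}</p>) }.join("\n")
end

def to_plain(objects)
  objects.map { |o| "[#{o[:ocn]}] #{o[:text]}" }.join("\n\n")
end

# A search returns object numbers, which point to the same place in
# every output generated from the same source.
def find(objects, word)
  objects.select { |o| o[:text].downcase.include?(word.downcase) }
         .map { |o| o[:ocn] }
end

objects = number_objects(SAMPLE)
puts to_html(objects)
puts
puts to_plain(objects)
puts
puts "objects matching 'goods': #{find(objects, 'goods').inspect}"
```

Because the numbers are assigned before any output is produced, a match reported as object 2 can be looked up under the same number in the html and in the plain text version.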
Information on SiSU is available at:
http://www.jus.uio.no/sisu/SiSU/

Sample texts (and remember SiSU is not specifically for books):
http://corundum/sisu/sisu/2#h2.1.3

Possibly of greater interest, to illustrate how different the possibilities this provides are, is search:
http://corundum/sisu/sisu/1#h1.14.6

And the markup from which this is derived:
http://corundum/sisu/sisu/2#h2.2

SiSU provides
-------------

[This is part of a fairly recent attempt to explain certain aspects of the project to a layman.]

SiSU provides a way, with minimal markup effort, to have multiple output formats, taking advantage of some of their strengths, viz. html, structured XML, pdf via LaTeX, and relational SQL databases, all of which are tied together using a common citation system.

* simple markup (done once, makes automatically available the rest),[1]

* possibility of adding semantic data to documents (currently the Dublin Core, though it would be easy to incorporate other, or alternative systems)

* multiple outputs - using industry standards, and taking advantage of the rather different strong points of each (html, structured XML, pdf via LaTeX, relational SQL database - currently Postgresql, retaining structural information)

* a common citation system for all document outputs, including the relational database, searches being able to take advantage of the implications of the citation system (primarily the automatic, consistent numbering of headings and paragraphs, in such a way that the numbers can be used by, and to reference content in, all output types).

The features of SiSU are listed here:
http://www.jus.uio.no/sisu/SiSU/1#h1.2
and I will tag that list on to the end of this document.

The document contains sample input and output files (in several places, but also here):
http://www.jus.uio.no/sisu/SiSU/2#h2.2

The last thing to be done was a search front-end for the database, which I finally decided to buckle down to doing.
The back-end has been in place for a number of years now, but the front-end makes this feature a lot easier to demonstrate. Unfortunately I do not have that online; a link to images of it in its current form:
http://corundum/sisu/sisu/1#h1.14.6
which relates to what IBM, for example, found to be of particular interest early in the summer of 2004:
http://www.jus.uio.no/sisu/SiSU/2004#795
[location may change as this document is updated]

Some of those links will change with subsequent modifications to the text; SiSU is best used for published works.

There is much to browse generally, some of it is just fan material of other things technical that I have found useful.

The document
http://www.jus.uio.no/sisu/SiSU/
http://www.jus.uio.no/sisu/SiSU/portrait
http://www.jus.uio.no/sisu/SiSU/landscape

[1] e.g. marking up War and Peace (from a Project Gutenberg ascii text) is done in a little over an hour. The point is the reduction in the effort required for the preparation of texts: XML, for example, the buzzword of the industry, is labour intensive and complicated, and LaTeX is also a lot more complicated than SiSU markup syntax. They are more flexible, but do not provide the composite solution: single command building of documents and/or populating of a relational database, while retaining structural information.

Platform
--------

Unix/Linux. [I have not glanced at other OS's for the purpose of development since 1999.] Developed and tested on Debian/Gnu/Linux Sid.

Short summary of features
--------------------------

from http://www.jus.uio.no/sisu/SiSU/1#h1.2

(i) minimal markup requirement,

(ii) single file marked up for multiple outputs,

(iii) markup is simpler than html,

(iv) the simple syntax is mnemonic, influenced by mail/messaging/wiki markup practices,

(v) human readable, and easily writable syntax,

(vi) multiple outputs include amongst others: "html"; "pdf" via "LaTeX"; (structured) "XML"; sql - currently "PostgreSQL" (and sqlite); "ascii", (also "texinfo"),

(vii) takes advantage of the strengths implicit in these very different output types, (e.g. LaTeX (professional document typesetting, easy conversion to pdf or Postscript); XML (in this case, structural representation); sql relational database (e.g. document search; representing constituent parts of documents based on their structure, headings, chapters, paragraphs as required; control of use) important enough to be given a heading of its own),

(viii) provides a common citation system for all outputs (object citation numbering): all text objects (headings and paragraphs) are numbered identically, for citation purposes, in all outputs ("html", "pdf", sql etc.),

(ix) use of Dublin Core and other meta-tags to permit the addition of some semantic information on documents, and making easy integration of rdf/rss feeds etc.,

(x) creates organised directory/file structure for (file-system) output,

(xi) easily mapped: with its clearly defined structure, and with all text objects numbered, you know in advance where in each document output type a bit of text will be found (e.g. from an sql search, you know where to go to find the prepared "html" output or "pdf" etc.)... there is more,

(xii) search of document sets: the relational database retains information on the document structure, and citation numbering makes it possible for example to present search matches as an index of documents and locations within the document where the match is found (an image series added December 12th 2004 in the Chronology pages, somewhere around http://www.jus.uio.no/sisu/SiSU/2004#781 gives an idea of what is possible; I unfortunately do not have the hardware currently set up to demonstrate this dynamically on the www),

(xiii) "word maps": a rudimentary index, consisting of all the words in a document and their (text object) locations within the text,

(xiv) very easily skinnable: document appearance on a project/site wide, directory wide, or document instance level is easily controlled/changed,

(xv) easy directory management and document associations: the document preparation (sub-)directory may be used to determine the output (sub-)directory, the skin used, and the sql database used,

(xvi) in many cases a regular expression may be used (once, in the document header) to define all or part of a document's structure, obviating or reducing the need to provide structural markup within the document,

(xvii) is a batch processor for handling large document sets, ... though once generated they need not be re-generated, unless changes are made to the desired presentation of a particular output type,

(xviii) possible to pre-process, which permits the easy creation of standard form documents, and templates/term-sheets,

(xix) easy to add, modify, or have alternative syntax rules for input, should you need to,

(xx) (future-proofing) extremely modular (thanks in no small part to Ruby): another output format required, write another module....,

(xxi) (future-proofing) easy to update output formats (e.g. the html, xhtml, latex/pdf produced can be updated in the program and run against the whole document set),

(xxii) scalability, dependent on your file-system (in my case Reiserfs), on the relational database chosen (currently Postgresql), and on your hardware,

(xxiii) a framework for adding further capability as required,

(xxiv) tied to a version control system: only the code and the marked up files need be backed up, to be sure of the much larger document set,

(xxv) document management,

(xxvi) use your favourite editor; syntax highlighting files for markup, primarily (g)vim so far.

SiSU was developed in relation to legal documents, and so is strong across a wide variety of texts (law, literature...), though weak on formulae/statistics; it does handle images. An assumption has been document sets that are to be preserved and maintained over time (also a result of the legal text origin). SiSU has been developed and used over a number of years, and the requirements to cover a wide range of documents have been thoroughly explored.

Standards
---------

Outputs are to standard protocols or open source software. I would like to keep SiSU markup and meta-markup a standard, although by the SiSU program design it is easy to modify.

I make claim to "object citation numbering" as a very simple idea with which I have persisted for many years, that makes much possible, and is a unifying feature of SiSU output.

Generated by SiSU
SiSU Sabaki 0.1.0-8 2004w51/4
www.jus.uio.no/sisu/SiSU/
Using: Standard SiSU markup syntax, Standard SiSU meta-markup syntax, and the Standard SiSU object citation numbering and system
© Ralph Amissah 1997, current 2005. All Rights Reserved.

Separating the markup syntax (human readable, and usually human prepared), and meta-markup syntax (machine written) has interesting possibilities.

(i) It is possible to change the markup syntax (or have several alternative input syntaxes) without disturbing the downstream program modules/libraries, provided you write to the same standard meta-markup syntax. (If you used the original syntax and then changed to an alternative syntax, you would presumably have alternative standard meta-markup generators, or convert the original syntax to the alternative syntax.)

(ii) It is also possible to change the meta-markup syntax, with consequences for all the downstream programs, but without in any way affecting your document set (your marked up documents).

Both of which have been very useful over the years of development, and use, of SiSU.

The object citation numbering system (ocn) is a simple idea, which being relevant to man and machine has far reaching possibilities. All output uses the same object citation numbering, including database searches, which can present matches with an index of documents and the (hyperlinked ocn) locations within each document where the match was found.

However, it is of interest to keep both the markup and the meta-markup syntax relatively stable, and indeed to have a Standard. I claim this standard (at least the original standard).

License
-------

(i) GPL 2 or later, for non-commercial use of the program and publications

(ii) a commercial license for everything else, that is for everything that is not (i) (terms to be determined)

Expanded upon a bit: GPL 2 or later, or under special license terms from Ralph Amissah, the details of which are to be determined. The idea being that it can be incorporated into proprietary systems, under a proprietary license, for a per seat fee. (SiSU was identified as being of interest as a middle-ware application by a large database and document management software provider...)

From this point on there will be a GPL and a proprietary branch. I expect, if there is any take-up, the GPL branch will advance faster and further (in my hands and generally) than the proprietary branch.

SiSU is the result of several years of research and development in electronic publishing, commenced in 1993 and under active development since 1997. There is always more to be done.

SiSU is released under GPL 2 or later
http://www.gnu.org/copyleft/gpl.html
(first on January 4th 2005) and is alternatively available under special license terms from Ralph Amissah, the detail of which is to be determined.

Setup/Installation/Use
----------------------

To start with see the README file provided with the program.

Historical note
---------------

SiSU is the result of a several year journey of research and development related to electronic publishing, in particular related to legal and academic writings. It started with the discovery of the Web and a project to publish legal documents on the Web in 1993. Programming started later, but ideas as to what would be useful to have and be able to do started to form from that beginning. I was lucky enough at the time to work with Geoffrey Armstrong and Tommy Johanson (who wrote the first lines of Perl I ever saw).

Programming SiSU, setting ends and attaining the ends set, has been a solo effort, from which I have learnt masses, and come to appreciate and depend on the work of others, not least Matz of Ruby fame. Within the Ruby community I have learnt lots from others, in particular Ruby book authors, both paper and electronic (I would guess Dave Thomas, Why (what's new in Ruby 1.8.0, and yes even bits of the Poignant Guide), and Hal Fulton in roughly that order; Slagell's book is decent, I would not have minded starting on Ruby with that), and those most vocal in the newsgroup and irc channel (too many to keep track of let alone mention; Eek and Batsman and, earlier in time, DBlack deserve special mention). I have not used the recommended route of studying the code of other projects (perhaps one day). The Ruby language is remarkable, as has been the Ruby community to date.

I have not studied other document/text processors as such either. My impression is that this must be much easier to use than, say, DocBook, but will offer a different range of features. (I probably should not mention it at all, I don't know.)

I have always planned to share this work (under a dual license, one of them being the GPL). A brief encounter with IBM in 2004 (Software Innovations evaluation) had me scrambling to the U.S. in June/July to arrange a provisional Patent application (and wondering, if that was the route I wished to pursue, why I had not done so seven or more years earlier) as the only way to meaningfully talk to them. The employee left, and interest has not persisted, fortunately.

As to where I stand on Software Patents: software patents in their current form appear to be primarily a tool to stifle innovation, not to promote it (indeed this is why what I have done is a lot more interesting to a large company if I hold a Patent than otherwise), and one that can only be financially afforded by large companies, both in their application and in their enforcement through litigation. Europe would do well not to have them.

If I were not pleased with Debian/Gnu/Linux (Sid), its packaging system (developers and range of applications) and social contract, I would almost certainly use one of the BSDs as my development platform - FreeBSD or Dragonfly.

What SiSU is not - SiSU is not
------------------------------

* blogging software (though I sometimes misuse it in this way)

* a wiki (well obviously, though it would be interesting to use this technology alongside a wiki - the wiki being used for constantly updated pages and navigation information, whilst SiSU is used for published works that are not changed frequently - e.g. a published academic writing, a book, a convention)

* for documentation on programming, or mathematical, scientific texts.

Todo
----

This is a fairly large project, much remains to be done.
Of particular interest, without any time scale or immediate urgency:

* Documentation. There is some, but the presentation is nowhere near as digestible as it should be.

* Documentation apart, the biggest single todo is Unicode processing. LaTeX and Postgresql support UTF-8, so that is what it is most likely to be. My excuse for not having looked at it yet ... no need to date, and not having configured my environment for it. I do however recognise this as a need.

* Getting the Sqlite module working again. Similar to the Postgresql module, it fell out of maintenance when I found Sqlite to be a bit of a pain to install on Debian (and was prioritising Postgresql). Once upon a time the modules were in sync, and I hope to have them that way again someday.

* Much code cleaning ... this project has developed over several years, and there have been many changes in how things are done, without rigorous removal of dead code.

* Simplify installation, and test across other Unix and Gnu/Linux platforms.

* Object citation numbering is currently done only for substantive text and other objects (such as images); a secondary numbering will eventually be implemented for non-substantive items.

* Decide what to do with images and tables in XML and in the relational database.

* Marshalled/PStored Metaverse. As an alternative (not replacement) to the current ordinary text based SiSU meta-markup state.

* Additional syntax highlighters. The current syntax highlighter, and folds, are for vim. Additional syntax highlighters for SiSU markup would be extremely welcome; they don't need to be as complete as the vim highlighter. Emacs would obviously be nice, but the ruby editors, and less geeky editors, are of much interest. Not sure that I will do this, after all I do use Vim, we'll see.

* My vim configuration files are a total mess, but are provided as is. Help/suggestions welcome.
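For the Marshalled/PStored item in the list above, the following Ruby sketch shows roughly what keeping an intermediate document state in Marshal form could look like. It is an illustration only: the hash layout, the method names and the file name are hypothetical, and nothing here reflects SiSU's actual meta-markup, which is a text format. Ruby's PStore library is built on the same Marshal format and adds transactions on top of it.

```ruby
require 'tmpdir'

# Hypothetical intermediate state: numbered text objects plus a little
# metadata. The real SiSU meta-markup is a text format and is not shown here.
STATE = {
  title: 'Sample document',
  objects: [
    { ocn: 1, kind: :heading, text: 'Scope' },
    { ocn: 2, kind: :para, text: 'This convention applies to contracts of sale.' }
  ]
}.freeze

# Write the structure to disk in Ruby's binary Marshal format.
def save_state(path, state)
  File.binwrite(path, Marshal.dump(state))
end

# Read it back. Only load Marshal data that you wrote yourself.
def load_state(path)
  Marshal.load(File.binread(path))
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'sample.marshal')
  save_state(path, STATE)
  restored = load_state(path)
  puts restored[:objects].map { |o| "#{o[:ocn]} #{o[:kind]}: #{o[:text]}" }
end
```

A marshalled state saves re-parsing the text form for each output module, at the cost of a file that is no longer human readable, which is presumably why it is listed as an alternative and not a replacement.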
Ralph Amissah
http://www.jus.uio.no/sisu/SiSU/download
SiSU Sabaki, version 0.1.1-0 of 2005w01/4 (20050106):
http://www.jus.uio.no/sisu/download/..._2005w01-4.tgz

SHA1(sisu_0.1.1-0_2005w01-4.tgz)= ff5ca4cf45d34ca3a6491e8ba29da30474b47b2b

Modifications to installation script, configuration paths, and help only. This release should be easier to install and to figure out what needs to be done should problems be encountered with installation.

* install script a bit smarter, - also installs the configuration files that come along with the sample marked up documents, which should make things a bit easier

* a lot more information provided on paths, both by the install script and interactive help once installed, to assist in figuring out what needs to be done should a problem arise with installation or configuration

* help updated

On Tue, 4 Jan 2005 20:33:22 +0000 (UTC), Ralph Amissah <> wrote:
> 20050104 SiSU is released
> --------------------------
>
> Announce
> --------
>
> Excuse the lengthy announcement, hubris and repetition.
> .....
>
> Help/suggestions welcome.

On SiSU Generally:
http://www.jus.uio.no/sisu/SiSU/
http://www.jus.uio.no/sisu/SiSU/1#h1.2
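The SHA1 value published with the tarball can be checked before unpacking. Below is a minimal Ruby sketch using the standard library's Digest::SHA1; the method name `sha1_matches?` and the script name used in the usage line are made up for this example, and the archive is assumed to have been downloaded into the current directory already.

```ruby
require 'digest/sha1'

# Return true when the file's SHA1 hex digest equals the published value.
def sha1_matches?(path, expected_hex)
  Digest::SHA1.file(path).hexdigest == expected_hex.strip.downcase
end

# When run as a script with two arguments: file path, then expected digest.
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  path, expected = ARGV
  ok = sha1_matches?(path, expected)
  puts(ok ? "OK: #{path}" : "MISMATCH: #{path}")
  exit(ok ? 0 : 1)
end
```

Saved as, say, check_sha1.rb, it could be run as:
ruby check_sha1.rb sisu_0.1.1-0_2005w01-4.tgz ff5ca4cf45d34ca3a6491e8ba29da30474b47b2b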




