(Arne Babenhauserheide)
2014-12-23: html3.2 up to syntax overview (including)
diff --git a/docs/srfi-from-template.html b/docs/srfi-from-template.html --- a/docs/srfi-from-template.html +++ b/docs/srfi-from-template.html @@ -3,7 +3,7 @@ <head> <title>SRFI ?: wisp: simpler indentation-sensitive scheme</title> </head> - + <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <body> <H1>Title</H1> @@ -12,24 +12,260 @@ wisp: simpler indentation-sensitive sche <H1>Author</H1> -??? The author(s) +<ul> +<li>Arne Babenhauserheide +</li> +</ul> + +<h3>Acknowledgments</h3> +<ul> +<li>Thanks for many constructive discussions go to Alan Manuel K. Gloria and David A. Wheeler. +</li> +<li>Also thanks to Mark Weaver for his help with the wisp parser and the guile integration - including a 20x speedup. +</li> +</ul> <H1>Abstract</H1> -??? 200-500 word abstract +<p> +This SRFI describes a simple syntax which allows making scheme easier to read for newcomers while keeping the simplicity, generality and elegance of s-expressions. Similar to SRFI-110, SRFI-49 and Python, it uses indentation to group expressions. Like SRFI-110, wisp is general and homoiconic. +</p> + +<p> +Different from its predecessors, wisp only uses the absolute minimum of additional syntax-elements which are required for writing and exchanging arbitrary code-structures. As syntax elements it only uses a colon surrounded by whitespace, the period followed by whitespace as first code-character on the line and optional underscores followed by whitespace at the beginning of the line. +</p> + +<p> +It resolves a limitation of SRFI-110 and SRFI-49, both of which force the programmer to use a single argument per line if the arguments to a function need to be continued after a function-call. +</p> + +<p> +Wisp expressions can include any s-expressions and as such provide backwards compatibility. +</p> + +<table><tr><th>wisp</th><th>s-exp</th></tr><tr><td> +<pre><b>define</b> : <i>hello</i> who + <i>format</i> #t "~A ~A!\n" + . 
"Hello" who +<i>hello</i> "Wisp" +</pre> +</td><td> +<pre>(<b>define</b> (<i>hello</i> who) + (<i>format</i> #t "~A ~A!\n" + "Hello" who)) +(<i>hello</i> "S-exp") +</pre> +</td></tr></table> <H1>Issues</H1> -??? Optional section that may point out things to be resolved. This - will not appear in the final SRFI. +<ul><li>wisp-scheme: Does not recognize the <code>. #!curly-infix</code> request for curly-infix or other reader syntax.</li> +<li>wisp-scheme: REPL: sometimes the output of a command is only shown after typing the next non-empty line.</li></ul> <H1>Rationale</H1> -??? detailed rationale +<p>A big strength of Scheme and other lisp-like languages is their minimalistic syntax. By using only the most common characters like the period, the comma, the quote and quasiquote, the hash, the semicolon and the parens for the syntax (<code>.,"'`#;()</code>), they are very close to natural language.<a href="#common-letters" name="common-letters-reference">⁽¹⁾</a> Along with the minimal list-structure of the code, this gives these languages a timeless elegance.</p> + +<p>But as SRFI-110 explains very thoroughly (which we need not repeat here), the parentheses at the beginning of lines hurt readability and scare away newcomers. Additionally using indentation to mark the structure of the code follows naturally from the observation that most programmers use indentation, with many programmers letting their editor indent code automatically to fit the structure. 
Indentation is an important way in which programmers understand code, and using it directly to define the structure avoids errors due to mismatches between indentation and actual meaning.</p> + +<p>As a solution to this, SRFI-49 and SRFI-110 provide a way to write whitespace-sensitive scheme, but both have their share of issues.</p> + +<p>As noted in SRFI-110, there are a number of implementation-problems in SRFI-49, as well as specification shortcomings like choosing the name “group” for the construct which is necessary to represent double parentheses. In addition to the problems named in SRFI-110, SRFI-49 is not able to continue the arguments to a function on one line if a prior argument was a function call. The following example shows the difference between wisp and SRFI-49 for a very simple code snippet:</p> + +<table><tr><th>wisp</th><th>SRFI-49</th></tr><tr><td> +<pre> + <i>*</i> 5 + <i>+</i> 4 3 + . 2 1 +</pre> +</td><td> +<pre> + <i>*</i> 5 + <i>+</i> 4 3 + 2 + 1 +</pre> +</td></tr></table> + +<p>Here wisp uses the leading period to mark a line as continuing the argument list.<a href="#period-concept" name="period-concept-reference">⁽²⁾</a></p> + +<p>SRFI-110 improves a lot over the implementation of SRFI-49. It resolves the group-naming and reduces the need to continue the argument-list by introducing 3 different grouping syntaxes (<code>$</code>, <code>\\</code> and <code><* *></code>). These additional syntax-elements however hurt readability for newcomers (obviously the authors of SRFI-110 disagree with this assertion. Their view is discussed in SRFI-110 in the section about wisp). The additional syntax elements lead to structures like the following (taken from examples from the readable project):</p> + +<pre> +<i>myfunction</i> + x: \\ original-x + y: \\ <i>calculate-y</i> original-y +</pre> + +<pre> +<i>a</i> b $ <i>c</i> d e $ <i>f</i> g +</pre> + +<pre> +let <* <i>x</i> <i>getx</i>() \\ <i>y</i> <i>gety</i>() *> +! 
{{x * x} + {y * y}} +</pre> + +<p>This is not only hard to read, but also makes it harder to work with the code, because the programmer has to learn these additional syntax elements and keep them in mind before being able to understand the code.</p> + +<p>Like SRFI-49, SRFI-110 also cannot continue the argument-list without resorting to single-element lines, though it reduces this problem by the above grouping syntaxes and advertising the use of neoteric expressions from SRFI-105.</p> + +<h2>Wisp example</h2> + +Since an example speaks more than a hundred explanations, the following shows wisp exploiting all its features - including curly-infix from SRFI-105: + +<pre> +<b>define</b> : <i>factorial</i> n +__ <b>if</b> : <i>zero?</i> n +____ . 1 +____ <i>*</i> n : <i>factorial</i> {n - 1} + +<i>display</i> : <i>factorial</i> 5 +<i>newline</i> +</pre> + +<h2>Advantages of Wisp</h2> + +<p>Wisp draws on the strength of SRFI-110 but avoids its complexities. It was conceived and improved in the discussions within the readable-project which preceded SRFI-110 and there is a comparison between readable and wisp in SRFI-110.</p> + +<p>Like SRFI-110, wisp is general and homoiconic and interacts nicely with SRFI-105 (neoteric expressions and curly infix). Like SRFI-110, the expressions are the same in the REPL and in code-files. Like SRFI-110, wisp has been used for implementing multiple smaller programs, though the biggest program in wisp is still its implementations (written in wisp and bootstrapped via a simpler wisp preprocessor).</p> + +<p>But unlike SRFI-110, wisp only uses the minimum of additional syntax-elements which are necessary to support arbitrary code-structures with indentation-sensitive code which is intended to be shared over the internet. To realize these syntax-elements, it generalizes existing syntax and draws on the most common non-letter non-math characters in prose. 
This allows keeping the actual representation of the code elegant and inviting to newcomers.</p> + +<p>Wisp expressions are not as sweet as <a href="http://readable.sf.net">readable</a>, but they KISS.</p> + +<h2>Disadvantages of Wisp</h2> + +<p>Using the colon as syntax element keeps the code very close to written prose, but it can interfere with type definitions as for example used in Typed Racket.<a href="#typed-racket" name="typed-racket-reference">⁽³⁾</a> This can be mitigated in let- and lambda-forms by using the parenthesized form. When doing so, wisp avoids the double-paren for type-declarations and as such makes them easier to catch by eye. For function definitions (the only <code>define</code> call where type declarations are needed in typed-racket), a <code>declare</code> macro directly before the <code>define</code> should work well.</p> + +<p>Using the period to continue the argument list is unusual compared to other languages and as such can lead to errors when trying to return a variable from a procedure and forgetting the period.</p> + + +<h2>Footnotes</h2> + +<ul><li><a name="common-letters" href="#common-letters-reference">⁽¹⁾</a> The most common non-letter, non-math characters in prose are <code>.,":'_#?!;</code>, in the given order as derived from newspapers and other sources (for the ngram assembling scripts, see the <a href="http://bitbucket.org/ArneBab/evolve-keyboard-layout">evolve keyboard layout project</a>).</li> + <li><a name="period-concept" href="#period-concept-reference">⁽²⁾</a> Conceptually, continuing the argument list with a period uses syntax to mark the rare case of not calling a function as opposed to marking the common case of calling a function. To back the claim that calling a function is actually the common case in scheme-code, grepping the modules in the Guile source code shows over 27000 code-lines which start with a paren and only slightly above 10000 code-lines which start with a non-paren, non-comment character. 
Since wisp-syntax mostly follows the regular scheme indentation guidelines (as realized for example by emacs), the whitespace in front of lines does not need to change.</li> + <li><a name="typed-racket" href="#typed-racket-reference">⁽³⁾</a> Typed Racket uses calls of the form <code>(: x Number)</code> to declare types. These forms can still be used directly in parenthesized form, but in wisp-form the colon has to be replaced with <code>\:</code>. In most cases type-declarations are not needed in typed racket, since the type can be inferred. See <a href="http://docs.racket-lang.org/ts-guide/more.html?q=typed#%28part._when-annotations~3f%29">When do you need type annotations?</a></li> +</ul> + +<h2>Related SRFIs</h2> +<ul> +<li>SRFI-49 (Indentation-sensitive syntax): superseded by this SRFI, +</li> +<li>SRFI-110 (Sweet-expressions (t-expressions)): alternative to this SRFI, +</li> +<li>SRFI-105 (neoteric expressions and curly infix): supported in this SRFI by treating curly braces like brackets and parentheses. Curly infix is required by the implementation and the testsuite. +</li> +<li>SRFI-30 (Nested Multi-line comments): complex interaction. Should be avoided at the beginning of lines, because it can make the indentation hard to distinguish for humans. SRFI-110 includes them, so there might be value in adding them. The wisp reference implementation does not treat them specially, though, which might create arbitrary complications. +</li> +</ul> + <H1>Specification</H1> -??? 
detailed specification +<p>The specification is separated into four parts: A general overview of the syntax, a more detailed description, justifications for each added syntax element and clarifications for technical details.</p> + +<h2>Overview</h2> + +<p>The basics of wisp syntax can be defined in 4 rules, each of which emerges directly from a requirement:</p> + +<h3>Wisp syntax 1/4: function calls</h3> + +<p>Indentation:</p> + +<pre> +<i>display</i> + + 3 4 5 +<i>newline</i> +</pre> + +<p>becomes</p> + +<pre> +(<i>display</i> + (+ 3 4 5)) +(<i>newline</i>) +</pre> + +<p><i>requirement: call functions without parentheses.</i></p> + +<h3>Wisp syntax 2/4: Continue Argument list</h3> + +<p>The period:</p> + +<pre> +<i>+</i> 5 + <i>*</i> 4 3 + . 2 1 +</pre> + +<p>becomes</p> + +<pre> +(<i>+</i> 5 + (<i>*</i> 4 3) + 2 1) +</pre> + +<p>This also works with just one argument after the period. To start a line without a function call, you have to prefix it with a period followed by whitespace.</p> + +<p><i>requirement: continue the argument list of a function after an intermediate call to another function.</i></p> + +<h3>Wisp syntax 3/4: Double Parens</h3> + +<p>The colon:</p> + +<pre> +<b>let</b> + : x 1 + y 2 + z 3 + <i>body</i> +</pre> + +<p>becomes</p> + +<pre> +(<b>let</b> + ((x 1) + (y 2) + (z 3)) + (<i>body</i>)) +</pre> + +<p><i>requirement: represent code with two adjacent blocks in double-parentheses.</i></p> + +<h3>Wisp syntax 4/4: Resilient Indentation</h3> + +<p>The underscore (optional):</p> + +<pre> +<b>let</b> +_ : x 1 +__ y 2 +__ z 3 +_ <i>body</i> +</pre> + +<p>becomes</p> + +<pre> +(<b>let</b> + ((x 1) + (y 2) + (z 3)) + (<i>body</i>)) +</pre> + +<p><i>requirement: share code in environments which do not preserve whitespace.</i></p> + +<h3>Summary</h3> + +<p>The syntax shown here is the minimal syntax required for the goal of wisp: indentation-based, general lisp with a simple preprocessor, and code which can be shared easily on the internet:</p> + 
+<ul><li><code>.</code> to continue the argument list</li> + <li><code>:</code> for double parens</li> + <li><code>_</code> to survive HTML</li></ul> <H1>Implementation</H1>
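Note on rule 4/4 (underscores) from the overview in the diff above: the rule can be read as "a run of underscores at the start of a line, followed by whitespace, counts as that many columns of indentation". The following is only a minimal Python sketch of that reading, not the wisp reference implementation. The function name underscores_to_indent is hypothetical, and the choice to leave a line untouched when the underscores are not followed by whitespace (for example an identifier such as _foo) is an assumption taken from the abstract's wording "optional underscores followed by whitespace at the beginning of the line".

```python
import re

# Hypothetical helper, not part of wisp: turns leading underscores back
# into spaces so that ordinary indentation handling can take over.
_LEADING_UNDERSCORES = re.compile(r"^(_+)(?=\s)")


def underscores_to_indent(line: str) -> str:
    """Replace leading underscores with the same number of spaces.

    Assumption: the underscores only count as indentation when they are
    followed by whitespace; otherwise the line is returned unchanged.
    """
    match = _LEADING_UNDERSCORES.match(line)
    if match is None:
        return line
    width = len(match.group(1))
    return " " * width + line[width:]


if __name__ == "__main__":
    # The let example from rule 4/4, one line per list entry.
    for source_line in ["let", "_ : x 1", "__ y 2", "__ z 3", "_ body"]:
        print(repr(underscores_to_indent(source_line)))
```

Replacing each underscore with exactly one space keeps every column where it was, so the relative indentation that the other three rules depend on is unchanged.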