summaryrefslogtreecommitdiff
path: root/README
diff options
context:
space:
mode:
Diffstat (limited to 'README')
-rw-r--r--README182
1 files changed, 181 insertions, 1 deletions
diff --git a/README b/README
index f73834f..bc1507b 100644
--- a/README
+++ b/README
@@ -1,3 +1,183 @@
+======================================================================
+ Rust, Emacs, and Parsers
+======================================================================
+
+About
+-----
+
This is a parser generator to be used as an Emacs dynamic package.
+First I shall explain how to compile this package, and then I shall
+explain briefly the background information and how to use this
+package.
+
+COMPILE
+-------
+
+Download compiled library
+=========================
+
+// TODO: Make the website.
+
+One can download a compiled dynamic library from the website:
+<https://jsdurand.xyz/rep.html>.
+
+Manual compilation
+==================
+
+First clone this package:
+
+ git clone https://git.jsdurand.xyz/rep.git
+
+Then go into the cloned directory:
+
+ cd rep
+
+[ Make sure the Rust tools are already installed. If not, then
+ install the Rust compiler tools from the instructions on the website
+ <https://www.rust-lang.org/tools/install>. ]
+
+Then run the autotools:
+
+ autoreconf -vif
+
+// TODO: Make the two build options and fix make.
+
+Now there are two build options: debug and release. The debug build
+is for debugging the package, so turns off optimisations and includes
+debug information. This is not recommended for normal use. The
+release build is the default, and is more efficient than the debug
+build.
+
+If one wants to use the debug build, run:
+
+ ./configure --debug
+
+Otherwise just run:
+
+ ./configure
+
+Finally run 'make' to build the package:
+
+ make
+
+Now there should be a file 'rep.so' located in the 'src' directory.
+
+Build on Windows
+================
+
+Since autotools do not work well for windows, from my experiences,
+users of windows need another convenient way to compile the package.
+So I wrote a script to automate this process: just run
+
+// TODO: Figure out how to do this.
+
+Test
+----
+
+The package contains an Emacs Lisp file 'src/test.el' that
+demonstrates a simple scenario for testing. Simply open the file in
+an Emacs instance and run the Emacs Lisp forms one by one to see if it
+works. :D
+
+======================================================================
+ About parser generators
+======================================================================
+
+Grammars
+--------
+
+Before explaining what parsers are, we need first to explain what
+grammars are.
+
+In short, a grammar can be viewed as a "substitution machine". That
+is, a grammar consists of a set of variables, constants, and rules
+
+ X => Y ... Z
+
+where X is a variable and Y ... Z are either variables or constants.
+The grammar then substitutes every occurence of X by the sequence at
+the right-hand side Y ... Z. And there is a "starting variable",
+usually denoted as "S".
+
+To sum up, we start from the sequence "S", and perform substitutions
+according to the rules. If the substituted sequence cannot be further
+substituted, i.e. if the sequence consists of constants only, we say
+that this constant-only sequence is "generated" by the grammar.
+
+One can think of this substitution machine as macros in TeX, except
+that this only allow macros without arguments. Or one can think of
+the machine as macros in programming languages, again without
+arguments.
+
+Notice that a variable can have multiple rules to substitute, and a
+given sequence can be generated by different sets of substitutions.
+
+Parsers
+-------
+
+Parsers are "inverses" of grammars. More precisely, while grammars
+perform substitutions to generate sequences, parsers try to find out
+what substitutions the grammar performed in order to generate a given
+sequence. Since there might be multiple ways to generate a given
+sequence, the parser needs to find all possible ways.
+
+In fact, the number of ways to generate a sequence can even be
+infinite. For example, if the grammar has the following two rules:
+
+ S => S
+ S => ""
+
+This grammar means that S can be subtituted by S itself, and it can be
+substituted by the empty sequence as well. In this case, the grammar
+can only generate the empty sequence, but it can generate the empty
+sequence by infinitely many ways: it can first substitute S by itself
+any number of times, until it substitutes S by the empty sequence.
+
+Moreover, even if we do not consider the generations that produce
+empty sequences, the number of ways a grammar can generate a given
+sequence is in general O(n^3), where "O" means the big-O notation, and
+n is the length of the sequence to be generated. This is why general
+parsers are hard to construct for every grammar.
+
+Parser Generators
+-----------------
+
+Now we come to the topic of this package: parser generators. Simply
+speaking, a parser generator takes a grammar as input, and produces a
+parser for that grammar as output.
+
+One can classify parser generators into two types: comiled and
+interpreted. The former means that the generator produces source
+codes of a parser that needs to be compiled to produce a parser. The
+latter means that the generator simply produces a parser that needs to
+be run by an interpreter. The well-known "tree-sitter" parser
+generator belongs to the first category, for example.
+
+This package is interpreted. But there is a plan to also turn on a
+"compiled mode" to potentially make the generated parsers more
+performant.
+
+Augmented Backus-Naur Form
+--------------------------
+
+There are various ways to write grammars. This package chooses the
+augmented backus-naur form. See Wiki for an overview:
+<https://en.wikipedia.org/wiki/Augmented_Backus-Naur_Form>, and see
+RFC 5234 for its detailed definition:
+<https://datatracker.ietf.org/doc/html/rfc5234> or
+<https://www.rfc-editor.org/rfc/rfc5234.txt>.
+
+The reason for this choice is that the RFC documents use this format,
+so there are a lot of example grammars.
+
+There is also a plan to let the users choose their own favourite
+format, of course.
+
+Algorithm
+---------
+
+This package uses an algorithm that is not widely adapted, at least
+not to my knowledge. So I think I will briefly explain how it works.
+
-TODO: Provide more details. \ No newline at end of file
+TODO: Provide more details.