Added a disadvantages section and edited the advantages. This article needs to be balanced: scannerless parsing is a technique that makes sense in limited circumstances, usually when the language being parsed is very simple. Remember, there is a reason the lexer/parser distinction was made in the first place.
In particular:
12.165.250.13 (talk) 20:54, 28 May 2008 (UTC)
I've been observing this article for a while, and I've been dismayed at how poor the article still is. It contains a number of factual mistakes and does not really explain anything. I'm reluctant to improve the article myself though, because I'm one of the researchers publishing on the merits of scannerless parsing. I have a few problems with the article:
Martin Bravenboer (talk) 15:11, 30 May 2008 (UTC)
I also think this article contains many errors. For example, the advantage 'Grammars are compositional' is not a property of scannerless parsers as such: it is theoretically possible to have an LL(1) parser that is scannerless, and LL(1) is not closed under composition. The third page of this paper describes it: http://www.springerlink.com/content/xugat38tyrxvtm9w/. So compositional grammars seem to be a feature of generalized parsers rather than of scannerless ones. The two are related only because most scannerless parsers are generalized.
Mbvlist (talk) 10:37, 10 July 2009 (UTC)
I was quite gratified to discover this page, which broaches a topic I'd wondered about for ages but never saw discussed. However, I was chagrined to find that, while it has admirably many links to examples of the concept's use, it contains many unsupported general assertions yet cites at best one external discussion of the theory. (And Visser's report does not pretend to cover the whole domain, just one solution.)
I personally consider some of the unsupported assertions to be plausible. However even the bald assertions which I haven't singled out deserve to be backed up with cites.
For all I know, someone who well understands the topic might find many justifiable cites in Visser's own references. AHMartin (talk) 19:23, 28 December 2016 (UTC)
Removing the section because it seems quite nonsensical. Problems with the section and topic:
In the late 1960s and early 1970s, CWIC (Compiler for Writing and Implementing Compilers) was developed at System Development Corporation. The CWIC system included three compiled languages:
SYNTAX (Parser Programming) language, used to program a tree-transforming syntax recognizer for the language.
GENERATOR (Code Production) language, used to generate machine code.
MOL360 (Machine Oriented Language for IBM 360), a block-structured assembler.
It is the parser programming language that is of interest.
I developed SLIC (System of Languages for Implementing Compilers), based partly on CWIC.
I am not sure exactly what parser type they are, other than recursive descent. Does the ability to recognize a sequence with either a loop or recursion make a difference as to their type? It does make a difference to the tree that is produced.
expr = term $(('+':ADD|'-':SUB) term!2);
!<number> pops the top node object and the top <number> parse stack entries, combining them into a list that represents a tree with <number> branches.
a+b-c => [SUB,[ADD,a,b],c]
    SUB
   /   \
 ADD    c
 /  \
a    b
Looping builds the tree bottom-up, left to right, so the operators associate to the left.
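To make the mechanics concrete, here is a minimal Python sketch of the looping rule for expr. It is not CWIC or SLIC code: the TreeBuilder class, its method names, and the simplification of term to a single letter or digit are all assumptions made for illustration. It only models the two stacks described above, a node stack fed by ':NAME' and a parse stack combined by '!n'.

```python
class TreeBuilder:
    """Hypothetical model of the parse stack and node stack (names are assumptions)."""

    def __init__(self):
        self.parse_stack = []  # recognized tokens and finished subtrees
        self.node_stack = []   # pending node names pushed by ':NAME'

    def push(self, item):
        self.parse_stack.append(item)

    def node(self, name):
        # Models ':ADD' or ':SUB': remember the node name for a later '!n'.
        self.node_stack.append(name)

    def build(self, n):
        # Models '!n': pop one node name and the top n parse stack entries.
        name = self.node_stack.pop()
        branches = self.parse_stack[-n:]
        del self.parse_stack[-n:]
        self.parse_stack.append([name, *branches])


def parse_expr(text):
    """Sketch of: expr = term $(('+':ADD|'-':SUB) term!2);
    where term is simplified to a single letter or digit."""
    builder = TreeBuilder()
    pos = 0

    def term():
        nonlocal pos
        if pos < len(text) and text[pos].isalnum():
            builder.push(text[pos])
            pos += 1
            return True
        return False

    if not term():
        raise SyntaxError(f"term expected at position {pos}")
    # The '$' loop: zero or more (operator term) pairs.
    while pos < len(text) and text[pos] in "+-":
        builder.node("ADD" if text[pos] == "+" else "SUB")
        pos += 1
        if not term():
            raise SyntaxError(f"term expected at position {pos}")
        builder.build(2)
    if pos != len(text):
        raise SyntaxError(f"unexpected character at position {pos}")
    return builder.parse_stack.pop()


print(parse_expr("a+b-c"))  # ['SUB', ['ADD', 'a', 'b'], 'c']
```

Because each pass through the loop folds the tree built so far into the left branch of the new node, the result leans to the left.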
term = factor $(('*':MPY|'/':DIV)factor!2);
factor = (number | id | '(' expr ')') ('^':POW factor!2|--);
A^B^C =>
  POW
 /   \
A    POW
    /   \
   B     C
[POW,A,[POW,B,C]]
We control tree construction using tail recursion or looping.
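For comparison, a small Python sketch of the tail-recursive form used for exponentiation. The function name parse_factor and the simplification of the operand to a single letter or digit are assumptions for illustration; this is not how CWIC itself is implemented.

```python
def parse_factor(text, pos=0):
    """Sketch of: factor = operand ('^':POW factor!2 | --);
    with operand simplified to a single letter or digit.
    Returns (tree, next_position)."""
    if pos >= len(text) or not text[pos].isalnum():
        raise SyntaxError(f"operand expected at position {pos}")
    left = text[pos]
    pos += 1
    if pos < len(text) and text[pos] == "^":
        # Tail recursion: the whole right-hand side is parsed first,
        # so the new node is built only after the inner one exists.
        right, pos = parse_factor(text, pos + 1)
        return ["POW", left, right], pos
    return left, pos


tree, _ = parse_factor("A^B^C")
print(tree)  # ['POW', 'A', ['POW', 'B', 'C']]
```

The recursive call sits in the right branch, so the tree leans to the right, which is the grouping wanted for exponentiation.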
Or we could just gather parsed entries into a list:
arguments = +[arg $(',' arg) | .EMPTY]+;
I believe these are far easier to use than a parser generator.
How about making a list of terms:
expr = +[term $('+' term|'-':SUB term!1)]+:SUM!1;
a+b-c+5 => SUM[a,b,SUB[c],5]
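A rough Python sketch of the list-gathering idea, producing the same shape as SUM[a,b,SUB[c],5]. The function name parse_sum, the nested-list representation, and the single-character terms are assumptions for illustration only.

```python
def parse_sum(text):
    """Gather terms into one list; a subtracted term is wrapped in a
    one-branch SUB node, and the whole list becomes the single branch of SUM."""
    pos = 0
    items = []

    def term():
        nonlocal pos
        if pos < len(text) and text[pos].isalnum():
            token = text[pos]
            pos += 1
            return token
        raise SyntaxError(f"term expected at position {pos}")

    items.append(term())
    while pos < len(text) and text[pos] in "+-":
        operator = text[pos]
        pos += 1
        operand = term()
        # '+' adds the term as is; '-' corresponds to ':SUB term!1'.
        items.append(operand if operator == "+" else ["SUB", operand])
    if pos != len(text):
        raise SyntaxError(f"unexpected character at position {pos}")
    return ["SUM", items]  # corresponds to ':SUM!1' applied to the gathered list


print(parse_sum("a+b-c+5"))  # ['SUM', ['a', 'b', ['SUB', 'c'], '5']]
```

A flat list like this leaves the order of evaluation to a later pass instead of fixing it in the tree shape.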
Token '..' formulas are called to recognize tokens. By default, recognized tokens are cataloged in the symbol table.
In CWIC, tokens are defined using token '..' formulas that reference character class ':' formulas:
bin: '0'|'1';
oct: bin|'2'|'3'|'4'|'5'|'6'|'7';
dec: oct|'8'|'9';
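As a rough illustration of how character classes can drive token recognition without a separate scanner, here is a Python sketch. The sets mirror the bin, oct, and dec classes above; the match_token helper and its "one or more characters of the class" rule are assumptions, since no concrete token formula is shown here.

```python
# Character classes as sets, each one extending the previous class.
BIN = set("01")
OCT = BIN | set("234567")
DEC = OCT | set("89")


def match_token(text, pos, char_class):
    """Hypothetical token recognizer: consume one or more characters of
    char_class starting at pos. Returns (token, next_position) or None."""
    end = pos
    while end < len(text) and text[end] in char_class:
        end += 1
    if end == pos:
        return None  # no character of the class at this position
    return text[pos:end], end


print(match_token("1017+2", 0, DEC))  # ('1017', 4)
print(match_token("1017+2", 0, BIN))  # ('101', 3)
print(match_token("+2", 0, OCT))      # None
```

The recognizer is called from the parser at the point where a token is expected, so which class applies depends on the grammar rule being tried rather than on a scanner that runs ahead of the parser.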
CWIC, developed at System Development Corporation, included three languages: SYNTAX (a parser programming language), GENERATOR (a tree-crawling code production language), and MOL360 (Machine Oriented Language for IBM 360 processors). In the SYNTAX language you program a top-down, recursive-descent parser that directs the parse tree's construction. Tokens are recognized with token ".." formulas, which reference character class formulas that test for characters of the class. Examples 216.146.244.139 (talk) 05:57, 30 April 2023 (UTC)