Send
Close Add comments:
(status displays here)
Got it! This site uses cookies. You consent to this by clicking on "Got it!" or by continuing to use this website.nbsp; Note: This appears on each machine/browser from which this site is accessed.
EBNF grammars
1. EBNF grammars
2. Grammar
A grammar is an allowable sequence of tokens, or symbols, in a program.
3. Syntax of languages
A
syntax of a language specifies the legal symbols and arrangement of symbols of the language.
Syntax ≡ Allowable sequence of symbols (what you say).
4. Semantics of languages
A
semantics of a language is the meaning of a language specified with a given syntax. By analogy, syntax is what you say, semantics is what you mean. And sometimes there is a difference.
Semantics ≡ Meaning of a sequence of symbols (what you mean).
5. Example
Syntax: Your course grades are 1.0, 0.0, 1.5, and 1.0. According to the University bulletin, these are legal values that grades can have. Their syntax is ok.
Semantics: You could have issues with your academic progress. The semantics is not ok.
6. Identifiers and numbers
What is an identifier?
What is a number?
What is a letter?
What is a digit?
7. Informal definitions
Here are some informal (ambiguous) definitions.
An identifier is a letter followed by zero or more letters or digits.
A number is a digit followed by zero or more digits.
A letter is a character from 'a' to 'z' or from 'A' to 'Z'. (what about underscore?)
A digit is a character from '0' to '9' . (what about hex digits?)
Let us use EBNF to be a little more precise.
8. EBNF
Here is the associated
EBNF (
Extended Backus Naur Form) grammar.
Number = "digit" { "digit" } .
Identifier = "letter" { LetterOrDigit } .
LetterOrDigit = "letter" | "digit" .
EBNF is a written form of a grammar.
Terminals cannot be subdivided; they appear within double quotes.
Nonterminals or Variables can be divided into smaller parts; they appear as identifiers.
9. Productions
Here is the associated
EBNF grammar.
Number = "digit" { "digit" } .
Identifier = "letter" { LetterOrDigit } .
LetterOrDigit = "letter" | "digit" .
A
Productions has a
Variable and consists of zero or more
Terminals or
Variables.
In the above grammar, the productions are
Number,
Identifier, and
LettorOrDigit.
In a full grammar all productions are defined. In grammar fragments (as in these notes), some productions may not be defined (or may be defined elsewhere).
10. Syntax charts
A syntax chart is a visual depiction of the productions of a grammar.
Here is the associated
EBNF syntax diagram(s).
EBNF is a written form of a grammar that can also be depicted visually as a syntax diagram.
Syntax charts are a graphical form of a grammar.
Terminals cannot be subdivided; they appear in ovals.
Nonterminals or Variables can be divided into smaller parts; they appear in boxes.
Productions consist of zero or more Terminals or Variables.
11. Terminal symbols
Terminal symbols of a language are symbols that cannot be broken up. A subset of these symbols in a language are often designated as reserved words. In
SQL (
Structured Query Language) , for example, some reserved words include the following.
BY
FROM
ORDER
SELECT
WHERE
... and so on ...
12. Reserved words
A reserved word is a word in the language that has a special meaning and cannot be used in a program as a programmer-defined or library-defined identifier or name except as specified by the rules of the programming language.
13. Reserved words for Java
Here are some reserved words for Java.
abstract boolean break byte case catch char class const continue default do double else extends final finally float for goto if implements import instanceof int interface long native new package private protected public return short static super switch synchronized this throw throws transient try void volatile while
Do you see any that you do not recognize?
14. Reserved words for C++
Here are some reserved words for C++.
asm auto break case catch char class const continue default delete do double else enum extern float for friend goto if inline int long new operator private protected public register return short signed sizeof static struct switch template this throw try typedef union unsigned virtual void volatile while
Do you see any that you do not recognize?
Program grammars can be expressed in many equivalent forms.
Some are easier to work with and/or understand in various contexts (human view, machine view, etc.).
The syntax of a language itself, such as SQL, Java, Python, C++, etc., is often specified using a special notation (or a variant of it) called
EBNF that uniquely specifies what is and what is not allowable syntax.
15. Meta-characters
Meta-characters are used to indicate how often a sequence of symbols can appear:
[ ] (square brackets) are used for zero or one times.
{ } (curly braces) are used for zero or more times.
( ) (parentheses) are used for grouping.
| (vertical bar) is used for alternatives.
EBNF makes use of meta-characters to specify the syntax as follows.
A required
Expression is written as
Expression
and means that
Expression is required in this local context, unless any meta-characters (explanation to follow) surround
Expression in a more global context.
An optional
Expression is written between square brackets as
[
Expression ]
and means that
Expression can be repeated zero or one times.
16. Repetition
A repetitive
Expression is written between curly braces as
{
Expression }
and means that
Expression can be repeated zero or more times.
17. Selection
A selection from one of
E1 or
E2 or
E3 is written with the vertical bar as
E1 |
E2 |
E3
and means that one of
E1 or
E2 or
E3 is selected.
Most syntax is specified using
EBNF of a variant of
EBNF . Always read the legend to insure that you are interpreting the syntax specification correctly.
18. EBNF grammar
Here is the associated
EBNF grammar.
Query = SelectPart FromPart [ WherePart ] [ OrderPart ] .
SelectPart = "SELECT" SelectList .
SelectList = "*" | FieldList .
FromPart = "FROM" Table .
OrderPart = "ORDER" "BY" FieldList .
WherePart = "WHERE" Expression .
FieldList = Field { "," Field } .
19. Syntax diagram
Here is the associated
EBNF syntax diagram(s).

20. Copy-update abstraction
The copy-update problem in the form of redundant specifications is handled by giving an expression a name and, thereafter, using that name in place of the expression. For example, the following is a subset of SQL expressed using
EBNF (notice the use of the '
=' and '
.' meta-characters).
21. Semantic features
In this case,
Query is defined in terms of
FieldList Table, and
Expression, while
FieldList is defined in terms of
Field. And so on.
Some semantic features of this syntax include the following.
What is a Table.
What is a Field.
Does Field have to be defined?
What does '*' mean?
For the present purposes, we are only interested in grammars and syntax, not semantics.
Why is
EBNF used?
EBNF allows the syntax of a language to be easily specified. This has two consequences.
The user of the language has a (fairly) unambiguous specification of the syntax of the language that can be consulted to help determine syntax errors.
The recognizing of the language, a process called parsing, can be automated (using a computer).
22. End of page
23. Acronyms and/or initialisms for this page
2 acronyms omitted (login required)