We will investigate an initial example of a compositional signal system: the system of Arabic numerals that refer to natural numbers. (This is in fact the traditional opening example of introductions to the semantics of programming languages. It is rarely used in introductions to natural language semantics. But here we go.)

The meanings are the natural numbers. What are numbers? That is a philosophical question of ontology and metaphysics, not the primary focus of the semanticist: numbers exist -- whatever they are -- and we use numerals to refer to them.

The meanings of the numerals are NOT the concepts we associate with the numbers. Someone might associate (automatically and reliably) all kinds of rich ideas with the number 42. But that doesn't make those associations part of the meaning of the numeral "42". The semantics of "42" is simply that it refers to the number 42.

The basic building blocks (the "lexicon" of smallest meaning-bearing units, "morphemes") are the ten digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.

What is the structure of complex numerals? What information about the structure do we need to compute the meaning of a particular numeral? For computing the meaning, it is not enough to provide the semantics with the set of digits contained in a numeral, since {1,2} = {2,1} = {1,1,2} = {2,2,2,1,1,2}: the numerals "12", "21", and "112" would all receive the same meaning. So, what is the syntax of complex numerals and how much of it does the semantics need to know?

-----

As a first illustration of how a compositional semantics is given for a language, we explore the semantics of the Arabic numeral system.

Structure of the system:

* The meanings are the natural numbers.
* The signals/symbols are ordered sequences of the Arabic digits (0,1,2,3,4,5,6,7,8,9).
* The smallest component parts (morphemes?) are those 10 digits.

  Note that it is non-trivial to identify the morphemes of a language.
  Theoretically, for example, one could have thought that the circle that is part of the symbols 6, 8, and 9 has a meaning by itself that combines with the rest of the symbol to make the meaning of the whole. We know that this is not so, but only because we know the language of Arabic numerals so well. For a natural language, doing morphemic analysis is an indispensable and non-trivial prerequisite for doing semantics (and syntax).
* Their meaning is obvious.
* Some notation: [[1]] = 1
* Comment on notation: open-faced or double brackets
* Comment on apparent circularity

  When we say that the meaning of the numeral "1" is the number 1, our statement seems circular, but that is not so. It just appears circular because we're using the same language both as the object language (the language we're studying as the object of our scientific inquiry) and as the meta-language (the language we're using to formulate our analysis). When object language and meta-language are truly different, these kinds of statements are obviously non-trivial, such as when we say: The meaning of the Roman numeral "C" is the number 100.
* What are numbers? A philosophical question of ontology and metaphysics, not the primary focus of the semanticist: numbers exist -- whatever they are -- and we use numerals to refer to them.
* Complex numerals: how do they work? Let's assemble a full semantics for complex numerals.

  The set of numerals (SN) is defined by two rules: every digit is a numeral; and every numeral put together with a digit is a numeral.

See the pdf handout for a concise formulation of the syntax and semantics of Arabic numerals. The handout also contains a rather tricky question that we'll talk about in the next class.

-----

We have developed a compositional semantics for Arabic numerals. Crucially, we were using a left-branching syntax. Could we do a compositional semantics with a right-branching syntax? It turns out that this is pretty much impossible without complicating either the syntax or the meanings.
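Before turning to the right-branching alternative, the left-branching semantics just mentioned can be made concrete with a minimal Python sketch. The two syntactic rules are the ones stated above (every digit is a numeral; a numeral followed by a digit is a numeral). The semantic rule used here, [[numeral digit]] = 10 * [[numeral]] + [[digit]], is an assumption standing in for the handout's formulation, and the names `DIGIT_MEANING`, `parse_left`, and `meaning` are hypothetical, chosen only for this illustration.

```python
# Sketch of a left-branching compositional semantics for numerals.
# Assumed semantic rule (not quoted from the handout):
#   [[numeral digit]] = 10 * [[numeral]] + [[digit]]

# The lexicon: [[0]] = 0, [[1]] = 1, ..., [[9]] = 9.
DIGIT_MEANING = {str(d): d for d in range(10)}


def parse_left(numeral):
    """Build the left-branching tree: a digit, or a (numeral_tree, digit) pair."""
    if not numeral or any(ch not in DIGIT_MEANING for ch in numeral):
        raise ValueError(f"not a numeral: {numeral!r}")
    tree = numeral[0]                  # rule 1: every digit is a numeral
    for digit in numeral[1:]:
        tree = (tree, digit)           # rule 2: numeral + digit is a numeral
    return tree


def meaning(tree):
    """Compute a tree's meaning from the meanings of its immediate parts only."""
    if isinstance(tree, str):
        return DIGIT_MEANING[tree]
    sub_numeral, digit = tree
    return 10 * meaning(sub_numeral) + DIGIT_MEANING[digit]


print(parse_left("237"))             # (('2', '3'), '7')
print(meaning(parse_left("237")))    # 237
print(meaning(parse_left("07")))     # 7, the same meaning as "7"
```

Note that `meaning` never inspects the length or the spelling of the sub-numeral: the number it denotes and the new digit are all the composition rule needs.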
A right-branching syntax builds numerals as follows:

SN -> D
SN -> D SN

That is, it builds more complex numerals by taking a numeral and adding a digit on the left. But how do we compute the meaning of the new complex numeral from the meaning of the digit on the left and the meaning of the numeral on the right?

* What we need is to know how many digits the numeral on the right has. If it has two digits, for example, then the meaning of the new numeral is reached by multiplying the meaning of the digit by 100 and adding to that the meaning of the numeral. If it has three digits, we need to multiply the meaning of the digit by 1000, etc.
* But the number of digits in a numeral cannot be deduced from its meaning, as pointed out in class by Chieu (is that right? KvF), because the numeral "7" and the numeral "07" have the same meaning, the number 7. But when a digit is prefixed to the numeral "7", its meaning is multiplied by 10, while when it is prefixed to the numeral "07", its meaning needs to be multiplied by 100.
* So, we would have to somehow hack things so that the information of how many digits the right-hand numeral has is visible to the meaning composition rule. This could be done in two ways: (i) enriching the syntactic information, by for example giving different category labels to numerals of different lengths (for example: "7" is an SN-1 while "07" is an SN-2), which would complicate the syntax in a way that seems only needed because of the needs of the semantics; (ii) enriching the meanings of numerals, by for example saying that numerals don't just have numbers as their meanings but pairs of a number and a length (for example: "7" means <7,1> while "07" means <7,2>).
* Neither of those "solutions" is needed if we assume a left-branching syntax. So, overall simplicity considerations favor a left-branching syntax. Once we do natural language semantics, syntax-internal considerations might favor a structure that creates problems for the semantics, but c'est la vie.
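Option (ii) from the list above can be sketched in Python as follows. This is only an illustration under stated assumptions: meanings are modeled as (number, length) tuples, mirroring the <7,1> versus <7,2> example, and the names `DIGIT_VALUE`, `parse_right`, and `enriched_meaning` are hypothetical.

```python
# Sketch of option (ii): a right-branching syntax (SN -> D, SN -> D SN)
# whose meanings are enriched to (number, length) pairs.

DIGIT_VALUE = {str(d): d for d in range(10)}


def parse_right(numeral):
    """Build the right-branching tree: a digit, or a (digit, numeral_tree) pair."""
    if not numeral or any(ch not in DIGIT_VALUE for ch in numeral):
        raise ValueError(f"not a numeral: {numeral!r}")
    if len(numeral) == 1:
        return numeral                              # SN -> D
    return (numeral[0], parse_right(numeral[1:]))   # SN -> D SN


def enriched_meaning(tree):
    """Return (number, length); the length is what the plain number cannot supply."""
    if isinstance(tree, str):
        return (DIGIT_VALUE[tree], 1)
    digit, sub_numeral = tree
    number, length = enriched_meaning(sub_numeral)
    # The digit's contribution depends on how long the right-hand numeral is.
    return (DIGIT_VALUE[digit] * 10 ** length + number, length + 1)


print(enriched_meaning(parse_right("7")))     # (7, 1)
print(enriched_meaning(parse_right("07")))    # (7, 2)
print(enriched_meaning(parse_right("37")))    # (37, 2)
print(enriched_meaning(parse_right("307")))   # (307, 3)
```

The extra length component is exactly the bookkeeping that a left-branching syntax makes unnecessary.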
Here, with the numerals, we can let the semantics decide.

Note: the different consequences of syntactic assumptions for the compositional semantics of numerals are discussed in a famous paper on compositionality:

Wlodek Zadrozny. 1994. "From Compositional to Systematic Semantics". Linguistics and Philosophy, 17(4): 329–342. [doi:10.1007/BF00985572|http://dx.doi.org/10.1007/BF00985572].

Numerals in Natural Language
----------------------------

Fascinating topic. See the following for lots of cool stuff:

Karl Menninger. 1992. Number Words and Number Symbols: A Cultural History of Numbers. Courier Dover. [http://books.google.com/books?id=YLJb6-OyUIQC]

Consider how to say 237 in English and in German:

"Two hundred (and) thirty seven"
"Zwei hundert (und) sieben-und-dreißig"

Notice that both multiplication (two times (one) hundred, three times ten (ty/zig, here spelled ßig)) and addition (thirty plus seven) are used. Lots of weirdnesses: why does French say 80 as "four twenties" (quatre-vingts)?

For a recent formal study of the semantics of complex numeral expressions in natural language, see

Tania Ionin & Ora Matushansky. 2006. "The Composition of Complex Cardinals". Journal of Semantics, 23(4): 315–360. [doi:10.1093/jos/ffl006|http://dx.doi.org/10.1093/jos/ffl006].