|On philology, potatoes and construction.|
Well, this is just my first attempt at blog-writing. I want it to be a way to keep in touch with colleagues and friends.
|It rained too often early in the season this year, and the soil was too wet to plant. Harvesting started later than usual. I took part. We used two harvesters; the conveyor belt of the first one broke in the first rows. The grass was too thick and kept the soil from moving onto the trail. The second broke as well. And so did the first, once repaired. Finally, plowing the land buried the weeds under the soil. After that, one harvester was enough to do the job. A crop of medium-sized, healthy tubers is a good end to a more difficult start than usual.|
|The summer ended and autumn came while I was working on the refurbishment of a shop in a World Heritage city.|
Some new potatoes, the ones planted earlier, have already been harvested (I came back in time). As for the late season, most of the plants have flowered. Their leaves were slightly wet yesterday, early in the morning. It had drizzled the night before, softly, for a while, as mist. So it did yesterday, once more. Enough to refresh the surface, still too shallow to reach the roots.
|Ice on the river|
Once solid soil to step on, ice melts again.
Winter is gone, in the river the water runs.
|How long is a league?|
We are talking about length, of course. Though, even after disambiguation, the answer may change depending on our definition. If we take metres to be the standard unit of length, “a unit of length equal to X metres” could be an acceptable definition. So easy, isn't it? Now we only have to answer the question, how many metres are there in a league?
Here is where we could get lost trying to be more precise again. Conversion values have changed from one place to another since the earliest periods of history. Furthermore, we would first need to use standard units other than metres, such as stades and feet, and then convert to metres again.
So, here we are, tracking Mendes Pinto with distances given in leagues. The issue is that, after working through a huge bibliography, I haven't found a definitive explanation of how the values were calculated, if any values were given at all. The most secure point I have is that 17.5 leagues are equal to one degree. [1]
Having this as a given value, I approached the issue as follows:
- Mendes Pinto reports measurements of distance and position taken by the sun.
- Now, what Fernão Mendes Pinto (FMP) really measures is an angle on the Earth's circumference, taking relative positions on its surface with the sun as a point of reference.
- The division of the sphere was well known in this period (Pedro Nunes, Sacrobosco): as nowadays, it has 360 degrees.
- Now, FMP gives distances as a proportion of that circle: 17.5 leagues to each of its 360 degrees. FMP doesn't need to know how long planet Earth is along the meridian; he just measures parts of it and calls a unit of length along his path a league. This universal unit could then be converted into whatever local unit of length was in common use, be it stades, feet, and so on.
- Since we want to know the value nowadays, we don't need to work out how many stades or feet a league has either. It is enough to know, in metres, the value of what FMP measures, that is, the circumference of Earth, which is 40,000,000 m (a round number, which also lets the metre be defined exactly from the distance between the poles and the equator). [2]
- Now, the maths. 40,000,000 m ÷ 360° = 111,111.1 m per degree, which we further divide by 17.5: 111,111.1 m ÷ 17.5 = 6,349 m (6,350 m if the length of Earth's circumference is taken to be 40,007,863 m).
Therefore, following this approach, a league is 6.349 km.
Let's check it! According to FMP, the distance from Nanjing to Beijing is 180 leagues. So 180 leagues × 6.349 km = 1,142.82 km. The actual distance, as a round number, is 1,150 km! [3]
More examples are needed to provide real evidence. Still, this is close enough to be on the right track!
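The arithmetic above can be reproduced with a few lines of Python. This is only a minimal sketch of the reasoning in this post: the function and constant names are made up for illustration, and the 17.5 leagues per degree figure is the assumption discussed above, not an established conversion.

```python
# Assumption taken from this post: 17.5 leagues to one degree of a great circle.
LEAGUES_PER_DEGREE = 17.5
DEGREES_IN_CIRCLE = 360


def league_in_metres(circumference_m):
    """Length of one league in metres for a given Earth circumference."""
    metres_per_degree = circumference_m / DEGREES_IN_CIRCLE
    return metres_per_degree / LEAGUES_PER_DEGREE


# Round-number circumference and the more precise figure quoted in the post.
round_league = league_in_metres(40_000_000)
precise_league = league_in_metres(40_007_863)
print(f"{round_league:.1f} m, {precise_league:.1f} m")

# Check against FMP: Nanjing to Beijing, 180 leagues of 6,349 m each.
nanjing_beijing_km = 180 * round(round_league) / 1000
print(f"{nanjing_beijing_km:.2f} km")
```

Changing `LEAGUES_PER_DEGREE` is an easy way to see how sensitive the Nanjing to Beijing check is to the value chosen for the league.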
[1] Albuquerque, L. (1987). As navegações e a sua projecção na ciência e na cultura. Lisboa: Gradiva. P. 49.
[2] Wikipedia. Sub voce Earth. http://en.wikipedia.org/wiki/Earth
[3] Alves, J. (ed.) (2010). Fernão Mendes Pinto and the Peregrinação. Lisbon: Fundação Oriente. (Notes by M. Ollé, Vol. III, p. 126)
|More on Travels|
Last week I had the privilege of participating in a colloquium again. The topic was the identification of place names in Mendes Pinto's travels.
This is a synopsis:
"We present here our current work on a positivist analysis of place names. It aims to be valuable for either a literary reading or a stricter historical and geographical interpretation of Pinto's work. We sketch three methods to trace a place name.
1) Through phonetic analysis, does a given place name match a geographical entity in an Asian language?
2) By examining context, do we find historical, ethnographic, topographic, architectural, or any other descriptive features that relate the place name to a particular geographical area?
3) Given a point of reference and a vector of displacement, is it possible to resolve the place name through cartography?"
A pdf file with slides is available here
Many miles away, autumn may-trees,
bore sweeter berries, less lobulated leaves.
|Only one more day. Yet one year more.|
|Having been away for ten months, there is not much for me to do this time of the year. I just offer a helping hand to cut and pick up some logs from trees the wind brought down and spend two more days harvesting potatoes on two small pieces of land.|
|Two days only, as a visitor, to see the final building process of a modular house still indoors, another day to visit a house yet to be finished, though already installed on the building site. |
|Is human language a system made of an infinite number of expressions? (and II)|
The operation allowing all possible combinations of the whole English vocabulary, up to the length of one of the longest sentences ever recorded in this language, initially came out as infinity. As this was the expected result, and being pressed to write my presentation, I sent a draft to Miro Moman, who kindly and quickly pointed out that the result of my last operation was rather 1.63585694 × 10^23086. This is a huge number indeed (compare with the upper bound of the physical universe), though still finite. All right again, who cares about a number which is bigger than the volume of the physical universe? Isn't that infinite enough?
Well, it is all right not to care that much. As I told you when this blog started six years ago, when we were only 6 billion people on this planet, in this section I deal with subjects that could be of interest to four or five people in the whole of humanity at the present time. We are more than 7 billion now, so the number could have slightly increased... though only by one unit at most if we keep the proportion we had six years ago.
Yet, the issue is that 1.63585694 × 10^23086 stands for the result of applying maxima that over-represent the combinatory potential of any human language. That is, a closer value will always be smaller as soon as we apply syntactic rules! An easy example: this unrestricted combination would allow the same word to be repeated up to 4,391 times and would still consider the resulting string a sentence.
Let's go small to try to understand better. Let's take the first branch of the Mabinogi from our corpus: 1,605 word-types, a very small lexicon, and the longest sentence has 64 words (a high value for sentence length). If we allow all possible combinations, that is, the same word appearing in any position of the sentence, we get 1605^64, still bigger than the volume of the universe! However, you will soon notice that with such a small lexicon the number of grammatical sentences (not to mention if we add semantics) must be finite and surely much smaller!
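To make "all possible combinations" concrete, here is a small Python sketch. The three-word lexicon is hypothetical and only there so the strings can be listed; the last lines apply the same operation to the Mabinogi figures given above.

```python
from itertools import product

# Hypothetical three-word lexicon, used only to make the counting visible.
toy_lexicon = ["the", "king", "rode"]
toy_length = 2

# Every word may fill every position, so repeats such as "the the" are counted.
toy_strings = [" ".join(words) for words in product(toy_lexicon, repeat=toy_length)]
print(len(toy_strings))          # equals 3 ** 2
print("the the" in toy_strings)  # the ungrammatical repeat is included

# Same operation with the figures from the post: 1,605 word-types, 64 positions.
mabinogi_bound = 1605 ** 64
print(len(str(mabinogi_bound)))  # number of decimal digits of the bound
```

The toy list shows why the bound over-counts: it treats every string of words as a sentence, grammatical or not, so the number of real sentences can only be smaller.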
You may ask, what about recursion, and adding a loop that infinitely embeds a sentence within a sentence (using a complementizer in English, for instance)? All right, go on, you can move towards infinity as much as you want, and indeed create the longest sentence ever... though only when you reach an end (a sentence has to be complete to be a sentence) will you have a sentence unit, a single one. More importantly, at that point, even if you got a result bigger than 1.63585694 × 10^23086, it would be finite again.
So, sentences are more similar to the lexicon than I previously thought. As far as I understand, the set of sentences in any human language is finite, boundless like the vocabulary of a language, and huge, though much smaller than the maxima given above. Taking the more manageable example of the Mabinogi corpus as a rough extrapolation, it would be a matter of adding rules to come down to our solar system and begin to get a number at least smaller than the volume of our physical universe.
|Is human language a system made of an infinite number of expressions? (I)|
Last week I had to give a talk about my research.
This was the introduction:
“Human language can be viewed as a system made of a finite number of units from which an infinite number of expressions is generated. Given any particular language, we have a finite and small number of phonemes, the distinctive sound units, that can be combined to form thousands of morphemes, with no upper bound on the creation of new entries in the lexicon. Words themselves are combined to make up an infinite number of sentences.
When we learn a new language, or when we start to learn to speak, we are expected to learn the most widely used words first. In corpus linguistics we refer to this statistical property of words as word frequency. Our model focuses on a practical use of words based on their frequency. Words that have a higher frequency, that is, more occurrences in a corpus chosen to represent a particular and real use of a language, appear in a higher number of sentences, hence they would be more useful for understanding more sentences in a given language.
In order to grasp the meaning of a sentence, we also need to understand its syntax, the special relations that bind words together. Our model uses a very basic approach to syntax. We attempt to determine approximately which sentences would involve more processing in order to be efficiently parsed, using words as units and sentence length as the quantitative variable. We hence understand a sentence as a discrete variable made up of a number of words (tokens). We consider the number of words to be relevant for sentence complexity, defined here as the higher or lower number of syntactic rules we would need to efficiently parse a sentence. Long sentences involve more syntactic relations, so they are more difficult to parse than shorter ones.”
While writing the presentation I began to work on an easy explanation of how, from an initial finite number of phonemes, a countable number of syllables is formed to make up a finite, though boundless, number of words in a language. From there, I wanted to show that the number of sentences was infinite, so I made some basic operations taking maxima as values:
* Leaving obsolete words apart, the total vocabulary in a dictionary is considered relevant for combinatory purposes. Let's take Oxford dictionaries as an example: 171,476 words in current use + around 9,500 derivatives: 180,976 words.
* Only lexemes are considered. Inflectional word-types and derivatives not listed in the dictionary are not counted.
* No grammatical restrictions are applied for the combination of words in a sentence.
* A sentence with an exceptionally high number of words in the English language is taken as the maximum value for the number of words in a sentence: Molly's monologue is 4,391 words (James Joyce).
The picture above shows the result of the final operation.
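The operation described by the list above, vocabulary size raised to the maximum sentence length, can be sketched in Python as follows. The variable names are mine, and logarithms are used so that the example does not depend on printing an integer with more than twenty thousand digits; under those assumptions it should land close to the figure quoted in part II.

```python
import math

# Figures taken from the list above.
VOCABULARY = 171_476 + 9_500   # words in current use + derivatives
MAX_SENTENCE_LENGTH = 4_391    # words in Molly's monologue

# log10 of VOCABULARY ** MAX_SENTENCE_LENGTH, without building the huge number.
log10_combinations = MAX_SENTENCE_LENGTH * math.log10(VOCABULARY)

# Split the logarithm into a power of ten and a leading factor.
exponent = math.floor(log10_combinations)
mantissa = 10 ** (log10_combinations - exponent)

print(f"about {mantissa:.4f} x 10^{exponent}")
print(math.isfinite(log10_combinations))  # a finite exponent, so a finite number
```

Since the exponent is an ordinary finite number, so is the total: enormous, but not infinite.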