Afonso Xavier Canosa Rodriguez

On philology, potatoes and construction.
Well, this is just my first approach to blog-writing. I want it to be the way to keep in touch with colleagues and friends.

info at canosarodriguez dotnet

More on Travels

Last week I had the privilege of participating in a colloquium again. The topic was the identification of place names in Mendes Pinto's Travels.

This is a synopsis:

"We present here our current work on a positivist analysis of place names. It aims to be valuable for either a literary reading or a more strict historical and geographical interpretation of Pinto's work. We sketch three methods to trace a place name.

1) Through phonetic analysis, does a given place name match a geographical entity in an Asian language?

2) By examining context, do we find historical, ethnographic, topographic, architectural, any descriptive features to relate the place name to a particular geographical area?

3) Given a point of reference and a vector of displacement, is it possible to solve the place name through cartography?"
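The third question in the synopsis can be illustrated with a minimal sketch. It assumes that the "vector of displacement" is a compass bearing plus a distance, and that the Earth is a perfect sphere; the function name, the radius constant and the sample coordinates are hypothetical and are not taken from the colloquium slides. It only shows how a candidate position could be computed before looking it up on a map.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical Earth is an assumption


def destination(lat_deg, lon_deg, bearing_deg, distance_km):
    """Return (lat, lon) in degrees reached from a reference point after
    travelling `distance_km` along the initial compass bearing `bearing_deg`
    (0 = north, 90 = east) on a great circle."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    bearing = math.radians(bearing_deg)
    angular = distance_km / EARTH_RADIUS_KM  # distance as an angle in radians

    lat2 = math.asin(
        math.sin(lat1) * math.cos(angular)
        + math.cos(lat1) * math.sin(angular) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(angular) * math.cos(lat1),
        math.cos(angular) - math.sin(lat1) * math.sin(lat2),
    )
    # Normalise the longitude to the range [-180, 180).
    lon2_deg = (math.degrees(lon2) + 180.0) % 360.0 - 180.0
    return math.degrees(lat2), lon2_deg


# Hypothetical example: 300 km to the north-east of an invented reference point.
lat, lon = destination(10.0, 100.0, 45.0, 300.0)
print(f"candidate position: {lat:.3f}, {lon:.3f}")
```

Distances and directions in a travel narrative are rough, so a result like this would mark the centre of a search area rather than an exact location.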

A PDF file with the slides is available here.

Subject: philology - Published 24-11-2014 13:49
Permanent link to this article

Many miles away, autumn may-trees,
bore sweeter berries, less lobulated leaves.
Subject: philology - Published 27-10-2014 14:06
Permanent link to this article
...and again
Only one more day. Yet one year more.
Subject: potatoes - Published 18-08-2014 12:00
Permanent link to this article
Harvesting again
Having been away for ten months, there is not much for me to do this time of the year. I just offer a helping hand to cut and pick up some logs from trees the wind brought down and spend two more days harvesting potatoes on two small pieces of land.
Subject: potatoes - Published 28-07-2014 17:27
Permanent link to this article
Modular building
Two days only, as a visitor, to see the final stage of building a modular house while it was still indoors; another day to visit a house already installed on its building site, though yet to be finished.
Subject: construction - Published 28-07-2014 16:37
Permanent link to this article
Is human language a system made of an infinite number of expressions? (and II)
Allowing all possible combinations of the whole English vocabulary, with one of the longest sentences ever recorded in the language as the maximum length, initially gave infinity as the result. Since this was the result I expected, and being pressed to write my presentation, I sent a draft to Miro Moman, who kindly and quickly pointed out that the result of my last operation was rather 1.63585694 × 10^23086. This is a huge number indeed (compare it with the upper bound of the volume of the physical universe, 1 × 10^113 m³), though still finite. All right then, who cares about a number that is bigger than the volume of the physical universe? Isn't that infinite enough?

Well, it is all right not to care that much. As I told you when this blog started six years ago, when we were only 6 billion people on this planet, in this section I deal with subjects that could be of interest to four or five people in the whole of humanity at the present time. We are more than 7 billion now, so the number could have slightly increased... though only by one unit at most if we keep the proportion we had six years ago.

Yet the issue is that 1.63585694 × 10^23086 is the result of applying maxima that over-represent the combinatorial potential of any human language. That is, a closer approximation will always be smaller, even if we apply only syntactic rules! An easy example: this unrestricted combination would allow the same word to be repeated up to 4,391 times and would still count the resulting string as a sentence.

Let's go small to try to understand better. Let's take the first branch of the Mabinogi from our corpus: 1,605 word-types, a very small lexicon, and a longest sentence of 64 words (a suitably high value for sentence length). If we allow all possible combinations, that is, any word to appear in any position of the sentence, we get 1605^64 ≈ 1.4 × 10^205, still bigger than the volume of the universe! However, you will soon notice that with such a small lexicon the number of grammatical sentences (let alone meaningful ones, if we add semantics) must be finite and surely much smaller!
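The small case can be checked with exact integer arithmetic. What follows is a minimal sketch, not the calculation used for the presentation: the function and constant names are hypothetical, the lexicon size and sentence length are the figures given for the first branch of the Mabinogi, and the bound of 10^113 m³ is the one quoted in this post.

```python
import math

LEXICON_SIZE = 1605          # word-types in the first branch of the Mabinogi
MAX_SENTENCE_LENGTH = 64     # words in the longest sentence of that text
UNIVERSE_BOUND = 10 ** 113   # upper bound quoted in this post, in cubic metres


def count_unrestricted_strings(lexicon_size, length):
    """Number of word strings with exactly `length` positions when any
    word-type may fill any position (repetition allowed, no grammar)."""
    return lexicon_size ** length


total = count_unrestricted_strings(LEXICON_SIZE, MAX_SENTENCE_LENGTH)

# Express the exact integer in scientific notation for readability.
exponent = math.floor(math.log10(total))
mantissa = total / 10 ** exponent

print(f"{LEXICON_SIZE}^{MAX_SENTENCE_LENGTH} is about {mantissa:.2f} x 10^{exponent}")
print("larger than the quoted bound:", total > UNIVERSE_BOUND)
```

The count is an ordinary integer, so it is finite by construction; any grammatical filter applied to these strings can only make it smaller.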

You may ask: what about recursion, adding a loop that endlessly embeds a sentence within a sentence (using a complementizer in English, for instance)? ... All right, go on, you can move towards infinity as much as you want, and indeed create the longest sentence ever... though only when you reach an end (a sentence has to be complete to be a sentence) will you have a sentence unit, a single one. More importantly, at that point, even if you got a result bigger than 1.63585694 × 10^23086, it would be finite again.

So, sentences are more similar to the lexicon than I previously thought. As far as I understand, the set of sentences in any human language is finite, though boundless like the vocabulary of a language, and huge, though much smaller than the maxima given above. Taking the more manageable Mabinogi corpus as a rough extrapolation, it would be a matter of adding rules to come down to our solar system and begin to get a number at least smaller than the volume of our physical universe.
Subject: philology - Published 05-05-2014 15:45
Permanent link to this article
Is human language a system made of an infinite number of expressions? (I)
[Image: result calculator]

Last week I had to give a talk about my research.

This was the introduction:

"Human language can be viewed as a system made of a finite number of units from which an infinite number of expressions is generated. Given any particular language, we have a finite and small number of phonemes, the distinctive sound units, that can be combined to form thousands of morphemes, with no upper bound on the creation of new entries in the lexicon. Words themselves are combined to make up an infinite number of sentences.

When we learn a new language, or when we start to learn to speak, we are expected to learn the most widely used words first. In corpus linguistics we refer to this statistical property of words as word frequency. Our model focuses on a practical use of words based on their frequency. Words that have a higher frequency, that is, more occurrences in a corpus chosen to represent a particular, real use of a language, appear in a higher number of sentences, hence they would be more useful for understanding more sentences in that language.

In order to grasp the meaning of a sentence, we also need to understand its syntax, the special relations that bind words together. Our model uses a very basic approach to syntax. We attempt to determine approximately which sentences would involve more processing in order to be efficiently parsed, using words as units and sentence length as the quantitative variable. We hence understand a sentence as a discrete variable made up of a number of words (tokens). We consider the number of words to be relevant for sentence complexity, defined here as the higher or lower number of syntactic rules we would need to efficiently parse a sentence. Long sentences involve more syntactic relations, so they are more difficult to parse than shorter ones."
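The two quantities named in the introduction, word frequency and sentence length in tokens, can be sketched in a few lines. This is only an illustration under stated assumptions: the toy corpus is invented, the sentence splitter and tokenizer are deliberately naive, and none of the function names come from the research model itself.

```python
import re
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
TOY_CORPUS = "The king rode out. The king saw a stag, and the hounds saw the king."


def split_sentences(text):
    """Naive sentence split on '.', '!' or '?'."""
    return [part.strip() for part in re.split(r"[.!?]+", text) if part.strip()]


def tokenize(sentence):
    """Lower-cased word tokens; punctuation is dropped."""
    return re.findall(r"\w+", sentence.lower())


def word_frequencies(sentences):
    """Occurrences of each word-type across the whole corpus."""
    return Counter(token for s in sentences for token in tokenize(s))


def sentence_coverage(sentences):
    """Number of sentences in which each word-type appears at least once."""
    return Counter(token for s in sentences for token in set(tokenize(s)))


sentences = split_sentences(TOY_CORPUS)
frequencies = word_frequencies(sentences)
coverage = sentence_coverage(sentences)
lengths = [len(tokenize(s)) for s in sentences]  # sentence length in tokens

print("most frequent word-types:", frequencies.most_common(3))
print("sentence lengths:", lengths)
```

On a real corpus, sorting word-types by frequency gives a learning order of the kind described, and the list of lengths gives the discrete variable used as a rough proxy for parsing effort.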

In the process of writing the presentation I began to work on an easy explanation of how, from an initial finite number of phonemes, a countable number of syllables is formed to make up a finite, though boundless, number of words in a language. From there, I wanted to show that the number of sentences was infinite, so I carried out some basic operations taking maxima as values:

* Leaving obsolete words apart, the total vocabulary in a dictionary is considered relevant for combinatory purposes. Let's take Oxford dictionaries as an example: 171,476 words in current use + around 9,500 derivatives: 180,976 words.

* Only lexemes are considered. Inflectional word-types and derivatives not listed in the dictionary are not counted.

* No grammatical restrictions are applied for the combination of words in a sentence.

* A sentence with an exceptionally high number of words in the English language is taken as the maximum value for the number of words in a sentence: Molly's monologue (James Joyce), at 4,391 words.

The picture above shows the result of the final operation.
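The final operation raises the vocabulary size to the power of the maximum sentence length. A minimal sketch of that calculation follows, assuming it is simply 180,976 raised to 4,391; the function name is hypothetical. It works with logarithms, so the result can be reported in scientific notation without running into the range limits of ordinary floating-point numbers.

```python
import math

WORDS_IN_CURRENT_USE = 171_476
DERIVATIVES = 9_500
LEXICON_SIZE = WORDS_IN_CURRENT_USE + DERIVATIVES   # 180,976 words
MAX_SENTENCE_LENGTH = 4_391                         # words in Molly's monologue


def power_in_scientific_notation(base, exponent):
    """Return (mantissa, power_of_ten) such that
    base ** exponent is approximately mantissa * 10 ** power_of_ten.
    Only logarithms are used, so the full number is never built."""
    log10_value = exponent * math.log10(base)
    power_of_ten = math.floor(log10_value)
    mantissa = 10 ** (log10_value - power_of_ten)
    return mantissa, power_of_ten


mantissa, power_of_ten = power_in_scientific_notation(LEXICON_SIZE, MAX_SENTENCE_LENGTH)
print(f"{LEXICON_SIZE}^{MAX_SENTENCE_LENGTH} is about {mantissa:.4f} x 10^{power_of_ten}")
```

The result has a little over twenty-three thousand digits: very large, but a definite, finite number.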
Subject: philology - Published 28-04-2014 12:20
Permanent link to this article
Seasonable season's greetings

Seasonable season, the snowy winter,
And long-lasting ice in Ulaanbaatar.
Subject: philology - Published 30-12-2013 09:55
Permanent link to this article
Pavements (XII)

Colours and sizes combined near the road to the mountains.
Subject: construction - Published 11-11-2013 14:26
Permanent link to this article
Pavements (XI)

Coloured tiles drawing patterns.
Subject: construction - Published 11-11-2013 14:21
Permanent link to this article
Pavements (X)

Different materials combined.
Subject: construction - Published 11-11-2013 14:17
Permanent link to this article
Pavements (IX)

Mosaic type.
Subject: construction - Published 11-11-2013 14:14
Permanent link to this article
© by Abertal
