Accurate expression generator

I think it’s crazy that you can actually generate accurate expressions with a computer.

In my view Gödel was more right than Chomsky, though I don’t know what Chomsky thinks of Gödel’s theorem. The point, as I see it, is that no matter how sophisticated the rules of generation are, they will always be incomplete.

The problem is roughly this:
You have an entity extractor. You process a text in English and get a list of the entities that appear in it, along with their frequencies.
These entities have been disambiguated, so you can build a matching list of the same concepts in another language (for example, Spanish). But instead of tackling translation by generating Spanish text from rules for constructing expressions, we would use the Web as a huge library of “correct” expressions.

That is, we would take the list of entities in Spanish and run a query with it, together with some marker that sets the criteria of “similarity” to texts in Spanish that are already on the Internet.
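As a rough illustration of this step, here is a minimal Python sketch. Everything in it is an assumption made for the example: the extractor output, the small English-to-Spanish entity table, and the function names `translate_entities` and `build_query` are all hypothetical, and the query is only a string, not a call to any real search engine.

```python
from collections import Counter

# Hypothetical extractor output: disambiguated entities in order of appearance.
extracted = ["dog", "park", "dog", "ball", "park", "dog"]

# Hypothetical bilingual entity table (English entity -> Spanish label).
EN_TO_ES = {"dog": "perro", "park": "parque", "ball": "pelota"}


def translate_entities(entities, table):
    """Count each entity and relabel it in the target language.

    Entities missing from the table are skipped.
    """
    counts = Counter(entities)
    return {table[e]: n for e, n in counts.items() if e in table}


def build_query(target_counts):
    """Join the target labels, most frequent first (ties alphabetical)."""
    ordered = sorted(target_counts, key=lambda label: (-target_counts[label], label))
    return " ".join(ordered)


target_counts = translate_entities(extracted, EN_TO_ES)
query = build_query(target_counts)
print(query)  # perro parque pelota
```

The query string would then be sent to whatever search service is available; how the “similarity” marker is expressed depends entirely on that service.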

For example, let the discovered set of entities be
a, b, c, …
It corresponds one-to-one to a set of entities
A, B, C, …
in the language you want to translate the text into.
The set
A, B, C, …
would then be ordered in two ways: by the order in which the entities appear, and by how frequently they appear. The share of the source text taken up by each entity would be the basis for constructing correct expressions in the translated (or generated) document. Rather than rules, we would be talking about a “similarity index” over a series of sentences obtained from Web searches.
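One possible reading of the “similarity index” can be sketched in Python. The choice of cosine similarity is my own assumption here, not something the idea requires; the entity set, the source shares, the candidate sentences, and the names `entity_profile` and `similarity_index` are likewise hypothetical.

```python
import math
from collections import Counter


def entity_profile(tokens, entities):
    """Share of the text's tokens taken up by each entity."""
    counts = Counter(t for t in tokens if t in entities)
    total = len(tokens)
    if total == 0:
        return {e: 0.0 for e in entities}
    return {e: counts[e] / total for e in entities}


def similarity_index(p, q):
    """Cosine similarity between two entity profiles (0.0 if either is empty)."""
    keys = set(p) | set(q)
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical target-language entities and their shares in the source text.
entities = {"perro", "parque", "pelota"}
source_profile = {"perro": 0.5, "parque": 0.3, "pelota": 0.2}

# Hypothetical sentences that a Web search might have returned.
candidates = [
    "el perro corre por el parque con la pelota",
    "el gato duerme en la casa",
]

scores = {
    c: similarity_index(source_profile, entity_profile(c.split(), entities))
    for c in candidates
}
best = max(scores, key=scores.get)
print(best)
```

A candidate that mentions none of the entities scores zero, and one whose entity proportions match the source scores close to one. Order of appearance is ignored in this sketch; it would need a separate term.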

Can a text be translated from one language to another this way? Could you, for example, correct a sentence written in bad English by identifying its grammatical errors?
As I understand it, an Internet search using this similarity criterion would return a set of results, and we could then determine which of them are most “common” in the language we want to translate into.
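To make “most common” concrete, here is a small Python sketch that counts how often each phrasing occurs among a set of results. The result list is invented for the example, and `normalize` and `rank_by_commonness` are hypothetical helper names; a real system would work on far larger samples and looser matching.

```python
from collections import Counter


def normalize(sentence):
    """Lowercase the sentence and collapse runs of whitespace."""
    return " ".join(sentence.lower().split())


def rank_by_commonness(results):
    """Return (phrasing, count) pairs, most frequent phrasing first."""
    counts = Counter(normalize(r) for r in results)
    return counts.most_common()


# Hypothetical snippets returned by searches for competing phrasings.
results = [
    "He doesn't know",
    "he doesn't know",
    "He don't know",
    "He  doesn't know",
]

ranking = rank_by_commonness(results)
print(ranking[0])  # ("he doesn't know", 3)
```

Under this view, the ungrammatical variant is never flagged by a rule. It simply loses to the phrasing that appears more often.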

Can there be some structural homology between elements of our “semantic memory” and concepts that are unknown to us? Could we learn by identifying these homologies and permuting the nodes of these structures, using some mathematical formulation? Could we make a kind of Latent Semantic Analysis (LSA) between what we know and spheres of knowledge that we do not know?

2018 Note

I must say that when I wrote this I did not know about the experiments of Robert Mercer and Elon Musk at their companies on machine translation and the generation of accurate sentences.
