04 Oct 2004 (permalink)

News in the argp module

(Did I say I am maintaining it? Well, I am, as of the day before yesterday.)

I've fixed a nasty coredump that occurred if the option name of an OPTION_DOC option was set to NULL. In addition, the updated argp never attempts to translate an empty help string. Finally, a new option flag, OPTION_NO_TRANS, prohibits translating the name field of an OPTION_DOC option. The latter is already used by tar.

07 Oct 2004 (permalink)

Dictionary structure

I have started to rewrite the dictionaries in the Ellinika project. The idea is to use XML instead of the definition language I have been using so far. The latter was designed to be simple, short (I hate to type) and suitable for describing dictionary entries. Its parser is written in C and is therefore fairly fast. At present the dictionary created with it contains 1150 entries, and the process of creating it has confirmed that the dictionary structure is right and that the input language is generally suitable for the purpose. However, it has also exposed some drawbacks of the language.

The principal drawback is that a dictionary entry is currently supposed to contain only one part of speech 1), i.e. the currently assumed entry structure is:

 (key part-of-speech articles)

However, there are words that belong to several parts of speech at once and change their meaning according to the part of speech. For example, κρυώνω, when used as a transitive verb, means to refrigerate, whereas as an intransitive verb it means to feel cold, to freeze. The number of such words (verbs in particular) in Greek is fairly large.

So I have decided to redesign the input language, but instead of simply patching the existing language, I have chosen to rewrite the dictionary sources entirely in XML.

External representation

In the new definition language each entry is represented as follows:

<NODE>
 <K>string</K>+
 [<F>string</F>]
 <P ID="string">
  <M>string</M>+
  <A>string</A>*
  <X>string</X>*
  <T ID="string" />*
  <X>string</X>*
 </P>+
 <X>string</X>*
</NODE>

(As usual, optional elements are enclosed in brackets, * means zero or more occurrences of the element, and + means one or more occurrences of the element.)

Elements have the following meaning:

NODE
Starts the definition of a dictionary entry.
K
Introduces the dictionary key, i.e. the word of the source language that is explained by this entry. There may be several keys if the notion in question has several synonyms.
F
Introduces grammatical forms of the key, whenever these are not formed by standard rules. In the future I expect to write a proper verb conjugator; this field will then probably hold a reference to it or an invocation of it.
P
Part of speech and meanings associated with it (see below). Attribute ID introduces the name (usually abbreviated) of the part of speech.
M
Translation of the word (M stands for Meaning)
A
Antonym
X
Cross-reference for this entry. Usually this is a reference to a synonym or to some semantically related key.
T
Topic or group this entry pertains to. ID identifies the topic. When many entries pertain to the same topic, their definitions can be enclosed in a
<T ID="name">
...
</T>
construct.
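
To make the structure concrete, here is a minimal sketch in Python (not the project's actual Scheme translator) that reads one full-form entry for κρυώνω and collects the meanings per part of speech. The part-of-speech IDs `vt` and `vi` and the helper name `parse_entry` are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical full-form entry; the IDs "vt" and "vi" are assumed abbreviations.
SAMPLE = """<NODE>
 <K>κρυώνω</K>
 <P ID="vt">
  <M>to refrigerate</M>
 </P>
 <P ID="vi">
  <M>to feel cold</M>
  <M>to freeze</M>
 </P>
</NODE>"""


def parse_entry(xml_text):
    """Return (keys, {part-of-speech ID: [meanings]}) for one full-form NODE."""
    node = ET.fromstring(xml_text)
    keys = [k.text for k in node.findall("K")]
    senses = {}
    for p in node.findall("P"):
        # Each P element groups the meanings (M) valid for one part of speech.
        senses[p.get("ID")] = [m.text for m in p.findall("M")]
    return keys, senses


keys, senses = parse_entry(SAMPLE)
print(keys, senses)
```

Keeping the meanings under their own P element is what lets the transitive and intransitive senses of the same key stay apart.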

There are two special forms of this notation. One is a shortcut for words that have only one part of speech (as I said, I hate to type, so I'm trying to save as much typing as possible):

<NODE>
 <K>string</K>+
 [<F>string</F>]
 <P>string</P>
 <M>string</M>*
 <A>string</A>*
 <X>string</X>*
 <T ID="string" />*
</NODE>
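
A reader of the dictionary can fold the shortcut form into the same shape as the full form. The following Python sketch shows one way to do that; the helper `normalize`, the abbreviation `n` and the sample word are assumptions for illustration, not taken from the dictionary sources.

```python
import xml.etree.ElementTree as ET


def normalize(node):
    """Return {part-of-speech: {"M": [...], "A": [...], "X": [...]}} for either form."""
    result = {}
    for p in node.findall("P"):
        if p.get("ID") is not None:
            # Full form: the name is in the ID attribute, details are children of P.
            name, holder = p.get("ID"), p
        else:
            # Shortcut form: P holds the name as text, details are siblings of P.
            name, holder = (p.text or "").strip(), node
        result[name] = {tag: [e.text for e in holder.findall(tag)]
                        for tag in ("M", "A", "X")}
    return result


# The same hypothetical entry written in both notations.
short = ET.fromstring("<NODE><K>το νερό</K><P>n</P><M>water</M></NODE>")
full = ET.fromstring('<NODE><K>το νερό</K><P ID="n"><M>water</M></P></NODE>')
print(normalize(short) == normalize(full))
```

With this approach the rest of a program only ever sees one representation, whichever notation was typed.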

The other introduces an entry that is a reference to another entry in the dictionary:

<NODE>
 <K>string</K>+
 [<F>string</F>]
 <P>string</P>
 <X>string</X>*
 <T ID="string" />*
</NODE>

This is useful for such pairs as "ο ποταμός" and "το ποτάμι", both meaning "river" but having different genders. The special form <X /> means a reference to the immediately preceding node definition which has at least one translation (M element).
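
The rule for the empty <X /> form can be sketched as a single pass that remembers the last node with a translation. This is a minimal Python illustration, not the actual translator; the wrapper element `DICT`, the abbreviations `m` and `n`, and the helper `resolve_empty_refs` are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-entry fragment; DICT, "m" and "n" are assumed names.
SAMPLE = """<DICT>
<NODE><K>ο ποταμός</K><P>m</P><M>river</M></NODE>
<NODE><K>το ποτάμι</K><P>n</P><X /></NODE>
</DICT>"""


def resolve_empty_refs(root):
    """Map each key of a node containing <X /> to the keys it refers to."""
    refs = {}
    last_translated = None  # most recent NODE with at least one M element
    for node in root.iter("NODE"):
        for x in node.iter("X"):
            # An X element without text is the special form <X />.
            if not (x.text or "").strip() and last_translated is not None:
                target = [k.text for k in last_translated.findall("K")]
                for k in node.findall("K"):
                    refs[k.text] = target
        # Update only after resolving, so a node never refers to itself.
        if node.find(".//M") is not None:
            last_translated = node
    return refs


refs = resolve_empty_refs(ET.fromstring(SAMPLE))
print(refs)
```

Nodes without any M element are skipped as targets, so a run of several reference-only entries all point back to the same translated node.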

Examples

A working example can be found here.

Internal representation

The initial version of the dictionary translator is already available. See its heading comment for a short description of the internal representation (in Scheme, of course).

1) The term part of speech is used here in a broader sense: for dictionary purposes, transitive and intransitive verbs are regarded as different parts of speech. I may have to introduce finer granularity here (e.g. part-of-speech/subpart or something similar), but at present I am not sure it would be worth the effort.

08 Oct 2004 (permalink)
Do not seek death; death will find you.
But seek the road which makes death a fulfillment.
-- Dag Hammarskjold
10 Oct 2004 (permalink)
New photos: from Kyiv and from Odessa.
11 Oct 2004 (permalink)
I have merged guile-sql and guile-gettext into a single project: gamma.
12 Oct 2004 (permalink)

Dictionary support is ready

The dictionary of the Ellinika site has been completely rewritten in XML (see this entry for more info). The project now uses gamma instead of guile-sql + guile-texinfo.

17 Oct 2004 (permalink)
Following Wojtek's suggestion, I have added back and forward navigation buttons to the gallery. In the html file (sic!) it looks like this:
<hr>

<center>
[
<?guile
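(let ((env (current-image-neighbors)))
  ;; Judging by its use here, env is a pair (previous . next);
  ;; either half is false when there is no such image.
  (if (car env)
      (begin
        ;; Link to the previous image; the label means "Previous photo".
        (display "<a href=\"")
        (display (image-url (gallery-id) (car env)))
        (display "\">")
        (display "Poprzednie zdjęcie")
        (display "</a> | ")))

  (if (cdr env)
      (begin
        ;; Link to the next image; the label means "Next photo".
        (display "<a href=\"")
        (display (image-url (gallery-id) (cdr env)))
        (display "\">")
        (display "Następne zdjęcie")
        (display "</a> | "))))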
?>
It's clear by now what I'm driving at, isn't it? Exactly: I want to modify mod_guile so that Apache understands <?guile ... ?> on a par with <?php ... ?> :))