The French version of the CeDict dictionary: programs and generated JSON, XML and CSV files.

Eric Streit 938c8142e4 writing the programs, generating the files and the readme 3 weeks ago

CFDICT

This is the French counterpart of the CeDict Chinese-English dictionary.

The original dictionary is distributed as an XML file.

I transformed the original file into new XML, CSV and JSON files, using my own format, which is similar across my projects.

I first transformed the original XML into JSON with a command-line utility, keeping the same structure as the original. I then ran my programs on that JSON to get the formats I wanted.

Original XML format:

<word>
    <id>30702</id>
    <upd>0</upd>
    <trad>〇</trad>
    <simp>〇</simp>
    <py>ling2</py>
    <trans>
        <fr><![CDATA[0]]></fr>
        <fr><![CDATA[zéro]]></fr>
    </trans>
</word>

Corresponding json:

{
        "id": [
          "30702"
        ],
        "upd": [
          "0"
        ],
        "trad": [
          "〇"
        ],
        "simp": [
          "〇"
        ],
        "py": [
          "ling2"
        ],
        "trans": [
          {
            "fr": [
              "0",
              "zéro"
            ]
          }
        ]
  }
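
The command-line utility used for this step is not named here, so the following is only a minimal Python sketch of the same mapping, assuming every child element is collected into a list under its tag name (which is what the JSON above suggests). The names `SAMPLE` and `element_to_dict` are hypothetical and are not part of the repository's programs.

```python
import json
import xml.etree.ElementTree as ET

# The sample record from the original XML file, copied from above.
SAMPLE = """<word>
    <id>30702</id>
    <upd>0</upd>
    <trad>〇</trad>
    <simp>〇</simp>
    <py>ling2</py>
    <trans>
        <fr><![CDATA[0]]></fr>
        <fr><![CDATA[zéro]]></fr>
    </trans>
</word>"""


def element_to_dict(element):
    """Group the children of `element` by tag; every value is a list."""
    result = {}
    for child in element:
        if len(child):
            # The child has sub-elements (here: <trans>), so recurse.
            value = element_to_dict(child)
        else:
            # Leaf element: keep its text (CDATA is read as plain text).
            value = child.text or ""
        result.setdefault(child.tag, []).append(value)
    return result


record = element_to_dict(ET.fromstring(SAMPLE))
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Run on the sample record, this should print the same structure as the JSON shown above, with `trans` holding one object whose `fr` list contains both translations.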

After passing through my programs:

  • CSV: one line

    㱩 殰 dú avortement / mort-né CFdict

  • xml: one record

    <ligne>
      <hanzi>〇</hanzi>
      <traditional>〇</traditional>
      <pinyin>líng</pinyin>
      <translations>0</translations>
      <translations>zéro</translations>
      <origin>CFdict</origin>
    </ligne>
    

  • json: the beginning and the first record

    "CFdict": [

      {
        "hanzi": "〇",
        "traditional": "〇",
        "pinyin": "líng",
        "translations": [
          "0",
          "zéro"
        ],
        "origin": "CFdict"
      }, etc ....
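
The conversion from the intermediate JSON to the final record can be sketched as follows. This is an illustration under stated assumptions, not the code in the Programmes directory: the function names are hypothetical, `hanzi` is assumed to come from `simp` and `traditional` from `trad` (as the CSV sample suggests), and the numbered pinyin is converted with the usual tone-mark placement rule (mark `a` or `e` if present, else the `o` of `ou`, else the last vowel).

```python
import json
import re

# Tone-marked forms of each vowel, indexed by tone number minus one.
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}


def syllable_to_marked(syllable):
    """Convert one numbered syllable such as 'ling2' to 'líng'."""
    match = re.fullmatch(r"([A-Za-zü:]+)([1-5])", syllable)
    if not match:
        return syllable  # not a numbered syllable: leave it untouched
    letters, tone = match.group(1), int(match.group(2))
    # Assumption: 'u:' or 'v' may stand for 'ü' in the source data.
    letters = letters.replace("u:", "ü").replace("v", "ü")
    if tone == 5:
        return letters  # neutral tone carries no mark
    lower = letters.lower()
    if "a" in lower:
        index = lower.index("a")
    elif "e" in lower:
        index = lower.index("e")
    elif "ou" in lower:
        index = lower.index("o")
    else:
        positions = [i for i, c in enumerate(lower) if c in "iouü"]
        if not positions:
            return letters
        index = positions[-1]
    marked = TONE_MARKS[lower[index]][tone - 1]
    if letters[index].isupper():
        marked = marked.upper()
    return letters[:index] + marked + letters[index + 1:]


def numbered_to_marked(pinyin):
    """Convert a space-separated numbered pinyin string."""
    return " ".join(syllable_to_marked(s) for s in pinyin.split())


def to_final_record(raw):
    """Build one final record from one intermediate JSON record."""
    translations = [fr for group in raw["trans"] for fr in group["fr"]]
    return {
        "hanzi": raw["simp"][0],         # assumed mapping: simp -> hanzi
        "traditional": raw["trad"][0],   # assumed mapping: trad -> traditional
        "pinyin": numbered_to_marked(raw["py"][0]),
        "translations": translations,
        "origin": "CFdict",
    }


raw_record = {
    "id": ["30702"], "upd": ["0"], "trad": ["〇"], "simp": ["〇"],
    "py": ["ling2"], "trans": [{"fr": ["0", "zéro"]}],
}
final_record = to_final_record(raw_record)
print(json.dumps({"CFdict": [final_record]}, ensure_ascii=False, indent=2))
```

The same dictionary can then be written out as a CSV line or an XML record; the field separator used in the CSV file is not stated here, so it is left out of the sketch.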