- Eng2Jpn - 97-002
- ----------------
-
- WHAT IS IT?
- -----------
-
- Eng2Jpn is a conversion/modification of Professor Jim Breen's EDICT_S
- dictionary file for Scott Powell's "Dictionary" program which is
- available for the US Robotics Palm Pilot PDA.
-
-
-
- The original EDICT_S and master EDICT file are available from the main
- web site:
-
- ftp://ftp.cc.monash.edu.au/pub/nihongo/
-
- or any of the mirror sites:
-
- ftp://ftp.cdrom.com/pub/japanese/monash/ US (California)
- ftp://kuso.shef.ac.uk/pub/japanese/monash/ UK
- ftp://enterprise.ic.gc.ca/pub/nihongo/ Canada
- ftp://ftp.sedl.org/pub/mirrors/nihongo/ US (Texas)
- ftp://ftp.uwtc.washington.edu/pub/Japanese/Monash/ US (Washington)
- ftp://ftp.xmission.com/pub/users/s/snowhare/nihongo/monash/ US (Utah)
- ftp://ftp.u-aizu.ac.jp/pub/SciEng/nihongo/ftp.cc.monash.edu.au/ Japan
- ftp://ftp.funet.fi/pub/culture/japan/mirrors/monash/ Finland
- ftp://ftp.uni-duisburg.de
-
-
-
- Scott Powell's "Dictionary" software is available as time-limited
- shareware from:
-
- http://www.kagi.com/scottpowell/
-
- **The dictionary converter program is also available from this
- address.
-
-
-
-
- THE FILES
- ---------
-
- Eng2Jpn.pdb - The actual dictionary file for the Pilot
- Eng2Jpn.vox - The original unconverted dictionary in text format
- Eng2Jpn.txt - This readme file
-
-
- WHY?
- ----
-
- Well, the original E2J dictionary distributed with the "Dictionary"
- program for the Pilot has a number of problems. First off, there is
- the size. At around 1,600 words, it's a pretty small dictionary, and
- doesn't cover a lot of what might be needed by those studying Japanese
- or wanting to look up words which might be a little more obscure.
-
- Secondly, the dictionary uses a rather odd format for some of the
- romanizations of the Japanese language. For example, in the E2J
- dictionary, FATHER is translated as "titi". A better romanization,
- and the Hepburn-style form used here, is "chichi".
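
To make the difference concrete: the "titi" spelling looks like
Kunrei-style romanization, while "chichi" is the Hepburn form. The
sketch below is only an illustration of how the two spellings relate,
not the method used to build this dictionary. The function name
to_hepburn and the mapping table are made up for this example, and it
covers only the most common syllables (no long vowels, no "di"/"du").

```python
import re

# Hypothetical table: Kunrei-style spellings and their Hepburn equivalents.
# Only a handful of common cases; an assumption-laden sketch, not a full converter.
KUNREI_TO_HEPBURN = {
    "sy": "sh", "ty": "ch", "zy": "j",      # sya -> sha, tya -> cha, zya -> ja
    "si": "shi", "ti": "chi", "tu": "tsu",
    "hu": "fu", "zi": "ji",
}

# One pass over the word, so replaced text is never re-scanned.
# The lookbehind keeps "shu"/"chu" in already-Hepburn input untouched.
_PATTERN = re.compile(r"sy|ty|zy|si|ti|tu|zi|(?<![sc])hu")


def to_hepburn(word):
    """Rewrite a Kunrei-style romanized word in Hepburn style."""
    return _PATTERN.sub(lambda m: KUNREI_TO_HEPBURN[m.group(0)], word.lower())


print(to_hepburn("titi"))    # chichi
print(to_hepburn("syasin"))  # shashin
```

Words that are already in Hepburn form, such as "chichi", pass through
unchanged.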
-
-
- SOURCE
- ------
- The EDICT master dictionary file was/is compiled by Professor Jim
- Breen of Monash University in Australia. Currently at version 97-004,
- it weighs in at around 61,000 words. It's one of the largest and
- most complete English/Japanese dictionaries available, and is highly
- recommended for people interested in the Japanese language.
-
- Unfortunately, at 61,000 words and approximately 2.7 MB, it's a little
- too large to fit into the Pilot. For this reason, I used the EDICT_S
- dictionary, which lands at around 10,600 words.
-
- The original EDICT file contains high-code Japanese characters
- (Kanji/Katakana/Hiragana), which are encoded in one of the standard
- 2-byte character sets such as S-JIS.
-
- The EDICT_S file from which this Pilot dictionary was created has
- been romanized to remove the high characters, which aren't compatible
- with the "Dictionary" software.
-
- Here's the original documentation file from the EDICT_S dictionary.
-
- EDICT_S
- -------
-
- The EDICT_S file consists of a selection
- of entries from the V97-002 edition of
- the EDICT file. These entries have had
- all fields containing kanji removed,
- and all the readings have been
- converted to Hepburn romaji. Note that
- the romaji is in fact "wa-puro-"
- romaji, i.e. it differs slightly from
- the usual Hepburn form as follows:
-
- See the "edict_r.doc" file for
- further information.
-
- The selection of entries in the
- EDICT_S file has been made simply
- by only including those entries
- found in the smaller "JDDICT" file.
-
- jwb@dgs.monash.edu.au
- April/May 1997
-
-
- Some further modifications were necessary in order to fit the
- dictionary into a reasonable amount of space in the Pilot, and to keep
- it from blowing up the conversion program. These changes were:
-
- -Sorted alphabetically
- -Truncated to FIRST meaning of a given word where multiple meanings
- were present
- -Multiples were erased (where a word appears multiple times in
- Japanese, with different English translations)
- -Japanese words longer than 9 characters were removed.
- -Some additional (bracketed) information on words was removed.
-
- These modifications had the net effect of trimming the dictionary's
- size from 10,600 words (362k), down to a more manageable 8,100 words
- (145k) - allowing it to fit into the Pilot, and still be compatible
- with the "Dictionary" software.
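
The trimming steps listed above can be sketched roughly as follows.
This is a reconstruction for illustration only, not the actual script
used for this release. The input layout (pairs of a romanized Japanese
word and a list of English meanings), the function name trim_entries,
and the choice to sort by the English word are all assumptions, and
the sample entries are made up for the example.

```python
import re

MAX_JAPANESE_LEN = 9  # longer Japanese words are dropped, per the list above


def trim_entries(entries):
    """Trim (japanese, [english meanings]) pairs down to (english, japanese).

    Assumed input layout; the real EDICT_S line format is not parsed here.
    """
    seen = set()
    trimmed = []
    for japanese, meanings in entries:
        if len(japanese) > MAX_JAPANESE_LEN:
            continue  # too long for the Pilot dictionary
        if japanese in seen or not meanings:
            continue  # keep only the first appearance of a Japanese word
        seen.add(japanese)
        # Keep the FIRST meaning only and strip (bracketed) notes from it.
        english = re.sub(r"\s*\([^)]*\)", "", meanings[0]).strip()
        if english:
            trimmed.append((english, japanese))
    # Sort alphabetically (here: by the English word, ignoring case).
    return sorted(trimmed, key=lambda pair: pair[0].lower())


sample = [
    ("hashi", ["bridge"]),
    ("chichi", ["father (humble)", "dad"]),
    ("hashi", ["chopsticks"]),               # repeated Japanese word, dropped
    ("arigatougozaimasu", ["thank you"]),    # longer than 9 characters, dropped
]
print(trim_entries(sample))
# [('bridge', 'hashi'), ('father', 'chichi')]
```

Each step throws information away, which is why the trimmed file is so
much smaller than the EDICT_S file it started from.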
-
-
- Version Notes:
-
- 1.0 - Initial release
- ---
- Dictionary at 7,417 words
-
-
- 1.1 - Update release
- ---
- With a new (more robust) conversion program from the author of the
- "Dictionary" software, I was able to expand the dictionary slightly,
- cutting words longer than 9 letters. It looks like the "Dictionary"
- program probably maxes out the index at around 16,384 entries. My
- release uses 16,291.
-
- Cutting it rather close ;)
-
- Dictionary now at 8,116 words (about 9.4% bigger)
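
For anyone curious how close "rather close" is, the figures in these
notes can be checked with a few lines of arithmetic. The variable names
are made up for this example, and the 16,384 limit is only the apparent
limit guessed at above, not a documented one.

```python
INDEX_LIMIT = 16_384   # apparent index limit of the "Dictionary" program (2**14)
INDEX_USED = 16_291    # index entries used by this release
WORDS_V1_0 = 7_417     # word count of release 1.0
WORDS_V1_1 = 8_116     # word count of release 1.1

headroom = INDEX_LIMIT - INDEX_USED
growth_percent = (WORDS_V1_1 - WORDS_V1_0) / WORDS_V1_0 * 100

print(headroom)                  # 93
print(round(growth_percent, 1))  # 9.4
```

So only 93 index entries are left over, and the growth from 1.0 to 1.1
comes out near the percentage quoted above.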
-
- Also included is the output "longwords" file, listing what didn't make
- it into the file due to word-length.
-
-
-
- David S. Griffiths
- dgriff@direct.ca
-