E D I C T J
===========
Public Domain Japanese/English Dictionary file, coordinated by
Jim Breen.
CURRENT VERSION
---------------
The version date is now included in the dictionary under the
entry "aaaa" in katakana. (This keeps it as the first entry when
it is sorted.)
The master copy of EDICTJ is in the pub/Nihongo directory of
monu6.cc.monash.edu.au. There are other copies around, but they
may not be as up-to-date.
INTRODUCTION
------------
EDICTJ is an attempt to produce a public domain Japanese/English
Dictionary in machine-readable form. It was intended initially
for use with MOKE (Mark's Own Kanji Editor) and related software
such as JDIC; however, it has the potential to be used in a large
number of packages.
At present it is in the "public domain"; however, consideration
is being given to placing it under GNU or Copyleft protection,
mainly to prevent the work of its many contributors being
exploited by commercial software developers.
FORMAT
------
EDICTJ is in the "EDICT" format used by MOKE, which is based on
the structure of the hankan files in the Wnn project. It uses
EUC coding for kana and kanji; however, this can be converted to
JIS or SJIS by any of the several conversion programs around. It
consists of an ASCII file with one entry per line. The format of
entries is:
KANJI [KANA] /english_1/english_2/.../
or
KANA /english_1/.../
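As an illustration only, an entry line in the format above can be
split into its parts with a few lines of Python. This is a minimal
sketch, not the code used by MOKE or JDIC: the names Entry and
parse_entry are hypothetical, and it assumes that the headword
contains no spaces and that no meaning contains a "/" character.

```python
import re
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    headword: str            # kanji form, or kana if the entry has no kanji
    reading: Optional[str]   # kana reading from the [brackets], if present
    glosses: List[str]       # the English meanings between the slashes


# headword, optional [reading], then /english_1/english_2/.../
_ENTRY_RE = re.compile(r"^(\S+)\s+(?:\[([^\]]*)\]\s+)?/(.*)/\s*$")


def parse_entry(line: str) -> Entry:
    """Split one dictionary line into headword, reading and glosses."""
    match = _ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"not an EDICT-style entry: {line!r}")
    headword, reading, body = match.groups()
    # Drop empty strings produced by doubled or trailing slashes.
    glosses = [gloss for gloss in body.split("/") if gloss]
    return Entry(headword, reading, glosses)
```

A file in EUC coding can be read with Python's "euc_jp" codec, for
example open("EDICTJ", encoding="euc_jp"), and each line passed to
parse_entry.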
CONTENTS
--------
EDICTJ consists of:
(a) the basic EDICT distributed with MOKE 2.0. This was
compiled by MOKE's author, Mark Edwards, with assistance from
Spencer Green. Mark has very kindly released this material to
the public domain as part of EDICTJ. A number of corrections
have been made to the MOKE original, e.g. spelling mistakes,
minor mistranslations, etc. It also had a lot of duplications,
which have been removed. It contained about 1900 unique
entries.
(b) additions by Jim Breen. Apart from a number of additions
made during normal MOKE usage (e.g. using it to read fj.* news)
I laboriously keyed in a ~2000 entry dictionary used in my first
year nihongo course years ago. I have since worked through other
vocabulary lists and dictionaries trying to make sure major
entries were not omitted. [All this was terrific revision.] This
task is continuing, although it has slowed down, and I suspect I
will run out of energy well before EDICTJ reaches the 6000 entry
mark.
(c) additions by others. Many people have contributed entries
and corrections to EDICTJ. A full list is at the back of this
file.
At 5000+ entries, EDICTJ is nowhere near as big as a good commercial
dictionary, which typically has 20,000+ entries with examples,
etc. It is, however, bigger than some of the smaller
dictionaries, and when used in conjunction with a
search-and-display program like JDIC it provides an effective
on-line dictionary service.
COPYRIGHT?
----------
A word on copyright. Of course most of the material in EDICTJ
came from other published lists. Dictionary copyright is a
difficult point, because clearly the first lexicographer who
published "inu means dog" could not claim a copyright violation
over all subsequent Japanese dictionaries. What makes each
dictionary unique (and copyrightable) is the particular
selection of words, the phrasing of the meanings, the
presentation of the contents (a very important point in the case
of EDICTJ), and the means of publication. The advice I have
received from people who know about these things is that EDICTJ
is just as much a new dictionary as any others on the market.
Readers may see an entry which looks familiar, and say "Aha!
That comes from the XYZ Jiten!". They may be right, and they may
be wrong. After all there aren't too many translations of neko.
Let me make one thing quite clear. NONE of this dictionary came
from commercial machine-readable dictionaries. I have a case of
RSI in my right elbow to prove it.
LEXICOGRAPHICAL DETAILS
-----------------------
EDICTJ is actually a Japanese->English dictionary, although the
words within it can be selected in either language using
appropriate software. (JDIC uses it to provide both E->J and
J->E functionality.)
The limitations on size inherent in the dictionary due to its
current usage (MOKE scans it sequentially and JDIC needs to hold
it in RAM) have meant that examples of usage cannot be included,
and inclusion of phrases is very limited.
No inflections of verbs or adjectives have been included, except
in idiomatic expressions. Similarly particles are handled as
separate entries. Adverbs formed from adjectives (-ku or -ni) are
not included. Verbs are, of course, in the plain or
"dictionary" form.
In working on EDICTJ, bearing in mind I want to use it in MOKE
and with JDIC, I have had to come up with a solution to the
problem of adjectival nouns [keiyoudoushi] (e.g. kirei and
kantan) and verbs formed by adding suru (e.g. benkyousuru). If I
put entries in edict with the "na" and "suru" included, MOKE
will not find a match when they are omitted or, in the case of
suru, inflected. What I have decided to do is to put the basic
noun into the dictionary and add "(vs)" where it can be used to
form a verb with suru, and "(an)" if it is an adjectival noun.
Entries appear as:
KANJI [benkyou] /study (vs)/
KANJI [kantan] /simple (an)/
Where necessary, verbs are marked with "(vi)" or "(vt)"
according to whether they are intransitive or transitive. (Work
on this aspect is continuing.) I have also used (id) to mark
idiomatic expressions, (col) for colloquialisms, (pol) for
teineigo, etc.
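To show how the markers can be put to work, here is a minimal
sketch in Python of a lookup that falls back to the basic noun
when a word ends in "suru" or "na". It is not how MOKE or JDIC
actually search. The names split_markers and lookup, the
romanised keys, and the dictionary-of-lists index are assumptions
made for the example. Only the plain form "suru" is handled, not
its inflections, and only the markers named above are recognised.

```python
import re

# Marker codes named in the text; the real file may contain others.
KNOWN_MARKERS = {"vs", "an", "vi", "vt", "id", "col", "pol"}
_MARKER_RE = re.compile(r"\(([a-z]+)\)")


def split_markers(gloss):
    """Return (plain_text, markers) for a gloss such as 'study (vs)'."""
    markers = [m for m in _MARKER_RE.findall(gloss) if m in KNOWN_MARKERS]
    # Remove only recognised markers; keep other parenthesised text.
    text = _MARKER_RE.sub(
        lambda m: "" if m.group(1) in KNOWN_MARKERS else m.group(0), gloss
    )
    return " ".join(text.split()), markers


def lookup(word, index):
    """Look up word; if absent, retry without 'suru' or 'na'.

    index maps a (hypothetical, romanised) reading to its glosses.
    """
    if word in index:
        return index[word]
    for suffix, marker in (("suru", "vs"), ("na", "an")):
        if word.endswith(suffix):
            base = word[: -len(suffix)]
            hits = [
                gloss
                for gloss in index.get(base, [])
                if marker in split_markers(gloss)[1]
            ]
            if hits:
                return hits
    return []
```

With index = {"benkyou": ["study (vs)"]}, lookup("benkyousuru",
index) finds the entry for benkyou because it carries "(vs)".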
USAGE
-----
EDICTJ can be used as the dictionary within MOKE simply by
renaming it "EDICT". If you are a MOKE user and have been adding
to your EDICT using the "Ask English?" option, you may wish to
append your additions. Why not send them to me and I will add
them to EDICTJ?
EDICTJ can be used, with acknowledgement, for any purpose
whatever, EXCEPT for inclusion in new commercial products. Mark
Edwards can, of course, use it in later MOKE releases. Stephen
Chung may also be using it in his PD "JWP".
CONTRIBUTIONS
-------------
I will be delighted if people send me corrections, suggestions,
and ESPECIALLY additions. Before ripping in with a lot of
suggestions, make sure you have the latest version, as others
may have already made the same comments.
The preferred format for submissions is a JIS or EUC file
(uuencoded for safety) containing replacement/new entries.
Separate the amendments from the new material: e.g.
**Amendments to EDICTJ yyyymmmdd**
old entry1
new entry1
old entry2
........
**New Entries**
New entry1
New entry2
.........
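For anyone assembling a submission by program, the layout above
could be produced along the following lines. This is a sketch
only: format_submission is a hypothetical helper, the version
string is whatever date your copy of EDICTJ carries, and the
encoding and uuencoding steps are left out.

```python
def format_submission(version, amendments, new_entries):
    """Build the text of a submission in the layout shown above.

    version      -- version date of the EDICTJ copy being amended
    amendments   -- list of (old_entry, new_entry) pairs
    new_entries  -- list of new entry lines
    """
    lines = [f"**Amendments to EDICTJ {version}**"]
    for old_entry, new_entry in amendments:
        # Each amendment is the old line followed by its replacement.
        lines.append(old_entry)
        lines.append(new_entry)
    lines.append("**New Entries**")
    lines.extend(new_entries)
    return "\n".join(lines) + "\n"
```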
I prefer not to get a "diff" or "patch" file as the master
EDICTJ may have had quite a few changes since you got your copy.
ACKNOWLEDGEMENTS
----------------
Mark Edwards, Spencer Green, Alina Skoutarides, Takako Machida,
Theresa Martin, Satoshi Tadokoro, Stephen Chung, Hidekazu
Tozaki, Clifford Olling.
Jim Breen
(jwb@monu6.cc.monash.edu.au)
Department of Robotics & Digital Technology
Monash University
Caulfield East 3145
AUSTRALIA
18 October 1991 (approx 5440 entries, 192kbytes)