jEdit Community - Resources for users of the jEdit Text Editor
creating a user language dictionary in jEdit
Submitted by pat4kin on Tuesday, 23 November, 2010 - 10:43
Hallo!
I have been in search of a text editor for work on a desktop running MS-Win-XP and a notebook running Linux Ubuntu.
Having installed jEdit on the XP system I am impressed with the functionality.
I am building a translation model to translate my English texts into German and need the ability to run my dictionary on the jEdit platform. The dictionary is based on a csv file structure.
Can anyone recommend a way to get a dictionary running with jEdit?
I would like to make use of the create text tags function on jEdit.
I would then use the same tags with the dictionary entries.
Would BeanShell be a way?
What would, for a non-programmer, be the best way to create a simple dictionary look-up script?
For any good ideas I would be very grateful.
Thanks and regards, pat4kin.
How to translate
by patchworker on Tue, 23/11/2010 - 19:44
Hello pat4kin,

how would you like to do the translation? Can you give an example of a text and describe the translation process you need, so that you can translate fast?

Without your answer it's not possible to tell what the best method is.

Greets!
Daniel
 
Your question on, "How I translate?"
by pat4kin on Wed, 24/11/2010 - 16:35
Hallo!
I have built a model using jEdit tags, which describes both an English sentence and its German word order equivalent.
The sentence is segmented into phrases and the dictionary supports the translation of the words in a phrase.
Because English has no scheme for defining the roles of words, I have built a tagging scheme which does just that.
The dictionary entries use the same tags.
So as an example the subject phrase "The little boy" in the sentence "The little boy has played football all afternoon", is tagged with "word roles".
The dictionary translates the phrase into "Der kleine Junge".
Does that suffice to answer your question? I have not given you more detail because I will not provide it before publication.
Regards, pat4kin.
 
?
by Robert Schwenn on Wed, 24/11/2010 - 20:16
You're talking about "run my dictionary" and "translation model". What exactly are those? Is Your "dictionary" a standalone program which can be run outside jEdit? If yes, how is it invoked, how does it get its input and where is the output produced? If no, what else is it?

Those are the kinds of questions that matter.

Robert
 
your post on 24.11 and my reply on 27.11
by pat4kin on Mon, 29/11/2010 - 12:53
Hallo Robert Schwenn!
Did my response answer your questions?
If not what additionally do you need from me?
I intend to publish a first version as an exe eBook by the end of 2010.
This version will address the needs of those at the starting point of translation and with an English vocabulary of fewer than 1,000 words.
Additional versions will follow in the first quarter of 2011.
The next version will address English and German language users with vocabularies in each language in excess of 2,000 to 3,000 words.
I will extend the eBook set with a version for translating faulty English constructions into more correct English.
Since what I am doing is unusual, I'd be happy to have contact with any members experienced in this topic area.
Regards, pat4kin.
 
Options
by Robert Schwenn on Mon, 29/11/2010 - 18:54
First, I have no experience with this. I just wanted to help You find an approach from a jEdit point of view. Since I'm not really sure how You want to see Your dictionary working inside jEdit, here are some general options:

1. External program
1a) You can manually run an external program in the Console plugin's dockable and optionally redirect the output of that program into a new jEdit buffer (see jEdit Help -> Plugins -> Console -> Chapter 3).

1b) This could be streamlined with a "Commando" (see jEdit Help -> Plugins -> Console -> Chapter 6).

1c) If 1b) weren't good enough, You could customize invoking the external program and processing its output via a BeanShell macro (see jEdit Help -> Plugins -> Console -> Chapter 8).


If You want to bypass Your external program and instead implement the whole functionality inside jEdit, there are two options:

2. Beanshell macro
If it is sufficient to invoke some kind of action now and then, one which doesn't have to run permanently in the background, a BeanShell macro is perfect. Of course this means programming in Java (see http://www.jedit.org/users-guide/writing-macros-part.html ). jEdit comes bundled with some macros, which are good examples. Also, there are contributed macros: http://community.jedit.org/?q=filestore/browse/25

3. If You want jEdit to listen in the background to what the user is doing (maybe for real-time translating), You definitely need to develop a plugin. But before You begin You should look at what other people have done already. There are existing spelling plugins (see http://plugins.jedit.org/ or Plugin Manager).
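
Option 1 above (run an external program, hand it some text, collect its output) can be sketched outside jEdit in a few lines. This is only an illustration in Python, not a jEdit or Console plugin API: `run_lookup` is a hypothetical helper name, and the "external program" is a stand-in one-liner that upper-cases its input, because the thread does not say which program would do the real look-up.

```python
import subprocess
import sys

def run_lookup(command, text):
    """Run an external program, send `text` on stdin and return what it prints."""
    completed = subprocess.run(
        command,
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise an error if the program exits with a failure code
    )
    return completed.stdout

# Stand-in for a real dictionary program (assumption): upper-cases its input.
STAND_IN = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

output = run_lookup(STAND_IN, "the little boy")
print(output)
```

A Console "Commando" or a BeanShell macro would play the role of `run_lookup` inside jEdit, with the selected text as input and a new buffer as the destination for the output.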

Robert
 
Options - Robert Schwenn
by pat4kin on Tue, 07/12/2010 - 11:21
Hallo Robert!
I have decided to do the following:
1. Publish as a PDF eBook in December as a R.1.
It will use neither the translation dictionary nor the word role tags.
2. Spend the next two months trying to understand how to publish R.2.
For this release I will probably use HTML.exe and have a dictionary embedded.
Let me make a comment on BeanShell. It is almost impossible to decipher if one is not a programmer.
I will get back when I am a little nearer my goal. Regards, pat4kin.
 
Your very comprehensive comments on my posted needs.
by pat4kin on Wed, 01/12/2010 - 16:32
Hallo Robert.
I thank you for a very broad response.
I will go through it, deliberate on how to reply and then get back to you. Thanks again. Regards, pat4kin.
 
my dictionary
by pat4kin on Sat, 27/11/2010 - 07:53
Hallo!
In reply to the questions you asked.
1. Is the dictionary a separate product?
1.1. Today it is a csv file.
1.2. I can load it into LingoPad and run it as a user dictionary.
2. How does it relate to the translation model?
2.1. Today I can call it up from the bottom of the screen and use it to translate each word individually.
2.2. I can gather the translated phrases in a csv file and add them to the dictionary periodically.
3. Where does the dictionary come from?
3.1. I built it myself since no dictionary on the market can do what I needed to be able to do.
Ideally I want to be able to translate a phrase using the tags enclosing the phrase, since the tags apply to each word in the phrase.
I thus need four elements.
1. A dictionary file to carry my dictionary and
2. A look-up program customisable to translate each word in turn in the phrase automatically.
3. The ability to add it to jEdit as a user dictionary.
4. A way to call the dictionary from an eBook that I will release. It is likely to be an HTML.exe eBook.
Is that explanation adequate?
Regards, pat4kin.
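
Elements 1 and 2 in the post above (a dictionary file plus a look-up that translates each word of a phrase in turn) can be sketched as follows. This is a minimal illustration under stated assumptions, not pat4kin's scheme: the two-column layout `english,german` is invented here because the real CSV structure is not described in the thread, and the function names are hypothetical. Word-role tags are left out entirely.

```python
import csv
import io

# Hypothetical two-column layout; the real CSV structure is not given in the thread.
SAMPLE_CSV = """english,german
the,der
little,kleine
boy,Junge
"""

def load_dictionary(csv_text):
    """Read CSV text into a plain English -> German mapping (keys lower-cased)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["english"].lower(): row["german"] for row in reader}

def translate_phrase(phrase, dictionary):
    """Translate each word in turn; unknown words are kept and marked with '?'."""
    words = phrase.split()
    return " ".join(dictionary.get(word.lower(), word + "?") for word in words)

dictionary = load_dictionary(SAMPLE_CSV)
result = translate_phrase("The little boy", dictionary)
print(result)
```

A plain word-for-word table like this cannot choose between "der", "die" and "das" or handle case endings, which is presumably where the word-role tags described in this thread would come in.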
 
XML and XSL is usable
by patchworker on Sun, 05/12/2010 - 21:24
Hello pat4kin,

have you thought of using XML and XSL for it?

You can write an XSL stylesheet that does the work, or ask somebody else to write it.

In an XSL stylesheet you can load two or more data sources. One can be your dictionary file, with a structure like this (it should be real XML):

dictionary
__phrase
____It's raining cats and dogs
__/phrase
__translations
____german
______Es ist ein schlimmes Unwetter
____/german
__/translations
/dictionary
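
Written as real XML, the outline above could look like the sample in the following sketch, which also shows a look-up against it. It uses Python's standard `xml.etree.ElementTree` rather than XSL, purely to illustrate the data structure. The wrapping `entry` element and the function name `lookup` are assumptions added here so that the file can hold more than one phrase.

```python
import xml.etree.ElementTree as ET

# The <entry> wrapper is an assumption; the outline in the post has none.
DICTIONARY_XML = """
<dictionary>
  <entry>
    <phrase>It's raining cats and dogs</phrase>
    <translations>
      <german>Es ist ein schlimmes Unwetter</german>
    </translations>
  </entry>
</dictionary>
"""

def lookup(root, phrase, language="german"):
    """Return the translation of `phrase` in `language`, or None if not found."""
    wanted = phrase.strip().lower()
    for entry in root.iter("entry"):
        if entry.findtext("phrase", "").strip().lower() == wanted:
            return entry.findtext("translations/" + language)
    return None

root = ET.fromstring(DICTIONARY_XML.strip())
print(lookup(root, "it's raining cats and dogs"))
```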

If you want to have a VERY simple text file and don't want to write it as docbook.xml, you can do it like this:

[very-long-text]
**It's Raining Cats & Dogs##

We have all heard the expression "**it's raining cats and dogs##." There are several theories about this rainfall saying. It is possible that the word cat is derived from the Greek word 'catadupe' meaning 'waterfall.' Or it could be raining 'cata doxas,' which is Latin for 'contrary to experience,' or an unusual fall of rain.
[/very-long-text]

So you don't need to type the tags every time you write the text; just use ** as the start of the idiom and ## as the end.

Now you can use beanshell to replace the stuff:
** -> [idiom]
## -> [/idiom]
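
The replacement itself is a simple pattern substitution. The sketch below shows it in Python rather than BeanShell, as an illustration only; `mark_idioms` is a hypothetical name, and the angle-bracket form `<idiom>` is assumed to be what the square brackets in this post stand for. The text is XML-escaped first, since a bare "&" as in "Cats & Dogs" is not allowed in XML.

```python
import re
from xml.sax.saxutils import escape

def mark_idioms(text):
    """Escape the text for XML, then turn **...## pairs into <idiom>...</idiom>."""
    escaped = escape(text)  # "&" becomes "&amp;", "<" becomes "&lt;", ">" becomes "&gt;"
    # Non-greedy match, so several idioms on one line are handled separately.
    return re.sub(r"\*\*(.+?)##", r"<idiom>\1</idiom>", escaped)

marked = mark_idioms("**It's Raining Cats & Dogs##")
print(marked)
```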

So now you have a text that can be used as the source for an XSL transformation. It can look up all the idiom tags and search for a fitting translation.

So you just need to find someone who's familiar with XSLT 1.0 or (better) XSLT 2.0.

Greets!
Daniel
 
My deliberations and my routes
by pat4kin on Fri, 17/12/2010 - 11:41
Hallo Daniel and other contributors!

Here is the post I promised on the subject of building an English sentence segmentation scheme integrated with an English-to-German dictionary for translation.

I can mark up at each level of a hierarchical structure, from each sentence type to clauses, then to phrases and words. Each level has tag pairs and seems to conform to XML. I am able to restructure to meet German phrase and word order grammar requirements.

As an example, a simple English sentence of 16 words typically has:
- Two clause tag sets, main clause and predicate
- The main clause has three tag sets: subject, verb and object
- The predicate has three phrase tag sets: prep phrase time, prep phrase place and prep phrase possessive

The re-sequencing for the German word order is:
- Subject
- Verb main
- Prep phrase time
- Prep phrase place
- Prep phrase possessive
- Direct object
- Verb past participle
- Verb infinitive

The main differences are that the German scheme does not use the concept of clauses and has eight phrase or word elements. Using POS tags is also possible, producing 16 single-word tags. Finally, the process can be extended by marking the head-words of the phrases which use them. In both languages it is five markups.

The concept of parsing to realize POS tagging is an unnecessary task. Furthermore there is no way to parse in English and in the German word order. The one element which assists the translation is one which is not covered in the above list of 'methods'. It is a way to give each of the English words a 'word-role' to correspond to its German equivalent. I have developed such a scheme. I am able to dispense with the segmentation process because I am able to visualise the phrase segments without mark-up.

Thus an English to German dictionary entry for a word in a phrase, presented as SGML could be: aaaa bbbb cccc The German word carries no word-role tag, because the role is part of the German word.
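
The re-sequencing into German word order listed above amounts to sorting tagged segments by a fixed order of roles. The sketch below is only an illustration of that idea: the role names are invented labels for the eight elements in the list, the sample sentence is the one quoted earlier in this thread, and treating "all afternoon" as the time phrase is an assumption. It does not reproduce the unpublished word-role scheme.

```python
# Invented labels for the eight elements of the German order listed above.
GERMAN_ORDER = [
    "subject",
    "verb_main",
    "prep_time",
    "prep_place",
    "prep_possessive",
    "direct_object",
    "verb_past_participle",
    "verb_infinitive",
]

def reorder(segments):
    """Sort (role, text) pairs into the German order; unknown roles go last."""
    rank = {role: position for position, role in enumerate(GERMAN_ORDER)}
    return sorted(segments, key=lambda seg: rank.get(seg[0], len(GERMAN_ORDER)))

# English order: "The little boy has played football all afternoon"
segments = [
    ("subject", "The little boy"),
    ("verb_main", "has"),
    ("verb_past_participle", "played"),
    ("direct_object", "football"),
    ("prep_time", "all afternoon"),
]

reordered = " ".join(text for _, text in reorder(segments))
print(reordered)
```

The output keeps the English words but places them in the German sequence, so a word-by-word dictionary pass could follow as a separate step.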
The tasks to realise such a markup and dictionary scheme are, quite honestly, time-consuming rather than complex, once one has developed the concept. Because I am going to publish the translation model as an eBook, I will use both schemes in two models: the first with full XML markup and the second with a scheme that uses only the word-role tags. The earlier steps of segmenting and markup I will probably not use in the second model.

Since your last post I have looked at a number of products that could be used. I have come to the conclusion that what I need is a product providing both XML definition and validation and also the XSLT style sheet capability to do the translation process of a string of words in a sentence marked up with their role tags.

Now if you are able to provide the XSLT step at a price I can live with, we should talk. If I have simple translations of phrases or single sentences and the XSLT tool is not too difficult to understand, then maybe I can do it myself. What tool combination would you recommend?

If you wish to have an example based on the 16-word sentence I referred to at the outset, I will send it. I am reluctant to give away detailed information on the use of word-role tags until I publish it. In a co-operation to realise it perhaps I would react differently.

Regards and thanks for further interest, pat4kin.
 
Mailing List
by Robert Schwenn on Mon, 27/12/2010 - 12:09
Hi Patrick,

got Your mail. Again: I have no experience with computer-aided translation. I'm using http://translate.google.com/ ;-) Seriously, I have almost no clue what You are talking about, and I cannot tell You more than I already have. I guess You have to develop some code.

If You need help doing so, try asking a *concrete* question on jedit-users@lists.sourceforge.net or jedit-devel@lists.sourceforge.net. Keep in mind that the people there are jEdit users and Java developers, not dictionary gurus :-)

Good Luck
 
XML and XSL is usable
by pat4kin on Tue, 07/12/2010 - 12:01
Hallo Daniel!

Did I understand the implications of what you said, or is it wishful thinking?

A text document fully marked up with XML can, with an XSLT stylesheet and the dictionary (also created with XML / XSLT), be translated into German.

It might not be necessary to translate the whole document in such a way. However, starting with all of the phrases marked as idioms, or as something such as "nested declensions", could be an optimal way of beginning. The concept could well fit where "word-for-word" translations do not.

You also made a reference to DocBook. Where does it fit into the spectrum?

Could I ask you to suggest the sites, tools and documentation it would be advisable to begin with?
Where does the tagging in jEdit fit into this spectrum?

What did you refer to in the cryptic sentence?
"Now you can use beanshell to replace the stuff:
** ->[Idiom]
## ->[/Idiom]"

That could relate to something I have either not seen or not understood.

Thanks and regards, pat4kin.
 
Simple text with just one tag: [idiom]the idiom[/idiom]
by patchworker on Tue, 07/12/2010 - 16:19
Hi,

the tools you could use:

1. A jEdit macro that replaces "**" with the XML start tag [idiom] and "##" with the XML end tag [/idiom].

That way you are much faster when you write the texts.

2. jEdit itself, to write the XSL that transforms the source XML file plus the dictionary XML file into the resulting file.

3. The jEdit XSL plugin and a jEdit macro to run that transformation.

Greets!
Daniel

 
The dictionary using XML / XSL
by pat4kin on Tue, 07/12/2010 - 11:11
Hallo Daniel!
In some way I did "almost" think of it.
I say that because I have built the text markup on a set of self designed tags, more akin to SGML than to XML.
I did not think of using the same for a dictionary.
Neither did I want to go over the fence into real XML territory.
Let me in a simple document describe what I have done and we could then discuss your suggestion.
Within two or three days I will send it as an attached zip file.
Thanks and regards, pat4kin
Publishing material as eBook
by pat4kin on Tue, 23/11/2010 - 10:50
Hallo again today.
Following my first post today, I would like to ask, in this post, for advice on how to create an eBook from a PDF file and an accompanying csv file.
Is using a PDF file the best way to publish or is there a better way to publish combining the two elements I referred to above?
I need to be able to publish with assurance that my material will not be illegally copied.
Any useful advice from others who have gone down such a path would be most welcome. Kind regards, pat4kin.
 
PDF for eBook-readers
by patchworker on Tue, 23/11/2010 - 19:53
Hi again,

PDF files are much better than epub files IMHO, because people can read and print them on every computer. On my iPad I like to read PDF files; I wouldn't want to use epub files even if an iPad application offered them.

If you really have PDF files as the source, as I understood your question, then you can just use Adobe Acrobat for that, and you have to respect the licence of the author of that PDF.

Or is it that you want to create PDF files from documents written by yourself? (with a data source like OpenOffice/LibreOffice or DocBook)

Greets!
Daniel
 
pdf for eBook readers
by pat4kin on Wed, 24/11/2010 - 16:44
I asked the question because I am not certain how to deal with what I wish to do. I will publish two or more eBooks on the subject of using a translation model that I have built.
The documents will be written with jEdit, which I find to be an ideal product. Each eBook offered will be in both English and German based on using the model. It will not be interactive if it is a PDF but can be if it is a real exe eBook.
In any event I want to offer the user a translation dictionary with all of the words and phrases in the eBook text.
Just how to do this was the core of my posting.
If you need more information to answer my call for help, then I will give you what I can without endangering my copyright before I publish.
Does that answer your question?
Regards, pat4kin.