jEdit Community - Resources for users of the jEdit Text Editor
ISO-10646-UCS-2 encoded xml-file isn't readable
Submitted by achim.wessling on Sunday, 29 June, 2008 - 13:55
I have an ISO-10646-UCS-2 encoded XML file that was written by Smallworld GIS. When I open this file in jEdit it is not readable: only cryptic characters are displayed. Other editors such as gedit, PSPad, and the Windows text editor display it without any problem. What's wrong?
changing encoding to ISO-8859-1 helps
by achim.wessling on Sun, 29/06/2008 - 14:07
If I change the encoding to ISO-8859-1 with another editor, I can load the file without any problems.

Do the other editors I mentioned ignore the encoding setting?
 
Global Options
by Robert Schwenn on Sun, 29/06/2008 - 20:11
Did you check which encoding jEdit used to open the file (shown in the status bar)?

AFAIK jEdit doesn't try to auto-detect the encoding of a file. Look at Global Options -> Encodings. A default encoding is set there, which jEdit uses to open the file. If an error occurs, jEdit tries the encodings in the "List of fallback encodings".
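The default-then-fallback behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not jEdit's actual code; `decode_with_fallbacks` is a hypothetical helper, and the assumption is that "an error" means the bytes cannot be decoded in the chosen encoding.

```python
def decode_with_fallbacks(data: bytes, default: str, fallbacks: list) -> tuple:
    """Try the default encoding first, then each fallback in order.

    Hypothetical helper for illustration; returns (text, encoding_used).
    """
    for encoding in [default, *fallbacks]:
        try:
            # Strict decoding raises UnicodeDecodeError on invalid byte sequences.
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the configured encodings could decode the data")


# Bytes written as ISO-8859-1 are not valid UTF-8, so the fallback is used.
data = "Grüße".encode("iso-8859-1")
text, used = decode_with_fallbacks(data, "utf-8", ["iso-8859-1"])
print(used, text)
```

Note that a fallback only kicks in when decoding fails outright. A wrong encoding that happens to decode without an error simply produces wrong characters, and no fallback is tried.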

Robert
 
Re: Global Options
by achim.wessling on Mon, 30/06/2008 - 06:37
The 'List of fallback encodings' is set to: ISO-8859-1
If jEdit uses its auto-detection, the encoding is set to 'ISO-10646-UCS-2' and the displayed file content looks like this: 㰿硭氠癥牳楯渽∱⸰∠敮捯摩湧㴢䥓伭㄰㘴㘭啃匭㈢⁳瑡湤慬潮攽≹敳∿㸍਼獥物慬彸浬彴桩湧⁦潲浡琽∱∾ഊ़步祥搠捬慳猽≳眺灲潰敲瑹彬...
If I set the encoding to 'ISO-8859-1' before loading the file and switch off 'auto-detect', the file is displayed correctly!
The header of the file looks like this:
If I change it to this with another editor, jEdit opens it with auto-detection on without any problems.
 
wrong xml header
by Robert Schwenn on Mon, 30/06/2008 - 18:14
Hi,
I was wrong. jEdit does try to auto-detect the encoding of a file when opening it. It can auto-detect the following cases:
1. UTF-16 and UTF-8Y
2. the encoding of an XML file, if it is specified in the header
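The two detection cases listed above could look roughly like this. It is a minimal Python sketch, not jEdit's implementation: `sniff_encoding` and the regular expression are assumptions made for illustration, and "UTF-8Y" is taken to mean UTF-8 with a byte order mark.

```python
import codecs
import re

# Matches an ASCII-compatible XML declaration and captures the declared encoding.
XML_DECL = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')


def sniff_encoding(head: bytes):
    """Guess an encoding from the first bytes of a file (hypothetical helper)."""
    # Case 1: a byte order mark identifies UTF-8 with BOM or UTF-16.
    if head.startswith(codecs.BOM_UTF8):
        return "UTF-8Y"
    if head.startswith(codecs.BOM_UTF16_BE) or head.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16"
    # Case 2: trust whatever the XML declaration says, right or wrong.
    match = XML_DECL.match(head)
    if match:
        return match.group(1).decode("ascii")
    return None  # nothing detected; the caller would use its default encoding


print(sniff_encoding(b'<?xml version="1.0" encoding="ISO-10646-UCS-2" standalone="yes"?>'))
```

The second case only reads the declaration; it does not check that the bytes really are in the declared encoding, which is how a wrong header can lead to a wrong guess.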

Because of the second point, jEdit opens your file as 'ISO-10646-UCS-2'. But I guess that this is the wrong encoding (and so the XML header is wrong too); otherwise the file would have been displayed correctly.
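One way to test that guess is to take a single-byte header and read it as 16-bit units. The Python sketch below assumes the file really starts with the ASCII text shown in `header` (inferred from the symptoms, not confirmed by the poster) and that the UCS-2 reading is big-endian.

```python
# Assumed start of the file, stored one byte per character.
header = '<?xml version="1.0" encoding="ISO-10646-UCS-2"'

raw = header.encode("ascii")        # 46 bytes
garbled = raw.decode("utf-16-be")   # every two bytes become one 16-bit character

print(garbled)                      # 23 mostly CJK-looking characters
print(raw.decode("iso-8859-1"))     # the readable header again
```

If the assumption holds, the first line of output should resemble the start of the cryptic text quoted earlier in the thread, which would mean the file is single-byte encoded and only its XML declaration claims UCS-2.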

Maybe the other editors ignore the XML header. Are you able to check this? They should display the encoding in use in the status line or somewhere else.

Another point to keep in mind (from the jEdit help):
"If a file is opened without an explicit encoding specified and it appears in the recent file list, jEdit will use the encoding last used when working with that file; otherwise the default encoding will be used."

Robert