Autodetect encoding for HTML file

Submitted by neoedmund on Tuesday, 15 November, 2005 - 09:27

In fact I have already made one. If you think it is useful, you can add this feature to new jEdit versions. It looks for text like `content="text/html; charset=xxxxxx"` at the beginning of the HTML file.

[code]
// BufferIORequest.java
    /**
     * Tries to detect if the stream is gzipped, and if it has an encoding
     * specified with an XML PI.
     */
    private Reader autodetect(InputStream in) throws IOException {
        in = new BufferedInputStream(in);

        String encoding = buffer.getStringProperty(Buffer.ENCODING);
        if (!in.markSupported())
            Log.log(Log.WARNING, this, "Mark not supported: " + in);
        else if (buffer.getBooleanProperty(Buffer.ENCODING_AUTODETECT)) {
            
            {// neoe add: detect html's encoding
                String enc = getHtmlEncoding(in);
                if (enc != null && MiscUtilities.isSupportedEncoding(enc)) {
                    buffer.setProperty(Buffer.ENCODING, enc);
                    return new InputStreamReader(in, enc);
                }
            }
	....
	(original lines)

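/** Added by neoedmund. */
private String getHtmlEncoding(InputStream in) throws IOException {
        String enc = null;
        String key = "charset=";
        int bufSize = 1000;
        byte[] buf = new byte[bufSize];
        in.mark(bufSize);
        int len;
        if ((len = in.read(buf, 0, bufSize)) > 0) {
            // Decode as ISO-8859-1 so each byte maps to exactly one char,
            // independent of the platform default charset.
            String line = new String(buf, 0, len, "ISO-8859-1");
            int p1 = line.indexOf(key);
            if (p1 >= 0) {
                p1 += key.length();
                // Skip an optional opening quote (guarding the end of the buffer).
                if (p1 < line.length()
                        && (line.charAt(p1) == '\'' || line.charAt(p1) == '"')) {
                    p1++;
                }
                // Start scanning after the quote; otherwise p2 would stop on
                // the quote itself and substring(p1, p2) would throw.
                int p2 = p1;
                while (p2 < line.length()
                        && "'\" >;,.".indexOf(line.charAt(p2)) < 0) {
                    p2++;
                }
                // Ignore an empty value such as charset="".
                if (p2 > p1) {
                    enc = line.substring(p1, p2);
                }
            }
        }
        // Rewind so the caller reads the stream from the beginning.
        in.reset();
        return enc;
    }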
[/code]
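
For anyone who wants to try the idea outside jEdit, here is a minimal standalone sketch of the same mark/peek/reset approach. The class name `HtmlCharsetSniffer`, the method `sniff`, and the 1000-byte peek size are only illustrative and are not part of jEdit. The case-insensitive match on `charset=` is my own assumption and is not in the patch above. This is not a full HTML encoding sniffer, just the "look near the top of the file" trick.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class HtmlCharsetSniffer {
    // How many bytes to peek at; an arbitrary choice for this sketch.
    private static final int PEEK = 1000;
    // Characters that end the charset value.
    private static final String DELIMS = "'\" >;,";

    /** Returns the charset name declared near the start of the stream, or null. */
    static String sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            return null; // cannot rewind, so do not consume anything
        }
        in.mark(PEEK);
        try {
            byte[] buf = new byte[PEEK];
            int len = in.read(buf, 0, PEEK);
            if (len <= 0) {
                return null;
            }
            // ISO-8859-1 maps every byte to one char, so decoding never fails.
            String head = new String(buf, 0, len, StandardCharsets.ISO_8859_1);
            // Assumption: match "charset=" regardless of case.
            int start = head.toLowerCase(Locale.ROOT).indexOf("charset=");
            if (start < 0) {
                return null;
            }
            start += "charset=".length();
            // Skip an optional opening quote.
            if (start < head.length()
                    && (head.charAt(start) == '"' || head.charAt(start) == '\'')) {
                start++;
            }
            int end = start;
            while (end < head.length() && DELIMS.indexOf(head.charAt(end)) < 0) {
                end++;
            }
            return end > start ? head.substring(start, end) : null;
        } finally {
            in.reset(); // rewind so the caller sees the whole stream
        }
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=UTF-8\"></head></html>";
        InputStream in = new BufferedInputStream(
                new ByteArrayInputStream(html.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(sniff(in));       // prints "UTF-8"
        System.out.println((char) in.read()); // prints "<" because the stream was rewound
    }
}
```

The detected name should still be checked (for example with `java.nio.charset.Charset.isSupported`, which can throw on a malformed name) before it is handed to an `InputStreamReader`, as the patch above does with `MiscUtilities.isSupportedEncoding`.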

by hy263 on Mon, 18/09/2006 - 23:57
