Search in directory encoding exception
Submitted on Thursday, 26 April, 2012 - 16:00
I'm trying to use jEdit to run a recursive text search across all files in a directory.
When I run the search, jEdit complains that the Cp1252 encoding cannot be applied to PDF files (yes, I'm on Windows XP):
The file could not be loaded correctly (some data might be lost) with
the encoding "Cp1252".
(java.nio.charset.UnmappableCharacterException: Input length = 1)
Try selecting a different encoding.
It can be selected with the menu File->Reload with Encoding.
If you want it to be done automatically, add the candidates into
"List of fallback encodings" in Encodings pane of Global Options.
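For what it's worth, this kind of error can be reproduced outside jEdit. The sketch below is an illustration in Python, not jEdit's own code (jEdit is written in Java, and `undecodable_offsets` is a hypothetical helper name). It assumes a single-byte encoding such as cp1252 and lists the offsets of bytes that a strict decoder rejects. In Python's cp1252 codec, five byte values have no character assigned, so any file containing one of them fails a strict decode.

```python
from typing import List


def undecodable_offsets(data: bytes, encoding: str = "cp1252") -> List[int]:
    """Return offsets of bytes that a strict decoder rejects.

    Assumption: `encoding` is a single-byte encoding, so each byte
    can be decoded on its own. This is not valid for UTF-8 or UTF-16.
    """
    offsets = []
    for i in range(len(data)):
        try:
            data[i:i + 1].decode(encoding, errors="strict")
        except UnicodeDecodeError:
            offsets.append(i)
    return offsets


# Byte values that Python's cp1252 codec leaves unassigned.
unassigned = undecodable_offsets(bytes(range(256)))
print([hex(b) for b in unassigned])
```

Binary formats such as PDF can contain arbitrary byte values in their compressed streams, so hitting one of these bytes is likely.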
I don't understand why jEdit tries to search PDF files at all, since they are binary files. Does it try to extract text from the PDF and then apply an encoding to search that text?
jEdit seems to be able to skip binary files, which looks like a useful feature, but I don't know how it decides that a file is binary. If I uncheck "skip binary files", the same exception appears, but for bin files, obj files, etc. I've looked through the documentation but can't find anything about how files are classified as binary. I assume this behavior is expected, but it isn't very user friendly.
Is it possible to use jEdit to search binary files as well?
Does it simply try to decode the file with its default encoding (Cp1252, since I'm on Windows XP) and treat the file as binary if decoding fails? (I guess it's not that simple.)
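To make the guess concrete, here is a minimal sketch in Python of two possible heuristics. Both are assumptions for illustration and neither is confirmed to be what jEdit does: `fails_strict_decode` is the "decode and see if it fails" idea from the question, and `has_nul_byte` is a common alternative that looks for a NUL byte near the start of the file. The function names and the 1024-byte sample size are hypothetical.

```python
def fails_strict_decode(data: bytes, encoding: str = "cp1252") -> bool:
    """Heuristic 1 (assumption): binary if a strict decode raises."""
    try:
        data.decode(encoding, errors="strict")
    except UnicodeDecodeError:
        return True
    return False


def has_nul_byte(data: bytes, sample_size: int = 1024) -> bool:
    """Heuristic 2 (assumption): binary if the first bytes contain NUL."""
    return b"\x00" in data[:sample_size]


samples = {
    "text": b"plain text",
    "unmapped byte": b"\x81abc",
    "control bytes": b"\x00\x01\x02",
}
for name, data in samples.items():
    print(name, fails_strict_decode(data), has_nul_byte(data))
```

The two checks disagree on the last two samples: control bytes decode without error under cp1252 but contain a NUL, while a lone 0x81 fails decoding without any NUL present. That suggests a decode test alone would not be a reliable binary detector.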
thanks,
yohann