Author: Magicx (270度的鸟顾之相)
Board: AndroidDev
Title: [Question] Handling Big5 encoding
Time: Mon Jan 16 00:32:36 2012
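I'd like to ask how everyone handles web pages that are encoded in Big5.
EntityUtils doesn't seem to support Big5, so my code handles it as shown below.
However, I've found that some special characters, such as " and >, come out looking wrong.
Thanks for any advice!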
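import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.HttpStatus;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EncodingUtils;
import org.apache.http.util.EntityUtils;

public static String GET2getWEB(String linkString)
{
    byte[] byteResponse = null; // raw response bytes, decoded as Big5 below
    String strRet = null;
    try {
        HttpGet GET2LINK = new HttpGet(linkString);
        /* Get the HTTP response */
        HttpResponse httpResponse = new DefaultHttpClient().execute(GET2LINK);
        HttpEntity entity = httpResponse.getEntity();
        /* Extract the response string */
        // Disabled: this call read the entity as UTF-8 before the null check,
        // and a streamed entity can normally be read only once.
        //strRet = EntityUtils.toString(entity,HTTP.UTF_8);
        if (entity != null && HttpStatus.SC_OK == httpResponse.getStatusLine().getStatusCode())
        {
            //strRet = EntityUtils.toString(entity,HTTP.ISO_8859_1);
            //strRet = EntityUtils.toString(entity);
            byteResponse = EntityUtils.toByteArray(entity);
            strRet = EncodingUtils.getString(byteResponse, "big5"); // decode the bytes as Big5
        }
    } catch (Exception e) { e.printStackTrace(); }
    return strRet;
} // end of GET2getWEB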
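The decoding step on its own can be tried without any network code. What follows is only a minimal sketch in plain JDK Java, not the Apache HttpClient API: the class name `Big5DecodeSketch` and the helper `decode` are made up for illustration, and it assumes the runtime provides a Big5 charset. It shows the same bytes decoded once as Big5 and once as UTF-8.

```java
import java.nio.charset.Charset;

class Big5DecodeSketch {
    // Hypothetical helper: decode raw bytes with the named charset (e.g. "Big5").
    public static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Two Chinese characters followed by ASCII punctuation.
        String original = "\u4e2d\u6587 > \"test\"";
        byte[] big5Bytes = original.getBytes(Charset.forName("Big5"));

        String asBig5 = decode(big5Bytes, "Big5");   // matching charset
        String asUtf8 = decode(big5Bytes, "UTF-8");  // mismatched charset

        System.out.println(asBig5.equals(original)); // true
        System.out.println(asUtf8.equals(original)); // false
    }
}
```

Note that ASCII characters such as > and " are stored as the same single bytes in Big5 and UTF-8, so a charset mismatch by itself would not normally change them; only the multi-byte Chinese characters get damaged.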
--
posted from android bbs reader on my HTC Sensation XE with Beats Audio Z715e
https://market.android.com/details?id=com.bbs.reader
--
※ Origin: 批踢踢实业坊 (ptt.cc)
◆ From: 42.72.204.134
※ Edited: Magicx, from: 114.34.226.28 (01/16 00:50)
1F:→ tericky:Why is the response string being converted twice??? 01/16 21:48
2F:→ tericky:Since you already know the source is big5 and want UTF-8, just do 01/16 21:49
3F:→ tericky:strRet = new String( 01/16 21:51
4F:→ tericky: EntityUtils.toString(response.getEntity()) 01/16 21:52
5F:→ tericky: .getBytes("ISO-8859-1"), "UTF-8") 01/16 21:52
6F:→ tericky:As for special symbols like >, <, and &, except in tags, you have to convert them yourself 01/16 21:54
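The two suggestions above can be sketched roughly as follows. This is plain JDK Java under stated assumptions, not anything from HttpClient: the class `ResponseTextSketch` and the methods `redecode` and `unescapeBasic` are hypothetical names. The re-decoding trick only works if the string really was decoded as ISO-8859-1 in the first place, and the charset passed to `new String` should name the page's actual encoding, so for a Big5 page this sketch passes "Big5" rather than "UTF-8". The entity helper covers just four common entities.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ResponseTextSketch {
    // Re-decode text that was decoded as ISO-8859-1. ISO-8859-1 maps each byte
    // 0x00-0xFF to exactly one char, so getBytes recovers the original bytes.
    public static String redecode(String latin1Text, String realCharset) {
        byte[] raw = latin1Text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, Charset.forName(realCharset));
    }

    // Replace a few common HTML entities with their literal characters.
    // "&amp;" is handled last so that "&amp;gt;" becomes "&gt;" and not ">".
    public static String unescapeBasic(String html) {
        return html.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        // Simulate Big5 bytes that were read as ISO-8859-1.
        String misread = new String(original.getBytes(Charset.forName("Big5")),
                                    StandardCharsets.ISO_8859_1);
        System.out.println(redecode(misread, "Big5").equals(original)); // true
        System.out.println(unescapeBasic("a &gt; b &amp;&amp; c &lt; d")); // a > b && c < d
    }
}
```

If the page text shows up as literal `&gt;` or `&quot;`, that points to HTML entities in the source rather than a charset problem, which is where a helper like `unescapeBasic` would apply.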