Parsing XML file containing HTML entities in Java without changing the XML

Question

I have to parse a bunch of XML files in Java that sometimes (and invalidly) contain HTML entities such as &mdash;, &nbsp; and so forth. I understand the correct way of dealing with this is to add suitable entity declarations to the XML file before parsing. However, I can't do that as I have no control over those XML files.

Is there some kind of callback I can override that is invoked whenever the Java XML parser encounters such an entity? I haven't been able to find one in the API.

I'd like to use:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

DocumentBuilder parser = dbf.newDocumentBuilder();
Document        doc    = parser.parse( stream );

I found that I can override resolveEntity in org.xml.sax.helpers.DefaultHandler, but how do I use this with the higher-level API?

Here's a full example:

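import java.io.FileInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Main {
    public static void main( String [] args ) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder parser = dbf.newDocumentBuilder();
        // Throws SAXParseException on the undeclared entities in test.xml
        Document        doc    = parser.parse( new FileInputStream( "test.xml" ));
    }
}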

with test.xml:

<?xml version="1.0" encoding="UTF-8"?>
<foo>
    <bar>Some&nbsp;text &mdash; invalid!</bar>
</foo>

Produces:

[Fatal Error] :3:20: The entity "nbsp" was referenced, but not declared.
Exception in thread "main" org.xml.sax.SAXParseException; lineNumber: 3; columnNumber: 20; The entity "nbsp" was referenced, but not declared.

Update: I have been poking around in the JDK source code with a debugger, and boy, what an amount of spaghetti. I have no idea what the design is there, or whether there is one. Just how many layers of an onion can one layer on top of each other?

The key class seems to be com.sun.org.apache.xerces.internal.impl.XMLEntityManager, but I cannot find any code that either lets me add stuff into it before it gets used, or that attempts to resolve entities without going through that class.

Answer

I would use a library like Jsoup for this purpose. I tested the code below against the example from the question and it parses without errors. Jsoup can be downloaded here: http://jsoup.org/download

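import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class Main {
    public static void main(String[] args) {
        // Inner double quotes must be escaped inside a Java string literal.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo>" +
                     "<bar>Some&nbsp;text &mdash; invalid!</bar></foo>";
        // Parser.xmlParser() makes Jsoup treat the input as XML rather than HTML.
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());

        // Print every <bar> element found in the document.
        for (Element e : doc.select("bar")) {
            System.out.println(e);
        }
    }
}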

Result:

<bar>
 Some&nbsp;text — invalid!
</bar>

Loading from a file can be found here:

http://jsoup.org/cookbook/input/load-document-from-file
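
If adding a dependency is not an option, one possible alternative that stays with DocumentBuilder is to leave the files on disk untouched and declare the missing entities in memory just before parsing. The following is only a minimal sketch under some assumptions: the class EntityPrologDemo and the helper withEntityDeclarations are hypothetical names, only nbsp and mdash are declared, the input is assumed to contain no DOCTYPE of its own, and the parser is assumed to be the JDK default with DOCTYPE declarations allowed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityPrologDemo {
    // Internal DTD subset declaring the HTML entities we expect to meet.
    // Assumption: only these two are needed; extend the list as required.
    static final String DOCTYPE =
        "<!DOCTYPE foo [<!ENTITY nbsp \"&#160;\"><!ENTITY mdash \"&#8212;\">]>";

    // Hypothetical helper: inserts the DOCTYPE after the XML declaration,
    // or at the very start if the input has no XML declaration.
    static String withEntityDeclarations(String xml) {
        int insertAt = 0;
        if (xml.startsWith("<?xml")) {
            insertAt = xml.indexOf("?>") + 2;
        }
        return xml.substring(0, insertAt) + DOCTYPE + xml.substring(insertAt);
    }

    // Parses the patched in-memory copy with the standard DOM API.
    static Document parse(String xml) throws Exception {
        DocumentBuilder parser =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        String patched = withEntityDeclarations(xml);
        return parser.parse(new InputSource(new StringReader(patched)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                   + "<foo>\n"
                   + "    <bar>Some&nbsp;text &mdash; invalid!</bar>\n"
                   + "</foo>";
        Document doc = parse(xml);
        // The entities are now expanded to U+00A0 and U+2014 in the text.
        System.out.println(
            doc.getElementsByTagName("bar").item(0).getTextContent());
    }
}
```

The trade-off is that every entity has to be listed up front, and a file that already carries its own DOCTYPE would need different handling, whereas Jsoup knows the HTML entity names out of the box.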
