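SAX parsing
When an XML document is parsed with DOM, the whole document has to be read and a Document object representing the entire DOM tree is built in memory before the document can be operated on. If the XML document is very large, this consumes a large amount of memory and can easily cause an out-of-memory error.
SAX parsing lets you process the document while it is being read, without waiting for the whole document to be loaded.
SAX parses XML files in an event-driven way. Parsing an XML document with SAX involves two parts, the parser and the event handler:
The parser can be created through the JAXP API. Once the SAX parser has been created, it can be told to parse a given XML document.
While parsing an XML document, the parser calls a method of the event handler each time it reaches a component of the document (the start of an element, character data, the end of an element, and so on), and passes the content it has just parsed to the event handler as the method's arguments.
The event handler is written by the programmer. Through the arguments of the handler methods, the programmer can easily obtain the data the SAX parser has read and decide how to process it.
Parsing an XML document with SAX
1) Create a SAX parser factory with SAXParserFactory
SAXParserFactory spf = SAXParserFactory.newInstance();
2) Obtain a parser object from the factory
SAXParser sp = spf.newSAXParser();
3) Obtain an XML reader from the parser object
XMLReader xmlReader = sp.getXMLReader();
4) Set the reader's event handler
xmlReader.setContentHandler(new BookParserHandler());
5) Parse the XML file
xmlReader.parse("book.xml");
Parsing an XML document with DOM4J
Dom4j is a simple, flexible open-source library. It was started by developers who split off from the early JDOM project and went on to develop it independently. Unlike JDOM, dom4j is built on interfaces and abstract base classes; its API is somewhat more complex, but it offers more flexibility than JDOM.
Dom4j is a very good Java XML API that combines good performance, a rich feature set, and ease of use. Many projects use Dom4j, for example Hibernate, and Sun's own JAXM uses it as well.
To develop with Dom4j, download the corresponding dom4j jar file.
In DOM4J there are three ways to obtain a Document object:
1. Read an XML file to obtain a Document object
SAXReader reader = new SAXReader();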
- import java.util.ArrayList;
- import java.util.List;
- import javax.xml.parsers.DocumentBuilder;
- import javax.xml.parsers.DocumentBuilderFactory;
- import javax.xml.transform.Transformer;
- import javax.xml.transform.TransformerFactory;
- import javax.xml.transform.dom.DOMSource;
- import javax.xml.transform.stream.StreamResult;
- import org.w3c.dom.Document;
- import org.w3c.dom.Element;
- import org.w3c.dom.Node;
- import org.w3c.dom.NodeList;
- import com.itheima.domain.Book;
- public class DOMUtil {
- public static List<Book> getBooks(String uri) throws Exception {
- List<Book> books = new ArrayList<Book>();
- // 1. Create a factory object with DocumentBuilderFactory
- DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
- // 2. Obtain a DocumentBuilder through the factory's newDocumentBuilder method
- DocumentBuilder builder = factory.newDocumentBuilder();
- // 3. Call parse to obtain the Document object
- Document document = builder.parse(uri);
- NodeList bookNodes = document.getElementsByTagName("书");
- Book book = null;
- for (int i = 0; i < bookNodes.getLength(); i++) {
- Element bookEle = (Element) bookNodes.item(i);
- book = new Book();
- String id = bookEle.getAttribute("id");
- book.setId(id);
- String publisher = bookEle.getAttribute("出版社");
- book.setPublisher(publisher);
- String bookName = bookEle.getElementsByTagName("书名").item(0)
- .getTextContent();
- book.setBookName(bookName);
- String author = bookEle.getElementsByTagName("作者").item(0)
- .getTextContent();
- book.setAuthor(author);
- String price = bookEle.getElementsByTagName("售价").item(0)
- .getTextContent();
- book.setPrice(price);
- books.add(book);
- book = null;
- }
- return books;
- }
- public static void addBook(Book book,String uri) throws Exception {
- Document document = DocumentBuilderFactory.newInstance()
- .newDocumentBuilder().parse(uri);
- Node bookshelfNode = document.getElementsByTagName("书架").item(0);
- Element bookEle = document.createElement("书");
- bookEle.setAttribute("id",book.getId());
- bookEle.setAttribute("出版社",book.getPublisher());
- bookshelfNode.appendChild(bookEle);
- Node bookNameNode = bookEle.appendChild(document.createElement("书名"));
- bookNameNode.setTextContent(book.getBookName());
- Node authorNode = bookEle.appendChild(document.createElement("作者"));
- authorNode.setTextContent(book.getAuthor());
- Node priceNode = bookEle.appendChild(document.createElement("售价"));
- priceNode.setTextContent(book.getPrice());
- TransformerFactory factory = TransformerFactory.newInstance();
- Transformer transformer = factory.newTransformer();
- transformer.transform(new DOMSource(document),new StreamResult(uri));
- }
- public static void updateBook(Book book,String uri) throws Exception {
- Document document = DocumentBuilderFactory.newInstance()
- .newDocumentBuilder().parse(uri);
- NodeList bookNodes = document.getElementsByTagName("书");
- for (int i = 0; i < bookNodes.getLength(); i++) {
- Element bookEle = (Element) bookNodes.item(i);
- if (bookEle.getAttribute("id").equals(book.getId())) {
- bookEle.setAttribute("出版社",book.getPublisher());
- bookEle.getElementsByTagName("书名").item(0)
- .setTextContent(book.getBookName());
- bookEle.getElementsByTagName("作者").item(0)
- .setTextContent(book.getAuthor());
- bookEle.getElementsByTagName("售价").item(0)
- .setTextContent(book.getPrice());
- }
- }
- TransformerFactory.newInstance().newTransformer()
- .transform(new DOMSource(document),new StreamResult(uri));
- }
- public static void deleteBook(String id,String uri) throws Exception {
- Document document = DocumentBuilderFactory.newInstance()
- .newDocumentBuilder().parse(uri);
- NodeList bookNodes = document.getElementsByTagName("书");
- for (int i = bookNodes.getLength() - 1; i >= 0; i--) { // iterate backwards: the NodeList is live, so removals shift later items
- Element bookEle = (Element) bookNodes.item(i);
- if (bookEle.getAttribute("id").equals(id)) {
- bookEle.getParentNode().removeChild(bookEle);
- }
- }
- TransformerFactory.newInstance().newTransformer()
- .transform(new DOMSource(document),new StreamResult(uri));
- }
- }
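getElementsByTagName returns a live NodeList, so removing nodes while looping forward over it can skip the element that slides into the freed index. A minimal sketch of one way around this, iterating from the end, is given here. The class name LiveNodeListSketch, the method removeById, and the ASCII element names shelf and book are hypothetical and used only for illustration; the XML is parsed from an in-memory string instead of a file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the original DOMUtil class.
class LiveNodeListSketch {
    // Parses the XML text, removes every <book> whose id matches,
    // and returns how many <book> elements remain in the document.
    static int removeById(String xml, String id) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList bookNodes = document.getElementsByTagName("book");
        // Walk backwards: the list is live, so removing item i only
        // shifts items that have already been visited.
        for (int i = bookNodes.getLength() - 1; i >= 0; i--) {
            Element bookEle = (Element) bookNodes.item(i);
            if (id.equals(bookEle.getAttribute("id"))) {
                bookEle.getParentNode().removeChild(bookEle);
            }
        }
        // The live list now reflects the removals.
        return bookNodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<shelf><book id=\"1\"/><book id=\"1\"/><book id=\"2\"/></shelf>";
        System.out.println(removeById(xml, "1"));
    }
}
```

With two adjacent books sharing the same id, a forward loop with i++ would typically visit only one of them; the backward loop removes both.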
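Document document = reader.read(new File("input.xml"));
2. Parse text in XML form to obtain a Document object
String text = "<members></members>";
Document document = DocumentHelper.parseText(text);
3. Create a Document object programmatically
Document document = DocumentHelper.createDocument();
// create the root element
Element root = document.addElement("members");
The pull parser
The pull parser is a third-party open-source API. Its parsing principle is very similar to that of SAX: both are event-driven.
The difference: after the pull parser has read a piece of data, the programmer has to call its next() method explicitly to advance the parser's "cursor" from the current position to the next parsing event.
http://www.xmlpull.org
http://kxml.sourceforge.net/kxml2/
On the Android platform, XML files are commonly parsed with the pull parser, which is the XML parser Google promotes.
The pull parser is an open-source Java project; it can be used on Android as well as in Java EE.
The class libraries related to the pull parser are stored in the libcore directory under the root of the Android source tree.
SAX parsing code example (the pull parsing code example, PullUtil, follows it)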
- import java.util.ArrayList;
- import java.util.List;
- import javax.xml.parsers.SAXParser;
- import javax.xml.parsers.SAXParserFactory;
- import org.xml.sax.Attributes;
- import org.xml.sax.SAXException;
- import org.xml.sax.XMLReader;
- import org.xml.sax.helpers.DefaultHandler;
- import com.itheima.domain.Book;
- public class SAXUtil {
- private static Book book = null;
- private static List<Book> books = new ArrayList<Book>();
- public static List<Book> getBooks(String uri) throws Exception {
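- // Start from an empty list on every call; the static list would otherwise keep growing
- books = new ArrayList<Book>();
- // First obtain a factory object
- SAXParserFactory factory = SAXParserFactory.newInstance();
- // Create a parser object through the factory
- SAXParser parser = factory.newSAXParser();
- // Obtain an XMLReader object
- XMLReader xmlReader = parser.getXMLReader();
- // Register the event handler before parsing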
- xmlReader.setContentHandler(new DefaultHandler() {
- String temp = null;
- @Override
- public void startElement(String uri,String localName,String qName,Attributes attributes) throws SAXException {
- super.startElement(uri,localName,qName,attributes);
- if ("书".equals(qName)) {
- book = new Book();
- book.setId(attributes.getValue("id"));
- book.setPublisher(attributes.getValue("出版社"));
- }
- else if ("书名".equals(qName))
- temp = "书名";
- else if ("作者".equals(qName))
- temp = "作者";
- else if ("售价".equals(qName))
- temp = "售价";
- }
- @Override
- public void characters(char[] ch,int start,int length)
- throws SAXException {
- super.characters(ch,start,length);
- if ("书名".equals(temp))
- book.setBookName(new String(ch,start,length));
- else if ("作者".equals(temp))
- book.setAuthor(new String(ch,start,length));
- else if ("售价".equals(temp))
- book.setPrice(new String(ch,start,length));
- }
- @Override
- public void endElement(String uri,String localName,String qName)
- throws SAXException {
- super.endElement(uri,localName,qName);
- if ("书".equals(qName)) {
- books.add(book);
- book = null;
- }
- temp = null;
- }
- });
- // Parse the XML file
- xmlReader.parse(uri);
- return books;
- }
- }
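A SAX parser is allowed to deliver the text of one element through several characters() calls, so a handler that stores only the latest chunk can lose part of the value. The following is a minimal sketch, under stated assumptions, of buffering the chunks in a StringBuilder and reading the buffer in endElement. The class name SaxTextSketch, the method readTitles, and the ASCII element names shelf, book, and title are hypothetical; the input is an in-memory string instead of a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original SAXUtil class.
class SaxTextSketch {
    // Returns the text of every <title> element, in document order.
    static List<String> readTitles(String xml) throws Exception {
        final List<String> titles = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            // Collects character data for the element currently being read.
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                text.setLength(0); // a new element starts: drop old text
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called more than once
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    titles.add(text.toString().trim());
                }
            }
        });
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<shelf><book id=\"1\"><title>Java &amp; XML</title></book></shelf>";
        System.out.println(readTitles(xml));
    }
}
```

Keeping the result list local to the method also avoids sharing state between calls, at the cost of a slightly longer handler.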
- import java.io.FileInputStream;
- import java.io.FileOutputStream;
- import java.util.ArrayList;
- import java.util.List;
- import org.xmlpull.v1.XmlPullParser;
- import org.xmlpull.v1.XmlPullParserFactory;
- import org.xmlpull.v1.XmlSerializer;
- import com.itheima.domain.Book;
- public class PullUtil {
- public static List<Book> getBooks(String uri) throws Exception {
- // Create a factory object through XmlPullParserFactory
- XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
- // Obtain a parser object
- XmlPullParser parser = factory.newPullParser();
- // Set the XML file to parse
- parser.setInput(new FileInputStream(uri),"UTF-8");
- // Get the current event type
- int eventType = parser.getEventType();
- // Loop over the parser events until the end of the document
- Book book = null;
- List<Book> books = new ArrayList<Book>();
- while (eventType != XmlPullParser.END_DOCUMENT) {
- switch (eventType) {
- case XmlPullParser.START_TAG:
- if ("书".equals(parser.getName())) {
- book = new Book();
- for (int i = 0; i < parser.getAttributeCount(); i++) {
- if ("id".equals(parser.getAttributeName(i)))
- book.setId(parser.getAttributeValue(i));
- if ("出版社".equals(parser.getAttributeName(i)))
- book.setPublisher(parser.getAttributeValue(i));
- }
- }
- if ("书名".equals(parser.getName()))
- book.setBookName(parser.nextText());
- if ("作者".equals(parser.getName()))
- book.setAuthor(parser.nextText());
- if ("售价".equals(parser.getName()))
- book.setPrice(parser.nextText());
- break;
- case XmlPullParser.END_TAG:
- if ("书".equals(parser.getName())) {
- books.add(book);
- }
- break;
- default:
- break;
- }
- eventType = parser.next();
- }
- return books;
- }
- public static void addBooks(List<Book> books,String uri) throws Exception {
- XmlSerializer serializer = XmlPullParserFactory.newInstance()
- .newSerializer();
- FileOutputStream fileOutputStream = new FileOutputStream(uri);
- serializer.setOutput(fileOutputStream,"UTF-8");
- serializer.startDocument("UTF-8",true);
- serializer.startTag(null,"书架");
- for (Book book : books) {
- serializer.startTag(null,"书");
- serializer.attribute(null,"id",book.getId());
- serializer.attribute(null,"出版社",book.getPublisher());
- serializer.startTag(null,"书名");
- serializer.text(book.getBookName());
- serializer.endTag(null,"书名");
- serializer.startTag(null,"作者");
- serializer.text(book.getAuthor());
- serializer.endTag(null,"作者");
- serializer.startTag(null,"售价");
- serializer.text(book.getPrice());
- serializer.endTag(null,"售价");
- serializer.endTag(null,"书");
- }
- serializer.endTag(null,"书架");
- serializer.endDocument();
- fileOutputStream.flush();
- fileOutputStream.close();
- }
- }
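The XmlPull classes are not part of the standard JDK, so the same pull style can be tried without extra jars using StAX (javax.xml.stream), which ships with Java SE. This is a different API from XmlPullParser and is swapped in here only to illustrate the idea that the caller drives the parser by asking for the next event. The class name StaxPullSketch, the method readBooks, and the ASCII element names shelf, book, and title are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper; uses StAX, not the XmlPull API from PullUtil.
class StaxPullSketch {
    // Returns one "id:title" string per <book> element that has a <title>.
    static List<String> readBooks(String xml) throws Exception {
        List<String> books = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        String id = null;
        // The caller pulls events one at a time with next().
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                if ("book".equals(reader.getLocalName())) {
                    id = reader.getAttributeValue(null, "id");
                } else if ("title".equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag.
                    books.add(id + ":" + reader.getElementText());
                }
            }
        }
        reader.close();
        return books;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<shelf><book id=\"1\"><title>A</title></book></shelf>";
        System.out.println(readBooks(xml));
    }
}
```

As with parser.nextText() in the XmlPull code, getElementText() leaves the reader positioned on the element's end tag, so the loop can simply continue with next().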
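DOM4J is an open-source XML parsing package from dom4j.org. Dom4j is an easy-to-use open-source library for working with XML, XPath, and XSLT. It runs on the Java platform, uses the Java Collections Framework, and fully supports DOM, SAX, and JAXP.
dom4j code example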
- package com.itheima.util;
- import java.io.FileInputStream;
- import java.io.FileOutputStream;
- import java.util.ArrayList;
- import java.util.Iterator;
- import java.util.List;
- import org.dom4j.Document;
- import org.dom4j.Element;
- import org.dom4j.io.OutputFormat;
- import org.dom4j.io.SAXReader;
- import org.dom4j.io.XMLWriter;
- import com.itheima.domain.Book;
- public class Dom4jUtil {
- public static List<Book> getBooks(String uri) throws Exception {
- List<Book> books = new ArrayList<Book>();
- Book book = null;
- SAXReader reader = new SAXReader();
- Document document = reader.read(new FileInputStream(uri));
- Element rootEle = document.getRootElement();
- for (Iterator<Element> i = rootEle.elementIterator("书"); i.hasNext();) {
- Element bookEle = i.next();
- book = new Book();
- book.setId(bookEle.attributeValue("id"));
- book.setPublisher(bookEle.attributeValue("出版社"));
- book.setBookName(bookEle.elementText("书名"));
- book.setAuthor(bookEle.elementText("作者"));
- book.setPrice(bookEle.elementText("售价"));
- books.add(book);
- book = null;
- }
- return books;
- }
- public static void addBook(Book book,String uri) throws Exception {
- SAXReader reader = new SAXReader();
- Document document = reader.read(new FileInputStream(uri));
- Element rootEle = document.getRootElement();
- Element bookEle = rootEle.addElement("书")
- .addAttribute("id",book.getId())
- .addAttribute("出版社",book.getPublisher());
- bookEle.addElement("书名").addText(book.getBookName());
- bookEle.addElement("作者").addText(book.getAuthor());
- bookEle.addElement("售价").addText(book.getPrice());
- OutputFormat format = OutputFormat.createPrettyPrint();
- format.setEncoding("UTF-8");
- XMLWriter writer = new XMLWriter(new FileOutputStream(uri),format);
- writer.write(document);
- writer.close();
- }
- public static void deleteBook(String id,String uri) throws Exception {
- SAXReader reader = new SAXReader();
- Document document = reader.read(new FileInputStream(uri));
- Element rootEle = document.getRootElement();
- for (Iterator<Element> i = rootEle.elementIterator("书"); i.hasNext();) {
- Element bookEle = i.next();
- if (bookEle.attributeValue("id").equals(id))
- rootEle.remove(bookEle);
- }
- OutputFormat format = OutputFormat.createPrettyPrint();
- format.setEncoding("UTF-8");
- XMLWriter writer = new XMLWriter(new FileOutputStream(uri),format);
- writer.write(document);
- writer.close();
- }
- public static void updateBook(Book book,String uri) throws Exception {
- SAXReader reader = new SAXReader();
- Document document = reader.read(new FileInputStream(uri));
- Element rootEle = document.getRootElement();
- for (Iterator<Element> i = rootEle.elementIterator("书"); i.hasNext();) {
- Element bookEle = i.next();
- if (bookEle.attributeValue("id").equals(book.getId())) {
- bookEle.attribute("出版社").setValue(book.getPublisher());
- bookEle.element("书名").setText(book.getBookName());
- bookEle.element("作者").setText(book.getAuthor());
- bookEle.element("售价").setText(book.getPrice());
- }
- }
- OutputFormat format = OutputFormat.createPrettyPrint();
- format.setEncoding("UTF-8");
- XMLWriter writer = new XMLWriter(new FileOutputStream(uri),format);
- writer.write(document);
- writer.close();
- }
- }