The following example demonstrates how to fetch a web page using the URL() constructor of the java.net.URL class:
/*
author by w3cschool.cc
Main.java
*/
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.InputStreamReader;
import java.net.URL;

public class Main {
    public static void main(String[] args) throws Exception {
        // Open a stream to the page and wrap it in a buffered reader
        URL url = new URL("http://www.w3cschool.cc");
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream()));
        // The fetched source is written to data.html in the current directory
        BufferedWriter writer = new BufferedWriter(
                new FileWriter("data.html"));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line); // echo each line to the console
            writer.write(line);
            writer.newLine();
        }
        reader.close();
        writer.close();
    }
}
Running the code above prints the page's source code to the console and also saves it to the file data.html in the current directory:
<!doctype html> <html> <head> <meta charset="utf-8"/>
<meta http-equiv="x-ua-compatible" content="ie=11,ie=10,ie=9,ie=8"/>……
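The same read-and-write loop can also be written with try-with-resources, so that both streams are closed even if an exception is thrown partway through. The following is only a minimal sketch, not part of the original example: the class name PageSaver and the helper copyLines are hypothetical names introduced here, and UTF-8 is assumed because the page output above declares charset="utf-8".

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical class name, used for illustration only.
public class PageSaver {

    // Hypothetical helper: copies text line by line from in to out
    // and returns the number of lines copied.
    public static int copyLines(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            out.write(line);
            out.write(System.lineSeparator());
            count++;
        }
        out.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.w3cschool.cc");
        // Assumption: the page is UTF-8 encoded, as its meta tag declares.
        try (Reader in = new InputStreamReader(url.openStream(), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(
                     Paths.get("data.html"), StandardCharsets.UTF_8)) {
            int lines = copyLines(in, out);
            System.out.println("Saved " + lines + " lines to data.html");
        }
    }
}
```

Keeping the copy loop in its own method that takes a Reader and a Writer means it can be exercised with in-memory strings, without any network access.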
That concludes the Java example on fetching a web page.