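A Java List may contain duplicate elements (where "duplicate" is decided by hashCode and equals), so there are two ways to remove duplicates from a List:
Approach 1: use a HashSet. The code is as follows: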
class Student {
    private String id;
    private String name;
    public Student(String id, String name) {
        super();
        this.id = id;
        this.name = name;
    }
    @Override
    public String toString() {
        return "Student [id=" + id + ", name=" + name + "]";
    }
    // Hash code built from both fields, so equal students hash alike.
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((id == null) ? 0 : id.hashCode());
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }
    // Two students are equal when both id and name are equal (null-safe).
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false;
        }
        Student other = (Student) obj;
        if (id == null) {
            if (other.id != null) {
                return false;
            }
        } else if (!id.equals(other.id)) {
            return false;
        }
        if (name == null) {
            if (other.name != null) {
                return false;
            }
        } else if (!name.equals(other.name)) {
            return false;
        }
        return true;
    }
}
Both hashCode and equals must be overridden; we will see why shortly.
The deduplication code is as follows:
// Requires java.util.ArrayList, Arrays, HashSet, List and Set.
private static void removeListDuplicateObject() {
    List<Student> list = new ArrayList<Student>();
    for (int i = 0; i < 10; i++) {
        Student student = new Student("id", "name");
        list.add(student);
    }
    System.out.println(Arrays.toString(list.toArray()));
    // Adding everything to a HashSet keeps one instance per distinct student.
    Set<Student> set = new HashSet<Student>();
    set.addAll(list);
    System.out.println(Arrays.toString(set.toArray()));
    // Empty both collections (same effect as clear()).
    list.removeAll(list);
    set.removeAll(set);
    System.out.println(Arrays.toString(list.toArray()));
    System.out.println(Arrays.toString(set.toArray()));
}
Calling code:
public static void main(String[] args) {
    removeListDuplicateObject();
}
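As a compact illustration of approach 1, the sketch below wraps the set-based deduplication in a small generic helper. The class name DedupSketch and the method dedupe are hypothetical, introduced only for this example. Note that it swaps HashSet for LinkedHashSet, which uses the same hashCode/equals check but also keeps the elements in first-occurrence order; a plain HashSet gives no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper class, not part of the original article.
class DedupSketch {

    // Returns a new list holding the first occurrence of each element,
    // in the original order. Duplicates are detected through the
    // elements' own hashCode() and equals().
    static <T> List<T> dedupe(List<T> input) {
        return new ArrayList<T>(new LinkedHashSet<T>(input));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(dedupe(names)); // prints [b, a, c]
    }
}
```

The input list is left untouched; the caller gets a fresh list back, which avoids the manual removeAll cleanup used above.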
Why must both hashCode and equals be overridden when deduplicating with a HashSet?
Let's look at the source of HashSet's add method:
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
It delegates to a HashMap, so let's look at HashMap's put (shown as it appears in older JDKs; newer versions are structured differently but perform the same check):
public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
The line to pay attention to is:
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
......
}
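In other words, a key counts as already present only when the hash codes are equal and the keys are either the same reference (==) or equal according to equals. Without both overrides, two Student objects with identical fields would fall back to Object's identity-based hashCode and equals and would never be recognized as duplicates.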
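To make the effect of that check concrete, here is a minimal sketch comparing a key class that overrides nothing with one that overrides both methods. The class and method names (HashContractDemo, PlainKey, ValueKey, distinctPlain, distinctValue) are hypothetical and exist only for this illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo class, not part of the original article.
class HashContractDemo {

    // No overrides: inherits identity-based hashCode() and equals() from Object.
    static class PlainKey {
        final String id;
        PlainKey(String id) { this.id = id; }
    }

    // Overrides both methods, so keys with equal ids are equal keys.
    static class ValueKey {
        final String id;
        ValueKey(String id) { this.id = id; }
        @Override public int hashCode() { return Objects.hashCode(id); }
        @Override public boolean equals(Object o) {
            return o instanceof ValueKey && Objects.equals(id, ((ValueKey) o).id);
        }
    }

    // Adds n field-identical PlainKey objects and reports the set size.
    static int distinctPlain(int n) {
        Set<PlainKey> set = new HashSet<PlainKey>();
        for (int i = 0; i < n; i++) {
            set.add(new PlainKey("id"));
        }
        return set.size();
    }

    // Adds n field-identical ValueKey objects and reports the set size.
    static int distinctValue(int n) {
        Set<ValueKey> set = new HashSet<ValueKey>();
        for (int i = 0; i < n; i++) {
            set.add(new ValueKey("id"));
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPlain(10)); // prints 10: nothing was deduplicated
        System.out.println(distinctValue(10)); // prints 1: all ten are duplicates
    }
}
```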
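Complexity: a single pass over the list is enough, and each HashSet insertion takes O(1) on average, so the whole operation is O(n).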
Approach 2: traverse the list once, using contains and add on a second list.
The code is as follows:
// Requires java.util.ArrayList, Arrays and List.
private static void removeListDuplicateObjectByList() {
    List<Student> list = new ArrayList<Student>();
    for (int i = 0; i < 10; i++) {
        Student student = new Student("id", "name");
        list.add(student);
    }
    System.out.println(Arrays.toString(list.toArray()));
    // Copy an element only if an equal one has not been copied already.
    List<Student> listUniq = new ArrayList<Student>();
    for (Student student : list) {
        if (!listUniq.contains(student)) {
            listUniq.add(student);
        }
    }
    System.out.println(Arrays.toString(listUniq.toArray()));
    // Empty both lists (same effect as clear()).
    list.removeAll(list);
    listUniq.removeAll(listUniq);
    System.out.println(Arrays.toString(list.toArray()));
    System.out.println(Arrays.toString(listUniq.toArray()));
}
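Everything else is the same as above.
Complexity:
The list is traversed once, and contains is called on every iteration. The source of ArrayList's contains is: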
public boolean contains(Object o) {
    return indexOf(o) >= 0;
}
public int indexOf(Object o) {
    if (o == null) {
        for (int i = 0; i < size; i++)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = 0; i < size; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}
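As you can see, every call to contains traverses the new list again. In the worst case, when all elements are distinct, that adds up to roughly 1 + 2 + ... + n comparisons, so the complexity is O(n*n).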
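The quadratic growth can be observed directly by counting equals calls. The sketch below is only an illustration: ContainsCostDemo, Counted and equalsCallsFor are hypothetical names, and the input is assumed to be n pairwise distinct elements (the worst case). The i-th element is compared against the i elements already kept, giving 0 + 1 + ... + (n - 1) = n(n - 1)/2 calls in total.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original article.
class ContainsCostDemo {

    // Element type that counts how often equals() is invoked.
    static class Counted {
        static long equalsCalls = 0;
        final int value;
        Counted(int value) { this.value = value; }
        @Override public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Counted && ((Counted) o).value == value;
        }
        @Override public int hashCode() { return value; }
    }

    // Runs the contains-and-add deduplication over n distinct elements
    // and returns the number of equals() calls it triggered.
    static long equalsCallsFor(int n) {
        Counted.equalsCalls = 0;
        List<Counted> unique = new ArrayList<Counted>();
        for (int i = 0; i < n; i++) {
            Counted c = new Counted(i);
            if (!unique.contains(c)) {
                unique.add(c);
            }
        }
        return Counted.equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println(equalsCallsFor(100));  // prints 4950
        System.out.println(equalsCallsFor(1000)); // prints 499500
    }
}
```

Multiplying the input size by 10 multiplies the number of comparisons by roughly 100, which is the O(n*n) behavior described above.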
Conclusion:
Approach 1 is more efficient, so prefer the HashSet-based approach for deduplication.