Creating a memory leak with Java
I just had an interview, and I was asked to create a memory leak with Java. Needless to say, I felt pretty dumb having no idea how to even start creating one.
What would an example be?
A good way to create a true memory leak in pure Java (objects that are unreachable by running code but are still held in memory) is the following:
- The application creates a long-running thread (or use a thread pool to leak even faster).
- The thread loads a class via an (optionally custom) ClassLoader.
- The class allocates a large chunk of memory (e.g. new byte[1000000]), stores a strong reference to it in a static field, and then stores a reference to itself in a ThreadLocal. Allocating the extra memory is optional (leaking the class instance is enough), but it makes the leak work that much faster.
- The thread clears all references to the custom class or the ClassLoader it was loaded from.
- Repeat.
This works because the ThreadLocal keeps a reference to the object, which keeps a reference to its Class, which in turn keeps a reference to its ClassLoader. The ClassLoader, in turn, keeps a reference to all the classes it has loaded.
(This was worse in many JVM implementations, especially prior to Java 7, because classes and ClassLoaders were allocated straight into permgen and were never GC'd at all. However, regardless of how the JVM handles class unloading, a ThreadLocal will still prevent a Class object from being reclaimed.)
A variation on this pattern is why application containers (like Tomcat) can leak memory like a sieve if you frequently redeploy applications which happen to use ThreadLocals in any way. (Since the application container uses threads as described, and each time you redeploy the application a new ClassLoader is used.)
Update: Since lots of people keep asking for it, here is some sample code that shows this behavior.
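A minimal, single-pass sketch of the reference chain described above (all class and field names here are hypothetical, and a full reproduction would load the leaky class through a fresh custom ClassLoader on every iteration so that each pass pins another class and its loader):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalLeakSketch {

    // Hypothetical stand-in for "the class loaded by the (custom) ClassLoader".
    static class LeakyValue {
        // Strong static reference to a large buffer, as in step 3 above.
        static final byte[] BIG = new byte[1_000_000];
    }

    // ThreadLocal whose value will live as long as the worker thread does.
    private static final ThreadLocal<LeakyValue> HOLDER = new ThreadLocal<>();

    public static void main(String[] args) throws Exception {
        // Long-running thread: a pool thread never dies on its own.
        ExecutorService pool = Executors.newFixedThreadPool(1);

        // The task stores a LeakyValue in the pool thread's ThreadLocal map and returns.
        pool.submit(() -> HOLDER.set(new LeakyValue())).get();

        // Nothing in running code references the LeakyValue instance any more, yet the
        // pool thread's ThreadLocal map keeps it alive, which keeps LeakyValue.class,
        // its ClassLoader and the static BIG array alive for as long as the thread exists.
        // (The JVM keeps running here because the pool thread is still alive.)
    }
}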
A static field holding an object reference (especially a final field)

class MemorableClass {
    static final ArrayList list = new ArrayList(100);
}
Calling String.intern() on a lengthy String

String str = readString(); // read lengthy string from any source: db, textbox/jsp etc.
// This will place the string in the memory pool, from which you can't remove it
str.intern();
(Unclosed) open streams (file, network, etc.)

try {
    BufferedReader br = new BufferedReader(new FileReader(inputFile));
    ...
    ...
} catch (Exception e) {
    e.printStackTrace();
}
Unclosed connections

try {
    Connection conn = ConnectionFactory.getConnection();
    ...
    ...
} catch (Exception e) {
    e.printStackTrace();
}
Areas that are unreachable from the JVM's garbage collector, such as memory allocated through native methods.
In web applications, some objects are stored in application scope until the application is explicitly stopped or removed.

getServletContext().setAttribute("SOME_MAP", map);
Incorrect or inappropriate JVM options, such as the noclassgc option on the IBM JDK, which prevents unused classes from being garbage collected.
See IBM JDK settings.
A simple thing to do is to use a HashSet with an incorrect (or non-existent) hashCode() or equals(), and then keep adding "duplicates". Instead of ignoring duplicates as it should, the set will only ever grow, and you won't be able to remove them.
If you want these bad keys/elements to hang around, you can use a static field like

class BadKey {
    // no hashCode or equals();
    public final String key;
    public BadKey(String key) { this.key = key; }
}

Map map = System.getProperties();
map.put(new BadKey("key"), "value"); // Memory leak even if your threads die.
Below are some less obvious cases in which Java leaks, besides the standard cases of forgotten listeners, static references, bogus/modifiable keys in hashmaps, or just threads stuck without any chance to end their life-cycle.

- File.deleteOnExit() – always leaks the string; it is worse if the string is a substring, because the underlying char[] is leaked as well. (In Java 7 substring also copies the char[], so the latter no longer applies; @Daniel, no need for votes, though.)

I will concentrate on threads to show the danger of unmanaged threads mostly; I don't wish even to touch Swing.

- Runtime.addShutdownHook without removing it... and then, even with removeShutdownHook, due to a bug in the ThreadGroup class regarding unstarted threads, it may not get collected, effectively leaking the ThreadGroup. JGroup has that leak in GossipRouter.
- Creating, but not starting, a Thread falls into the same category as above.
- Creating a thread inherits the ContextClassLoader and AccessControlContext, plus the ThreadGroup and any InheritedThreadLocal; all of those references are potential leaks, along with the entire set of classes loaded by the classloader and all their static references, and so on. The effect is especially visible with the whole j.u.c.Executor framework that features a super simple ThreadFactory interface, yet most developers have no clue of the lurking danger. Also, a lot of libraries do start threads on request (far too many industry-popular libraries do).
- ThreadLocal caches; those are evil in many cases. I am sure everyone has seen quite a few simple caches based on ThreadLocal, and here is the bad news: if the thread keeps going longer than the life of the context ClassLoader, it is a pure nice little leak. Do not use ThreadLocal caches unless really needed.
- Calling ThreadGroup.destroy() when the ThreadGroup itself has no threads, but still keeps child ThreadGroups. A bad leak that will prevent the ThreadGroup from being removed from its parent, while all the children become un-enumerable.
- Using a WeakHashMap whose values (in)directly reference the keys. This is a hard one to find without a heap dump. The same applies to any extended Weak/SoftReference that might keep a hard reference back to the guarded object.
- Using java.net.URL with the HTTP(S) protocol and loading the resource from(!) it. This one is special: the KeepAliveCache creates a new thread in the system ThreadGroup which leaks the current thread's context classloader. The thread is created upon the first request when no alive thread exists, so either you may get lucky or just leak. The leak is already fixed in Java 7, and the code that creates the thread properly removes the context classloader. There are a few more cases (like ImageFetcher, also fixed) of creating similar threads.
- Using InflaterInputStream, passing new java.util.zip.Inflater() in the constructor (PNGImageDecoder, for instance), and not calling end() on the inflater. Well, if you pass the inflater in the constructor with just new, there is no chance... And yes, calling close() on the stream does not close the inflater if it was passed manually as a constructor parameter. This is not a true leak since it would be released by the finalizer... when it deems it necessary. Till that moment it eats native memory so badly that it can cause the Linux oom_killer to kill the process with impunity. The main issue is that finalization in Java is very unreliable, and G1 made it worse up to 7.0.2. Moral of the story: release native resources as soon as you can; the finalizer is just too poor.
- The same case with java.util.zip.Deflater. This one is far worse, since Deflater is memory hungry in Java, i.e. it always uses 15 bits (the maximum) and 8 memory levels (9 is the maximum), allocating several hundred KB of native memory. Fortunately, Deflater is not widely used and, to my knowledge, the JDK contains no misuses. Always call end() if you manually create a Deflater or Inflater. The best part of the last two: you can't find them via the normal profiling tools available.

(I can add some more time wasters I have encountered upon request.)

Good luck and stay safe; leaks are evil!
The answer depends entirely on what the interviewer thought they were asking.
Is it possible in practice to make Java leak? Of course it is, and there are plenty of examples in the other answers.
But there are multiple meta-questions that may have been asked:
- Is a theoretically "perfect" Java implementation vulnerable to leaks?
- Does the candidate understand the difference between theory and reality?
- Does the candidate understand how garbage collection works?
- Or how garbage collection is supposed to work in an ideal case?
- Do they know that they can call other languages through native interfaces?
- Do they know how to leak memory in those other languages?
- Does the candidate even know what memory management is, and what is going on behind the scenes in Java?
I am reading your meta-question as "What answer could I have used in this interview situation?", so I am going to focus on interview skills instead of Java. I believe you are more likely to repeat the situation of not knowing the answer to a question in an interview than you are to be in a place where you need to know how to make Java leak. So, hopefully, this will help.
One of the most important skills you can develop for interviewing is learning to actively listen to the questions and working with the interviewer to extract their intent. Not only does this let you answer their question the way they want, but it also shows that you have some vital communication skills. And when it comes down to a choice between many equally talented developers, I will hire the one who listens, thinks, and understands before responding, every time.
Most of the examples here are "too complicated". They are edge cases. With these examples, the programmer made a mistake (like not redefining equals/hashCode), or has been bitten by a corner case of the JVM/Java (class loading with static fields, etc.). I think that is not the type of example an interviewer wants, or even the most common case.
But there are really simpler cases of memory leaks. The garbage collector only frees what is no longer referenced. We as Java developers don't care about memory. We allocate it when needed and let it be freed automatically. Fine.
But any long-lived application tends to have shared state. It can be anything: statics, singletons... Often, non-trivial applications tend to build complex object graphs. Just forgetting to set a reference to null or, more often, forgetting to remove an object from a collection is enough to cause a memory leak.
Of course, all sorts of listeners (like UI listeners), caches, or any long-lived shared state tend to produce memory leaks if not properly handled. What should be understood is that this is not a Java corner case or a problem with the garbage collector. It is a design problem. We design a listener to be added to a long-lived object, but we don't remove the listener when it is no longer needed. We cache objects, but we have no strategy to remove them from the cache.
We may have a complex graph that stores the previous state needed by a computation. But the previous state is itself linked to the state before it, and so on.
Just as we have to close SQL connections or files, we need to set the proper references to null and remove elements from collections. We should have proper caching strategies (maximum memory size, number of elements, or timers). All objects that allow listeners to be notified must provide both addListener and removeListener methods. And when these notifiers are no longer used, they must clear their listener lists.
A memory leak is indeed possible and perfectly predictable. No special language features or corner cases are needed. A memory leak is either an indicator that something is probably missing, or even of a design problem.
The following is a pretty pointless example if you do not understand JDBC, or at least how JDBC expects a developer to close Connection, Statement and ResultSet instances before discarding them or losing references to them, instead of relying on the implementation of finalize.

void doWork() {
    try {
        Connection conn = ConnectionFactory.getConnection();
        PreparedStatement stmt = conn.prepareStatement("some query"); // executes a valid query
        ResultSet rs = stmt.executeQuery();
        while (rs.next()) {
            // ... process the result set
        }
    } catch (SQLException sqlEx) {
        log(sqlEx);
    }
}
The problem with the above is that the Connection object is not closed, and hence the physical connection will remain open until the garbage collector comes around and sees that it is unreachable. The GC will invoke the finalize method, but there are JDBC drivers that do not implement finalize, at least not in the same way that Connection.close is implemented. The resulting behavior is that while memory will be reclaimed because unreachable objects are collected, the resources (including memory) associated with the Connection object might simply not be reclaimed.
In such an event where the Connection's finalize method does not clean up everything, one might actually find that the physical connection to the database server lasts several garbage collection cycles, until the database server eventually figures out that the connection is not alive (if it does) and closes it.
Even if the JDBC driver were to implement finalize, it is possible for exceptions to be thrown during finalization. The resulting behavior is that any memory associated with the now "dormant" object will not be reclaimed, as finalize is guaranteed to be invoked only once.
The above scenario of encountering exceptions during object finalization is related to another scenario that could possibly lead to a memory leak: object resurrection. Object resurrection is often done intentionally by creating a strong reference to the object being finalized from another object. When object resurrection is misused, it will lead to a memory leak in combination with other sources of memory leaks.
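For contrast, a version of the earlier doWork() that releases JDBC resources deterministically might look like the following sketch (it assumes Java 7+ try-with-resources and the same hypothetical ConnectionFactory and log helper as above):

void doWork() {
    // try-with-resources closes rs, stmt and conn in reverse order, whether or not
    // an exception is thrown, so nothing waits for finalize() to run.
    try (Connection conn = ConnectionFactory.getConnection();
         PreparedStatement stmt = conn.prepareStatement("some query");
         ResultSet rs = stmt.executeQuery()) {
        while (rs.next()) {
            // process the result set
        }
    } catch (SQLException sqlEx) {
        log(sqlEx);
    }
}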
There are plenty more examples that you can conjure up, like
- managing a List instance where you are only adding to the list and not deleting from it (although you should be getting rid of elements you no longer need), or
- opening Sockets or Files, but not closing them when they are no longer needed (similar to the example above involving the Connection class), or
- not unloading singletons when bringing down a Java EE application. Apparently, the classloader that loaded the singleton class will retain a reference to that class, and hence the singleton instance will never be collected. When a new instance of the application is deployed, a new classloader is usually created, and the former classloader will continue to exist because of the singleton.
The implementation of ArrayList.remove(int) is probably one of the simplest examples of a potential memory leak, and of how to avoid it.

public E remove(int index) {
    RangeCheck(index);

    modCount++;
    E oldValue = (E) elementData[index];

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index + 1, elementData, index, numMoved);
    elementData[--size] = null; // (!) Let gc do its work

    return oldValue;
}

If you were implementing it yourself, would you have thought of clearing the array element that is no longer used (elementData[--size] = null)? That reference might keep a huge object alive...
Any time you keep references to objects that you no longer need, you have a memory leak. See Handling memory leaks in Java programs for examples of how memory leaks manifest themselves in Java and what you can do about them.
You can make a memory leak with the sun.misc.Unsafe class. In fact, this service class is used in several standard classes (for example in the java.nio classes). You cannot create an instance of this class directly, but you can use reflection to get one.
The code does not compile in the Eclipse IDE - compile it using the javac command (you will get warnings during compilation).

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class TestUnsafe {
    public static void main(String[] args) throws Exception {
        Class unsafeClass = Class.forName("sun.misc.Unsafe");
        Field f = unsafeClass.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);
        System.out.print("4..3..2..1...");
        try {
            for (;;)
                unsafe.allocateMemory(1024 * 1024);
        } catch (Error e) {
            System.out.println("Boom :)");
            e.printStackTrace();
        }
    }
}
I can copy my answer from here: Easiest way to cause a memory leak in Java?
"A memory leak, in computer science (or leak, in this context), occurs when a computer program consumes memory but is unable to release it back to the operating system." (Wikipedia)
The easy answer is: you can't. Java does automatic memory management and will free resources that are no longer needed for you. You can't stop this from happening; it will always be able to free the resources. In programs with manual memory management, this is different. You can get some memory in C using malloc(). To free the memory, you need the pointer that malloc returned and call free() on it. But if you no longer have the pointer (it was overwritten, or its lifetime has ended), then you are unfortunately incapable of freeing this memory, and thus you have a memory leak.
All the other answers so far are, by my definition, not really memory leaks. They all aim at filling up the memory with pointless stuff really fast. But at any time you could still dereference the objects you created and thus free the memory --> no leak. acconrad's answer comes pretty close, though, as I have to admit, since his solution is effectively to "crash" the garbage collector by forcing it into an endless loop.
The long answer is: you can get a memory leak by writing a library for Java using JNI, which can have manual memory management and thus have memory leaks. If you call this library, your Java process will leak memory. Or, you can have bugs in the JVM so that the JVM loses memory. There are probably bugs in the JVM, there may even be some known ones since garbage collection is not that trivial, but then it's still a bug. By design this is not possible. You may be asking for some Java code that is affected by such a bug. Sorry, I don't know one, and it might well not be a bug anymore in the next Java version anyway.
Here's a simple/sinister one via http://wiki.eclipse.org/Performance_Bloopers#String.substring.28.29 .
public class StringLeaker {
    private final String muchSmallerString;

    public StringLeaker() {
        // Imagine the whole Declaration of Independence here
        String veryLongString = "We hold these truths to be self-evident...";

        // The substring here maintains a reference to the internal char[]
        // representation of the original string.
        this.muchSmallerString = veryLongString.substring(0, 1);
    }
}
Because the substring refers to the internal representation of the original, much longer string, the original stays in memory. Thus, as long as you have a StringLeaker in play, you have the whole original string in memory, too, even though you might think you're just holding on to a single-character string.
The way to avoid storing an unwanted reference to the original string is to do something like this:
... this.muchSmallerString = new String(veryLongString.substring(0, 1)); ...
For added badness, you might also .intern()
the substring:
... this.muchSmallerString = veryLongString.substring(0, 1).intern(); ...
Doing so will keep both the original long string and the derived substring in memory even after the StringLeaker instance has been discarded.
Take any web application running in any servlet container (Tomcat, Jetty, GlassFish, whatever…). Redeploy the app 10 or 20 times in a row (it may be enough to simply touch the WAR in the server's autodeploy directory).
Unless anybody has actually tested this, chances are high that you'll get an OutOfMemoryError after a couple of redeployments, because the application did not take care to clean up after itself. You may even find a bug in your server with this test.
The problem is, the lifetime of the container is longer than the lifetime of your application. You have to make sure that all references the container might have to objects or classes of your application can be garbage collected.
If there is just one reference surviving the undeployment of your web app, the corresponding classloader and by consequence all classes of your web app cannot be garbage collected.
Threads started by your application, ThreadLocal variables, logging appenders are some of the usual suspects to cause classloader leaks.
A common example of this in GUI code is when creating a widget/component and adding a listener to some static/application scoped object and then not removing the listener when the widget is destroyed. Not only do you get a memory leak, but also a performance hit as when whatever you are listening to fires events, all your old listeners are called too.
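A hedged Swing-flavored sketch of that pattern (the static model here is made up for illustration): a short-lived panel registers a listener on an application-scoped object and can never be collected because the model's listener list still references it.

import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import javax.swing.JPanel;

public class LeakyPanel extends JPanel {
    // Application-scoped object that outlives every panel instance.
    static final PropertyChangeSupport APP_MODEL = new PropertyChangeSupport(new Object());

    private final PropertyChangeListener listener = evt -> repaint();

    public LeakyPanel() {
        // The static APP_MODEL now holds a reference to this panel (via the listener)
        // for the rest of the application's life...
        APP_MODEL.addPropertyChangeListener(listener);
        // ...unless some dispose/cleanup path calls
        // APP_MODEL.removePropertyChangeListener(listener).
    }
}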
Maybe by using external native code through JNI?
With pure Java, it is almost impossible.
But that is about a "standard" type of memory leak, when you cannot access the memory anymore, but it is still owned by the application. You can instead keep references to unused objects, or open streams without closing them afterwards.
I have had a nice "memory leak" in relation to PermGen and XML parsing once. The XML parser we used (I can't remember which one it was) did a String.intern() on tag names, to make comparison faster. One of our customers had the great idea to store data values not in XML attributes or text, but as tagnames, so we had a document like:
<data>
  <1>bla</1>
  <2>foo</2>
  ...
</data>
In fact, they did not use numbers but longer textual IDs (around 20 characters), which were unique and came in at a rate of 10-15 million a day. That makes 200 MB of rubbish a day, which is never needed again, and never GCed (since it is in PermGen). We had permgen set to 512 MB, so it took around two days for the out-of-memory exception (OOME) to arrive…
I recently encountered a memory leak situation caused in a way by log4j.
Log4j has this mechanism called Nested Diagnostic Context(NDC) which is an instrument to distinguish interleaved log output from different sources. The granularity at which NDC works is threads, so it distinguishes log outputs from different threads separately.
In order to store thread specific tags, log4j's NDC class uses a Hashtable which is keyed by the Thread object itself (as opposed to say the thread id), and thus till the NDC tag stays in memory all the objects that hang off of the thread object also stay in memory. In our web application we use NDC to tag logoutputs with a request id to distinguish logs from a single request separately. The container that associates the NDC tag with a thread, also removes it while returning the response from a request. The problem occurred when during the course of processing a request, a child thread was spawned, something like the following code:
public class RequestProcessor {
    private static final Logger logger = Logger.getLogger(RequestProcessor.class);

    public void doSomething() {
        ....
        final List<String> hugeList = new ArrayList<String>(10000);
        new Thread() {
            public void run() {
                logger.info("Child thread spawned");
                for (String s : hugeList) {
                    ....
                }
            }
        }.start();
    }
}
So an NDC context was associated with the inline thread that was spawned. The thread object that was the key for this NDC context is the inline thread, which has the hugeList object hanging off of it. Hence, even after the thread finished doing what it was doing, the reference to the hugeList was kept alive by the NDC context Hashtable, thus causing a memory leak.
I thought it was interesting that no one used the internal class examples. If you have an internal class; it inherently maintains a reference to the containing class. Of course it is not technically a memory leak because Java WILL eventually clean it up; but this can cause classes to hang around longer than anticipated.
public class Example1 { public Example2 getNewExample2() { return this.new Example2(); } public class Example2 { public Example2() {} } }
Now if you call Example1 and get an Example2 discarding Example1, you will inherently still have a link to an Example1 object.
public class Referencer { public static Example2 GetAnExample2() { Example1 ex = new Example1(); return ex.getNewExample2(); } public static void main(String[] args) { Example2 ex = Referencer.GetAnExample2(); // As long as ex is reachable; Example1 will always remain in memory. } }
I've also heard a rumor that if you have a variable that exists for longer than a specific amount of time; Java assumes that it will always exist and will actually never try to clean it up if cannot be reached in code anymore. But that is completely unverified.
What's a memory leak:
- It's caused by a bug or bad design.
- It's a waste of memory.
- It gets worse over time.
- The garbage collector cannot clean it.
Typical example:
A cache of objects is a good starting point to mess things up.
private static final Map<String, Info> myCache = new HashMap<>();

public Info getInfo(String key) {
    // uses cache
    Info info = myCache.get(key);
    if (info != null) return info;

    // if it's not in cache, then fetch it from the database
    info = Database.fetch(key);
    if (info == null) return null;

    // and store it in the cache
    myCache.put(key, info);
    return info;
}
Your cache grows and grows. And pretty soon the entire database gets sucked into memory. A better design uses an LRUMap (Only keeps recently used objects in cache).
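One way to get that bounded behavior with nothing but the JDK is a LinkedHashMap in access order with removeEldestEntry overridden; this is a sketch of the idea rather than a drop-in replacement for the cache above:

import java.util.LinkedHashMap;
import java.util.Map;

// Bounded, least-recently-used cache: the eldest entry is dropped once the size
// limit is exceeded, so the cache can never swallow the whole database.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true -> least-recently-used iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

// Usage (hypothetical): Map<String, Info> myCache = new LruCache<>(1000);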
Sure, you can make things a lot more complicated:
- using ThreadLocal constructions.
- adding more complex reference trees .
- or leaks caused by 3rd party libraries .
What often happens:
If this Info object has references to other objects, which again have references to other objects. In a way you could also consider this to be some kind of memory leak, (caused by bad design).
Create a static Map and keep adding hard references to it. Those will never be GC'd.
public class Leaker {
    private static final Map<String, Object> CACHE = new HashMap<String, Object>();

    // Keep adding until failure.
    public static void addToCache(String key, Object value) {
        Leaker.CACHE.put(key, value);
    }
}
As a lot of people have suggested, Resource Leaks are fairly easy to cause – like the JDBC examples. Actual Memory leaks are a bit harder – especially if you aren't relying on broken bits of the JVM to do it for you…
The ideas of creating objects that have a very large footprint and then not being able to access them aren't real memory leaks either. If nothing can access it then it will be garbage collected, and if something can access it then it's not a leak…
One way that used to work though – and I don't know if it still does – is to have a three-deep circular chain. As in Object A has a reference to Object B, Object B has a reference to Object C and Object C has a reference to Object A. The GC was clever enough to know that a two deep chain – as in A <–> B – can safely be collected if A and B aren't accessible by anything else, but couldn't handle the three-way chain…
I came across a more subtle kind of resource leak recently. We open resources via class loader's getResourceAsStream and it happened that the input stream handles were not closed.
Uhm, you might say, what an idiot.
Well, what makes this interesting is: this way, you can leak heap memory of the underlying process, rather than from JVM's heap.
All you need is a jar file with a file inside which will be referenced from Java code. The bigger the jar file, the quicker memory gets allocated.
You can easily create such a jar with the following class:
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class BigJarCreator {
    public static void main(String[] args) throws IOException {
        ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(new File("big.jar")));
        zos.putNextEntry(new ZipEntry("resource.txt"));
        zos.write("not too much in here".getBytes());
        zos.closeEntry();
        zos.putNextEntry(new ZipEntry("largeFile.out"));
        for (int i = 0; i < 10000000; i++) {
            zos.write((int) (Math.round(Math.random() * 100) + 20));
        }
        zos.closeEntry();
        zos.close();
    }
}
Just paste into a file named BigJarCreator.java, compile and run it from command line:
javac BigJarCreator.java java -cp . BigJarCreator
Et voilà: you find a jar archive in your current working directory with two files inside.
Let's create a second class:
public class MemLeak {
    public static void main(String[] args) throws InterruptedException {
        int ITERATIONS = 100000;
        for (int i = 0; i < ITERATIONS; i++) {
            MemLeak.class.getClassLoader().getResourceAsStream("resource.txt");
        }
        System.out.println("finished creation of streams, now waiting to be killed");
        Thread.sleep(Long.MAX_VALUE);
    }
}
This class basically does nothing, but create unreferenced InputStream objects. Those objects will be garbage collected immediately and thus, do not contribute to heap size. It is important for our example to load an existing resource from a jar file, and size does matter here!
If you're doubtful, try to compile and start the class above, but make sure to choose a small heap size (2 MB):
javac MemLeak.java java -Xmx2m -classpath .:big.jar MemLeak
You will not encounter an OOM error here, as no references are kept, and the application will keep running no matter how large you choose ITERATIONS in the above example. The memory consumption of your process (visible in top (RES/RSS) or process explorer) grows until the application gets to the wait command. In the setup above, it allocates around 150 MB of memory.
If you want the application to play safe, close the input stream right where it's created:
MemLeak.class.getClassLoader().getResourceAsStream("resource.txt").close();
and your process will not exceed 35 MB, independent of the iteration count.
Quite simple and surprising.
Everyone always forgets the native code route. Here's a simple formula for a leak:
- Declare native method.
- In native method, call
malloc
. Don't callfree
. - Call the native method.
Remember, memory allocated in native code lives outside the JVM-managed heap, so the garbage collector can never reclaim it for you.
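A hedged sketch of the Java side of that recipe (the library name and the native implementation are hypothetical; the C side would simply call malloc() and never free()):

public class NativeLeaker {
    static {
        // Assumes a native library "leaker" implementing the method below.
        System.loadLibrary("leaker");
    }

    // The (hypothetical) native implementation calls malloc() and never calls free().
    private static native void leakOneMegabyte();

    public static void main(String[] args) {
        for (;;) {
            // Each call leaks native memory that the garbage collector can neither see nor reclaim.
            leakOneMegabyte();
        }
    }
}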
You can create a moving memory leak by creating a new instance of a class in that class's finalize method. Bonus points if the finalizer creates multiple instances. Here's a simple program that leaks the entire heap in sometime between a few seconds and a few minutes depending on your heap size:
class Leakee {
    public void check() {
        if (depth > 2) {
            Leaker.done();
        }
    }
    private int depth;
    public Leakee(int d) {
        depth = d;
    }
    protected void finalize() {
        new Leakee(depth + 1).check();
        new Leakee(depth + 1).check();
    }
}

public class Leaker {
    private static boolean makeMore = true;
    public static void done() {
        makeMore = false;
    }
    public static void main(String[] args) throws InterruptedException {
        // make a bunch of them until the garbage collector gets active
        while (makeMore) {
            new Leakee(0).check();
        }
        // sit back and watch the finalizers chew through memory
        while (true) {
            Thread.sleep(1000);
            System.out.println("memory=" + Runtime.getRuntime().freeMemory()
                    + " / " + Runtime.getRuntime().totalMemory());
        }
    }
}
Threads are not collected until they terminate. They serve as roots of garbage collection. They are one of the few objects that won't be reclaimed simply by forgetting about them or clearing references to them.
Consider: the basic pattern to terminate a worker thread is to set some condition variable seen by the thread. The thread can check the variable periodically and use that as a signal to terminate. If the variable is not declared volatile
, then the change to the variable might not be seen by the thread, so it won't know to terminate. Or imagine if some threads want to update a shared object, but deadlock while trying to lock on it.
If you only have a handful of threads these bugs will probably be obvious because your program will stop working properly. If you have a thread pool that creates more threads as needed, then the obsolete/stuck threads might not be noticed, and will accumulate indefinitely, causing a memory leak. Threads are likely to use other data in your application, so will also prevent anything they directly reference from ever being collected.
As a toy example:
static void leakMe(final Object object) {
    new Thread() {
        public void run() {
            Object o = object;
            for (;;) {
                try {
                    sleep(Long.MAX_VALUE);
                } catch (InterruptedException e) {}
            }
        }
    }.start();
}
Call System.gc()
all you like, but the object passed to leakMe
will never die.
I don't think anyone has said this yet: you can resurrect an object by overriding the finalize() method so that finalize() stores a reference to this somewhere. The garbage collector will only call finalize once on the object, so after that the object will never be destroyed.
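A minimal sketch of that resurrection trick (the holder field is made up; treat this purely as an illustration, since finalize() is deprecated in newer JDKs):

public class Immortal {
    // Strong reference planted by finalize(); it makes the object reachable again.
    static Immortal resurrected;

    @Override
    protected void finalize() {
        // Runs at most once per object: the object escapes this collection,
        // and its finalizer will never run again, even if resurrected is later cleared.
        resurrected = this;
    }
}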
There are many different situations in which memory will leak. One I encountered exposes a map that should not be exposed, which then gets used in other places.
public class ServiceFactory {

    private Map<String, Service> services;

    private static ServiceFactory singleton;

    private ServiceFactory() {
        services = new HashMap<String, Service>();
    }

    public static synchronized ServiceFactory getDefault() {
        if (singleton == null) {
            singleton = new ServiceFactory();
        }
        return singleton;
    }

    public void addService(String name, Service serv) {
        services.put(name, serv);
    }

    public void removeService(String name) {
        services.remove(name);
    }

    public Service getService(String name, Service serv) {
        return services.get(name);
    }

    // the problematic api, which exposes the map.
    // users can do quite a lot of things through this api:
    // for example, keep a service reference and forget to dispose it or set it to null.
    // all in all this is a dangerous api and should not be exposed
    public Map<String, Service> getAllServices() {
        return services;
    }
}

// resource class is a heavy class
class Service {
}
An example I recently fixed was creating new GC and Image objects but forgetting to call the dispose() method.
GC javadoc snippet:
Application code must explicitly invoke the GC.dispose() method to release the operating system resources managed by each instance when those instances are no longer required. This is particularly important on Windows95 and Windows98 where the operating system has a limited number of device contexts available.
Image javadoc snippet:
Application code must explicitly invoke the Image.dispose() method to release the operating system resources managed by each instance when those instances are no longer required.
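Assuming these are the SWT GC and Image classes (org.eclipse.swt.graphics), the usual defensive pattern is to dispose in a finally block; a sketch:

import org.eclipse.swt.graphics.GC;
import org.eclipse.swt.graphics.Image;
import org.eclipse.swt.widgets.Display;

class ThumbnailPainter {
    void paintThumbnail(Display display) {
        Image image = new Image(display, 64, 64);
        GC gc = new GC(image);
        try {
            gc.drawLine(0, 0, 63, 63); // ... draw whatever is needed
        } finally {
            // Release the OS-level resources explicitly instead of relying on finalization.
            gc.dispose();
            image.dispose();
        }
    }
}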
I think that a valid example could be using ThreadLocal variables in an environment where threads are pooled.
For instance, using ThreadLocal variables in Servlets to communicate with other web components, having the threads being created by the container and maintaining the idle ones in a pool. ThreadLocal variables, if not correctly cleaned up, will live there until, possibly, the same web component overwrites their values.
Of course, once identified, the problem can be solved easily.
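For example, a servlet Filter that sets such a value can clear it in a finally block, so the pooled container thread never carries it into the next request (a sketch, with a made-up CURRENT_USER holder):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class UserContextFilter implements Filter {
    // Hypothetical per-request context shared between web components.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        CURRENT_USER.set(req.getParameter("user"));
        try {
            chain.doFilter(req, res);
        } finally {
            // Without this, the value stays attached to the pooled container thread.
            CURRENT_USER.remove();
        }
    }

    @Override public void init(FilterConfig cfg) {}
    @Override public void destroy() {}
}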
The interviewer might have been looking for a circular reference solution:
class Element {
    Element next; // minimal Element type assumed by the loop below
}

public static void main(String[] args) {
    while (true) {
        Element first = new Element();
        first.next = new Element();
        first.next.next = first;
    }
}
This is a classic problem with reference counting garbage collectors. You would then politely explain that JVMs use a much more sophisticated algorithm that doesn't have this limitation.
-Wes Tarle