Memory Leaks in Python

May 10, 2021 / 84 views / Last modified May 10, 2021

Yes, programs written in Python can leak memory too! OMG......

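Python has its own garbage collector. In CPython, an object whose reference count drops to 0 is freed right away, and a cyclic collector runs periodically according to a set of thresholds (gc.get_threshold() returned (700, 10, 10) by default when this was written) to look for reference cycles. Even circular references get cleaned up once they have become unreachable. (The official documentation, though, does not seem to spell out the circular-reference case very clearly.)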
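A minimal sketch of the two mechanisms described above, assuming CPython. The `Node` class and the `cycle_is_collected` helper are hypothetical names made up for this illustration. A weak reference serves as a probe that does not keep the object alive.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration."""

    def __init__(self):
        self.other = None


def cycle_is_collected():
    """Return (alive before gc.collect(), alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()                  # keep automatic collections out of the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a   # a <-> b reference cycle
        probe = weakref.ref(a)    # weak reference: does not keep `a` alive
        del a, b                  # the cycle is now unreachable
        alive_before = probe() is not None  # refcounting alone cannot free it
        gc.collect()              # run the cycle collector explicitly
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


print(gc.get_threshold())         # the post cites (700, 10, 10); versions may differ
print(cycle_is_collected())
```

On CPython the second line should show that the cycle survives until the collector runs and is gone afterwards.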

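One important reason people choose a scripting language like Python is development efficiency: developers do not have to deal with the trouble and pain of memory management, costs for the company are lower, there is less code to write, and bugs tend to be fewer. In scenarios where execution speed is not a major concern, that matters a lot!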

How does a memory leak happen in Python?

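All variables in Python are references to objects. An object is reclaimed only when its reference count reaches 0, or when it is part of a circular reference that has become unreachable and the collector runs. Because references are passed around so flexibly, and often so casually, some objects will never be used again yet still have references pointing at them, so the count is not 0, and no code path ever clears those references (for example by assigning None). After many iterations or a long run, memory slowly fills up with objects that will never be used but still have a nonzero reference count. (They keep being created, and after use they end up tucked away in some inconspicuous corner with a reference still held, so the garbage collector will not touch them.)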
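The pattern can be sketched roughly like this, assuming CPython's immediate reference counting. `BigObject`, `_history`, and `process` are hypothetical names invented for the example; `_history` plays the role of the "inconspicuous corner" that keeps objects alive.

```python
import weakref


class BigObject:
    """Hypothetical stand-in for a memory-hungry object."""

    def __init__(self, size):
        self.payload = bytearray(size)


_history = []   # hypothetical forgotten container that keeps references alive


def process(size):
    obj = BigObject(size)
    _history.append(obj)      # stray reference: obj outlives this call
    return weakref.ref(obj)   # weak probe so liveness can be checked later


probes = [process(1024) for _ in range(3)]
leaked = sum(p() is not None for p in probes)      # objects still alive

_history.clear()              # drop the stray references
remaining = sum(p() is not None for p in probes)   # objects alive after cleanup

print(leaked, remaining)
```

As long as `_history` holds the objects, neither reference counting nor the cycle collector can reclaim them, because they are still reachable. Clearing the container is what releases them.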

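I was recently tormented for several days by exactly this kind of leak. The code created a very memory-hungry object, used it, then abandoned it (without dropping the reference), then created another one... Soon the program ground to a halt, and a closer look showed that this Python program had exhausted the system's memory.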

Use __del__ with caution:

The current official Python documentation has this note on the gc.garbage list: Changed in version 3.4: Following PEP 442, objects with a __del__() method don’t end up in gc.garbage anymore.

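Python's garbage collector can generally handle circular references. Before Python 3.4, objects that defined the __del__ magic method were the exception: when gc ran a collection, it could not decide whose __del__ to call first among the objects in the cycle, so they became unreachable and uncollectable! The older documentation described gc.garbage like this: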

gc.garbage
list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list contains only objects with __del__() methods. Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. Python doesn’t collect such cycles automatically because, in general, it isn’t possible for Python to guess a safe order in which to run the __del__() methods.
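For comparison with the older description above, here is a small sketch of what to expect on CPython 3.4 and later, following the PEP 442 note quoted earlier. The `Resource` class and `make_cycle` function are hypothetical and exist only for this illustration.

```python
import gc

finalized = []


class Resource:
    """Hypothetical class with a finalizer, used only for illustration."""

    def __init__(self, name):
        self.name = name
        self.peer = None

    def __del__(self):
        finalized.append(self.name)   # record that the finalizer ran


def make_cycle():
    a, b = Resource("a"), Resource("b")
    a.peer, b.peer = b, a    # cycle whose members both define __del__


make_cycle()                 # the cycle is unreachable once the call returns
gc.collect()                 # force a collection
leftover = [o for o in gc.garbage if isinstance(o, Resource)]

print(sorted(finalized))     # names whose __del__ ran
print(leftover)              # Resource objects stuck in gc.garbage, if any
```

On an interpreter older than 3.4 the same cycle would be expected to land in gc.garbage instead of being finalized.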

The official Python FAQ has this passage:

My class defines __del__ but it is not called when I delete the object.

There are several possible reasons for this.

The del statement does not necessarily call __del__() – it simply decrements the object’s reference count, and if this reaches zero __del__() is called.

If your data structures contain circular links (e.g. a tree where each child has a parent reference and each parent has a list of children) the reference counts will never go back to zero. Once in a while Python runs an algorithm to detect such cycles, but the garbage collector might run some time after the last reference to your data structure vanishes, so your __del__() method may be called at an inconvenient and random time. This is inconvenient if you’re trying to reproduce a problem. Worse, the order in which object’s __del__() methods are executed is arbitrary. You can run gc.collect() to force a collection, but there are pathological cases where objects will never be collected.

Despite the cycle collector, it’s still a good idea to define an explicit close() method on objects to be called whenever you’re done with them. The close() method can then remove attributes that refer to subobjects. Don’t call __del__() directly – __del__() should call close() and close() should make sure that it can be called more than once for the same object.
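One way the close() advice could look in practice is sketched below. `Connection` and its attributes are hypothetical, and the context-manager methods are an addition of this sketch (not something the FAQ passage asks for) so that `with` calls close() at a predictable point.

```python
class Connection:
    """Hypothetical resource holder illustrating the FAQ's close() advice."""

    def __init__(self, name):
        self.name = name
        self.buffer = bytearray(1024)   # sub-object we want released early
        self.closed = False

    def close(self):
        if self.closed:        # safe to call more than once
            return
        self.buffer = None     # drop the reference to the sub-object
        self.closed = True

    def __del__(self):
        self.close()           # __del__ delegates to close(), not the reverse

    # Assumption of this sketch: context-manager support for deterministic cleanup.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False           # do not swallow exceptions


with Connection("demo") as conn:
    pass                       # work with the connection here
conn.close()                   # a second call is harmless

print(conn.closed, conn.buffer)
```

Releasing sub-objects in close() means cleanup does not depend on when, or whether, the collector gets around to the object.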

Another way to avoid cyclical references is to use the weakref module, which allows you to point to objects without incrementing their reference count. Tree data structures, for instance, should use weak references for their parent and sibling references (if they need them!).
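The weakref suggestion for trees can be sketched as follows, assuming CPython. `TreeNode` and its methods are hypothetical names chosen for this example. Children are held strongly and the parent weakly, so no reference cycle is created.

```python
import weakref


class TreeNode:
    """Hypothetical tree node: strong refs to children, weak ref to parent."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self._parent = None      # holds a weakref.ref to the parent, or None

    @property
    def parent(self):
        # Dereference the weak reference; yields None once the parent is gone.
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        child._parent = weakref.ref(self)   # does not raise self's refcount
        self.children.append(child)


root = TreeNode("root")
leaf = TreeNode("leaf")
root.add_child(leaf)
parent_name = leaf.parent.name   # parent is reachable while root is alive

del root                         # no cycle, so refcounting frees the parent
print(parent_name, leaf.parent)
```

The trade-off is that code reading `parent` has to cope with it being None after the parent has been freed.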

Finally, if your __del__() method raises an exception, a warning message is printed to sys.stderr.

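So, Python programs can leak memory too, and it deserves attention. From now on, pay attention in your code to when references get released, and consider whether to pass an index instead of a reference, and so on... I think memory leaks are an important detail that anyone writing a large Python system has to consider!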

-- EOF --

Permalink: https://www.pynote.net/archives/3651

    Comments

    3 comments on "Memory Leaks in Python"

    • 麦新杰

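      The second case is objects in a circular reference that define a __del__ function. In short, if __del__ is defined, the Python interpreter cannot determine the order in which to destroy the objects in the cycle, so it does not process them.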

    • bck

      You didn't post the leaking code or the fix? Also, I think you release a reference with del xx, not xx=None.

      • 麦新杰

        del xx and xx=None have the same effect!



    ©Copyright 麦新杰 Since 2019 Python笔记
