Copying Data Structures Quirks

Note

TL;DR: deepcopy() fails if a value is a dict view, such as the result of keys().

To duplicate a data structure, Python's copy module provides the functions copy() and deepcopy(). Recently I wanted to duplicate a nested structure with deepcopy() and ran into the following error:

TypeError: object.__new__(dict_keys) is not safe, use dict_keys.__new__()

The traceback pointed into the copy module of the standard library.

I still don't fully understand that message, but it indicates that the problem has something to do with copying an object rather than a plain value.

Note

Python 2 code run under Python 3 may stumble here!

Indeed! One key in my structure was assigned the keys of another dict, obtained with keys(). The code originated in Python 2, where keys() returns a list, which deepcopies without trouble. Not so in Python 3, where keys() returns a view object (dict_keys) instead of a list!
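To make the difference concrete, here is a small sketch for Python 3. The dict d and its values are made up for illustration. It shows that the result of keys() stays tied to the dict it came from, while a list is an independent snapshot:

```python
d = {"foo": 1, "bar": 2}   # illustrative data, not from the original code

view = d.keys()            # Python 3: a dict_keys view, tied to d
snapshot = list(d.keys())  # an independent list of the current keys

d["baz"] = 3               # change the dict afterwards

print(type(view).__name__)  # name of the view type
print("baz" in view)        # the view follows the dict
print("baz" in snapshot)    # the list does not
```

The view is a live window onto the dict rather than a container of its own, which is a plausible reason why the copy module has no generic way to rebuild it.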

The 2to3 tool accounts for this and rewrites keys() as list(keys()). In most circumstances, however, the view returned in Python 3 is fine. One exception is the case here, where you need to duplicate the result of keys(): convert it to a list first.
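A minimal, self-contained sketch of that workaround, using the same shape of data as in my session (the values in d are made up). The exact wording of the TypeError differs between Python 3 versions, so the sketch only checks the exception type:

```python
import copy

d = {"foo": 1, "bar": 2}  # illustrative data

broken = {"xxx": 22, "yyy": 33, "zzz": d.keys()}       # holds a dict view
fixed = {"xxx": 22, "yyy": 33, "zzz": list(d.keys())}  # holds a plain list

failed = False
try:
    copy.deepcopy(broken)
except TypeError as exc:
    # The message text depends on the Python version.
    failed = True
    print("deepcopy failed:", exc)

fixed_copy = copy.deepcopy(fixed)  # works: lists are deep-copyable
print(fixed_copy == fixed, fixed_copy["zzz"] is not fixed["zzz"])
```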

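Code to reproduce (the session assumes that copy has been imported and that d is a dict with the keys 'foo' and 'bar'):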

In [8]: k1 = d.keys()

In [9]: k1
Out[9]: dict_keys(['foo', 'bar'])

In [10]: k2 = list(d.keys())

In [11]: k2
Out[11]: ['foo', 'bar']

In [12]: a = dict(xxx=22, yyy=33, zzz=k1)

In [13]: b = dict(xxx=22, yyy=33, zzz=k2)

In [14]: a2=copy.deepcopy(a)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-14-5250b577bd4b> in <module>()
----> 1 a2=copy.deepcopy(a)

/usr/lib/python3.2/copy.py in deepcopy(x, memo, _nil)
    145     copier = _deepcopy_dispatch.get(cls)
    146     if copier:
--> 147         y = copier(x, memo)
    148     else:
    149         try:

/usr/lib/python3.2/copy.py in _deepcopy_dict(x, memo)
    234     memo[id(x)] = y
    235     for key, value in x.items():
--> 236         y[deepcopy(key, memo)] = deepcopy(value, memo)
    237     return y
    238 d[dict] = _deepcopy_dict

/usr/lib/python3.2/copy.py in deepcopy(x, memo, _nil)
    172                             raise Error(
    173                                 "un(deep)copyable object of type %s" % cls)
--> 174                 y = _reconstruct(x, rv, 1, memo)
    175
    176     memo[d] = y

/usr/lib/python3.2/copy.py in _reconstruct(x, info, deep, memo)
    283     if deep:
    284         args = deepcopy(args, memo)
--> 285     y = callable(*args)
    286     memo[id(x)] = y
    287

/usr/lib/python3.2/copyreg.py in __newobj__(cls, *args)
     86
     87 def __newobj__(cls, *args):
---> 88     return cls.__new__(cls, *args)
     89
     90 def _slotnames(cls):

TypeError: object.__new__(dict_keys) is not safe, use dict_keys.__new__()

In [15]: b2=copy.deepcopy(b)

This behaviour seems to be accepted in the Python community, and the copy module is unlikely to gain support for copying dict views (or iterators). See this bug for details.
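If views can hide anywhere in a nested structure, one option is to convert them all before copying. The helper below is only a sketch. The name materialize_views is my own invention and not part of the copy module. It handles plain dicts, lists and tuples only, turns dict subclasses and named tuples into their plain counterparts, and does not cope with self-referencing structures:

```python
import copy
from collections.abc import ItemsView, KeysView, ValuesView


def materialize_views(obj):
    """Return obj with every dict view replaced by a list (hypothetical helper)."""
    if isinstance(obj, (KeysView, ValuesView, ItemsView)):
        return [materialize_views(item) for item in obj]
    if isinstance(obj, dict):
        return {key: materialize_views(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [materialize_views(item) for item in obj]
    if isinstance(obj, tuple):
        return tuple(materialize_views(item) for item in obj)
    return obj  # anything else is passed through unchanged


d = {"foo": 1, "bar": 2}  # illustrative data
a = {"xxx": 22, "zzz": d.keys(), "nested": [d.values()]}

# Convert the views first, then deep-copy the result as usual.
a2 = copy.deepcopy(materialize_views(a))
print(sorted(a2["zzz"]), sorted(a2["nested"][0]))
```

The deepcopy() call is still needed afterwards, because the helper passes all other objects through without copying them.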

[UPDATE]

Here are some interesting reads:

About threading issues using list(keys()), which is not atomic.

About more differences between Python 2 and 3.

PEP 3106: Revamping dict.keys(), .values() and .items()