OpenStack Nova: unable to boot an instance from an image (create a new volume) because of a timeout on large images

Although the problem was essentially fixed in early releases, and the hard-coded waits for block devices to become ready, courtesy of some ham-fisted developer, were removed, it is still worth reminding the "young" installers of complex systems what this error looks like and which two parameters can help.

So, the Nova compute manager on a virtualization node may dump a block like the following into the log:

  2018-01-02 15:37:53.465 28582 ERROR nova.compute.manager [instance: d60edaa9-f3bc-4403-b4bf-db33e17811f7]     wait_func(context, volume_id)
  2018-01-02 15:37:53.465 28582 ERROR nova.compute.manager [instance: d60edaa9-f3bc-4403-b4bf-db33e17811f7]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1430, in _await_block_device_map_created
  2018-01-02 15:37:53.465 28582 ERROR nova.compute.manager [instance: d60edaa9-f3bc-4403-b4bf-db33e17811f7]     volume_status=volume_status)
  2018-01-02 15:37:53.465 28582 ERROR nova.compute.manager [instance: d60edaa9-f3bc-4403-b4bf-db33e17811f7] VolumeNotCreated: Volume 02d73f68-5e27-4799-ab9…42e4eb50cd did not finish being created even after we waited 198 seconds or 61 attempts. And its status is creating.
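
Before touching Nova, it is worth confirming on the Cinder side that the volume really is still stuck in the creating state rather than having failed outright. A quick check with the standard OpenStack client might look like this (<volume-uuid> is a placeholder for the UUID reported in the error):

  # substitute the volume UUID from the VolumeNotCreated message
  openstack volume show <volume-uuid> -c status -c size

If the status simply stays at "creating" while the backend is still copying a large image, the timeout described below is the likely culprit.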

The adepts of the vendor's "wrapper" distributions have already chewed this over: https://bugzilla.redhat.com/show_bug.cgi?id=1019401

So it is high time to make use of this answer, at least for the 2016–2017 releases and onward:

 Lee Yarwood 2015-10-16 12:47:24 EDT

(In reply to Andres Toomsalu from comment #12)
> Just for feedback: this issue is still very much alive and causing problems
> in production deployments with volume (SAN) backends - where volume sizes
> are larger than in development environments. Affects heavily backup/snaphot
> restoration process - which easily run into timeout limits.

(In reply to jwang from comment #15)
> I hit this issue again on RHELOSP6.
>
> 1.
> Cinder backend is LVM
>
> 2.
> Glance image virtual size is 110G

Hello Dafna, Andres, jwang, Jack, can you confirm which version of nova you are using in your environments?

I believe the following change introduced configurables in Juno / RHEL OSP 6 and then Icehouse / RHEL OSP 5 (via 2014.1.4) that can be used here :

[juno] Make the block device mapping retries configurable
https://review.openstack.org/#/c/102891/

[stable/icehouse] Make the block device mapping retries configurable
https://review.openstack.org/#/c/129276/

~~~
Make the block device mapping retries configurable

When booting instances passing in block-device and increasing the
volume size, instances can go in to error state if the volume takes
longer to create than the hard code value (max_tries(180)/wait_between(1))
set in nova/compute/manager.py
def _await_block_device_map_created(self,
                                    context,
                                    vol_id,
                                    max_tries=180,
                                    wait_between=1):
To fix this, max_retries/wait_between should be made configurable.
Looking through the different releases, Grizzly was 30, Havana was
60 , IceHouse is 180.
This change adds two configuration options:
a)  `block_device_allocate_retries` which can be set in nova.conf
by the user to configure the number of block device mapping retries.
It defaults to 60 and replaces the max_tries argument in the above method.
b) `block_device_allocate_retries_interval` which allows the user
to specify the time interval between consecutive retries. It defaults to 3
and replaces wait_between argument in the above method.
~~~
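
In practice this boils down to two settings in nova.conf on the compute nodes. With the defaults (60 retries at a 3-second interval) Nova gives up after roughly three minutes, which matches the "61 attempts" in the log above. Below is a minimal sketch assuming the stock /etc/nova/nova.conf layout; the values themselves are illustrative and should be sized to cover however long your backend needs to clone or copy your largest image:

  # /etc/nova/nova.conf on the compute node
  [DEFAULT]
  # how many times Nova polls Cinder while waiting for the volume to become available
  block_device_allocate_retries = 300
  # seconds between polls; the total wait is roughly retries * interval (~15 minutes here)
  block_device_allocate_retries_interval = 3

After changing the file, restart the nova-compute service on the node so the new values take effect.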