Report on Eventlet Discussions at the OpenStack Flamingo PTG
Report created by Hervé Beraud on April 17, 2025
This report analyzes the cross-project discussions about migrating away from Eventlet in OpenStack, as documented in the etherpads from the April 2025 Project Teams Gathering (PTG).
Context and Importance
OpenStack has extensively used Eventlet as a concurrency solution for its services. However, this dependency has become problematic for several reasons:
- Incompatibilities with future Python versions: Eventlet has significant issues with Python 3.13 and with PEP 703 (free-threaded CPython with an optional GIL, informally "GILectomy").
- Declining maintenance: Eventlet's maintenance activity is decreasing, making it difficult to resolve numerous bugs.
- Integration problems: Several incompatibilities have been identified between Eventlet and other components, for example RabbitMQ heartbeat handling.
The OpenStack community has launched an official goal to remove Eventlet due to growing issues with this library, particularly in light of future Python evolutions.
Timeline and Coordination
Initial Schedule
Planned for the 2025.1/2025.2 cycles
New Timeline
The new timeline is available here
Coordination
OFTC channel #openstack-eventlet-removal established to facilitate collaboration
Specific Issues Identified
RabbitMQ Heartbeat
A problem mentioned concerns heartbeat failures with RabbitMQ:
- Timeouts and API failures occur because RabbitMQ does not support Eventlet's "green" environments well
- This is a well-known problem
- A partial workaround that runs heartbeats in a native thread (pthread) exists but has problems with logging
- A patch for oslo.log is pending: review.opendev.org/937729
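The idea of sending heartbeats from a native thread can be sketched with the standard library alone. This is a minimal illustration, not the oslo.messaging implementation: `HeartbeatThread` and the `send_heartbeat` callable are hypothetical names, and the code assumes a process that is not monkey patched. In a monkey-patched process the `threading` module is itself patched, so a real deployment would need the unpatched module to get a true OS thread.

```python
import threading
import time


class HeartbeatThread(threading.Thread):
    """Call `send_heartbeat` every `interval` seconds from a native thread."""

    def __init__(self, send_heartbeat, interval):
        super().__init__(daemon=True, name="heartbeat")
        self._send_heartbeat = send_heartbeat  # hypothetical callable
        self._interval = interval
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait() returns False on timeout and True once stop() is called.
        while not self._stop_event.wait(self._interval):
            self._send_heartbeat()

    def stop(self):
        self._stop_event.set()
        self.join()


if __name__ == "__main__":
    beats = []
    hb = HeartbeatThread(lambda: beats.append(time.monotonic()), interval=0.05)
    hb.start()
    time.sleep(0.3)  # the main thread is busy; heartbeats continue regardless
    hb.stop()
    print(f"sent {len(beats)} heartbeats")
```

Because the heartbeat loop does not depend on other code yielding cooperatively, a long-running request in the main thread should not delay it.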
WSGI Server Performance
Comparative tests mentioned in Swift's etherpad show:
| Server | 1 worker | 4 workers | 8 workers | 16 workers | Notes |
|---|---|---|---|---|---|
| Eventlet WSGI | 8,135 | - | - | - | Only uses 1 process/core |
| Gunicorn WSGI | 5,465 | 19,868 | 25,146 | 38,474 | - |
| FastWsgi WSGI | 83,766 | - | - | - | Only uses 1 thread/core |
| Uvicorn ASGI | 4,169 | 2,048 | 2,161 | 2,156 | - |
| Bjoern WSGI | 2,028 | - | - | - | Only uses 1 thread/core |
These figures (requests/second) show significant performance differences between servers:
- FastWsgi: Appears significantly superior with 83,766 req/s on a single thread/core.
- Gunicorn: Demonstrates impressive scalability with increasing worker counts (up to 38,474 req/s with 16 workers).
- Eventlet: Offers good performance for a single process (8,135 req/s) but is limited to a single process/core.
- Uvicorn (ASGI) and Bjoern: Show more modest performance.
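To put the multi-worker figures in perspective, the speedup and per-worker efficiency relative to a single worker can be computed directly from the table. The short sketch below uses only the numbers reported above; the `scaling` helper is an illustrative name, not part of any benchmark tooling.

```python
# Requests/second figures copied from the table above.
GUNICORN_RPS = {1: 5465, 4: 19868, 8: 25146, 16: 38474}
UVICORN_RPS = {1: 4169, 4: 2048, 8: 2161, 16: 2156}


def scaling(rps_by_workers):
    """Return {workers: (speedup, efficiency)} relative to 1 worker."""
    base = rps_by_workers[1]
    return {
        workers: (rps / base, rps / base / workers)
        for workers, rps in rps_by_workers.items()
    }


if __name__ == "__main__":
    for name, data in (("Gunicorn", GUNICORN_RPS), ("Uvicorn", UVICORN_RPS)):
        for workers, (speedup, efficiency) in scaling(data).items():
            print(f"{name:8s} {workers:2d} workers: "
                  f"{speedup:4.2f}x speedup, {efficiency:4.0%} efficiency")
```

With these figures, Gunicorn reaches roughly a 7.0x speedup at 16 workers (about 44% per-worker efficiency), while Uvicorn's throughput drops to about half of its single-worker figure once more workers are added.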
Technical Committee (TC) Perspective
- Eventlet removal is identified as one of the top priorities for the community
- Sees it as more important than ever due to upcoming Python changes
- Recognizes that it requires a shift in thinking about project ownership, focusing more on accomplishing goals collectively
- Suggests potential rootwrap deprecation during the Eventlet removal process
- Planning to collect information about the current status of migration across projects
Oslo.service New Backend
A key development is the new Threading backend for oslo.service that no longer depends on Eventlet:
- Change in progress: review.opendev.org/945720
- No longer provides WSGI support
- Each service will need to deprecate implementations that depend on Eventlet's WSGI server
- Documentation patch: review.opendev.org/940664
Recommended Migration Strategy
Prioritization
- Start with services rather than libraries
- Focus first on eliminating the Eventlet WSGI server
- Then migrate other concurrency features (greenpools, threadpools)
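Eliminating the Eventlet WSGI server is easier when the service exposes a plain WSGI callable that any server can host. The sketch below is a generic illustration using only the standard library; `application`, `call_app`, and `serve` are hypothetical names, and `wsgiref` is a development-only server rather than a recommendation from the PTG discussions.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable; a real service would route on PATH_INFO."""
    body = b'{"status": "ok"}'
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process, without any server."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


def serve(host="127.0.0.1", port=8080):
    """Development-only server from the stdlib; blocks until interrupted."""
    with make_server(host, port, application) as httpd:
        httpd.serve_forever()
```

Since `application` has no dependency on a particular server, the same callable could in principle be hosted by an external server such as Gunicorn, uWSGI, or mod_wsgi, which keeps the concurrency model out of the service code.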
Practical Steps
- Identify Eventlet use cases specific to each project
- Prepare migration by following the guidelines
- Gradually deprecate Eventlet-related features
- Develop alternatives for shared and private APIs
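The first step, identifying Eventlet use cases, can begin with a static scan for direct imports. The following is a minimal sketch built on the standard `ast` module; `find_eventlet_imports` and `SAMPLE` are illustrative names. It only detects explicit imports, so dynamic imports and indirect use through other libraries would still need manual review.

```python
import ast


def find_eventlet_imports(source, filename="<string>"):
    """Return sorted (line, module) pairs for every eventlet import."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if name == "eventlet" or name.startswith("eventlet."):
                hits.append((node.lineno, name))
    return sorted(hits)


SAMPLE = """\
import os
import eventlet
from eventlet import greenpool
from eventlet.green import subprocess
import threading
"""

if __name__ == "__main__":
    for lineno, module in find_eventlet_imports(SAMPLE):
        print(f"line {lineno}: {module}")
```

Running the function over each file in a project tree gives a rough inventory of where Eventlet is imported directly, which can then be grouped by use case.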
Challenges and Open Questions
Dual Compatibility During Transition
An important debate concerns the possibility of temporarily maintaining services that work both with and without Eventlet:
Arguments for
- Would allow falling back to Eventlet if bugs are discovered in the native thread mode
- Would facilitate a gradual and safer migration
Arguments against
- Considerable complexity in maintaining two different concurrency modes
- Eventlet itself already contains numerous bugs, making a fallback strategy difficult to implement
- Difficulties related to side effects of Eventlet's monkey patching
Project-specific decisions
- Nova: After discussions, the project has decided to support both modes (Eventlet and native threads) during the transition to avoid a "big bang" migration. An environment variable mechanism (OS_NOVA_DISABLE_EVENTLET_PATCHING=1) is planned to control the concurrency mode.
- Glance: Has been usable with both Eventlet and native threads for several years, demonstrating the feasibility of dual mode support.
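An environment-variable switch of this kind could look roughly like the sketch below. Only the variable name comes from the report; the helper names, the set of accepted truthy values, and the returned mode strings are assumptions made for illustration and do not describe Nova's actual implementation.

```python
import os

ENV_VAR = "OS_NOVA_DISABLE_EVENTLET_PATCHING"  # name taken from the report


def eventlet_patching_disabled(environ=None):
    """True when the variable is set to a truthy value (assumed semantics)."""
    environ = os.environ if environ is None else environ
    value = environ.get(ENV_VAR, "")
    return value.strip().lower() in {"1", "true", "yes"}


def configure_concurrency(environ=None):
    """Monkey patch unless disabled; return the selected mode name."""
    if eventlet_patching_disabled(environ):
        return "threading"
    # Imported lazily so native-thread deployments never load eventlet.
    import eventlet
    eventlet.monkey_patch()
    return "eventlet"
```

A wrapper like this would need to run at the very start of the entry point, before other modules are imported, because monkey patching only affects code that looks up the patched modules afterwards.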
Partial vs Complete Migration
Experiences vary across projects:
- Neutron: Encountered challenges with a gradual approach
- Glance: Successfully maintained compatibility with both modes for several years
- Swift: Considering a "canary" node approach starting with proxies
- Nova: Plans to go service by service, with some services running in native thread mode while others still use Eventlet in the same release
- Octavia: Has demonstrated a successful complete migration, now serving as a case study
- Mistral: Has chosen a comprehensive approach, nearly completing its migration with minimal incremental steps
- Manila: Adopting a phased approach, planning for completion over multiple cycles
- Cinder: Taking a component-by-component approach, starting with Volume Manager
- Heat: Planning a complete discontinuation of WSGI server implementations
- Ironic: Facing complex migration challenges, particularly with IPA
- Designate: Focusing on a complete migration as their top priority for the Flamingo cycle
- Blazar: Considering a gradual transition, evaluating different WSGI alternatives
Current Migration Status by Project
Projects with Significant Progress
Octavia
- Has been running without Eventlet since 2017
- Their approach is now documented as a case study to help other projects
- Case study available at: removal.eventlet.org/guide/case-studies/octavia/
- Uses cotyledon
Neutron
- Has made significant progress in removing its dependency on Eventlet through numerous changes
- Successfully implemented an approach where code can work with both Eventlet and native threads during transition
- Many patches found under topic "eventlet-removal": Neutron patches
Mistral
- Has almost completed its migration from Eventlet
- Many of the changes are found under: Mistral patches
Glance
- Can be deployed without Eventlet
- Some optional features (scrubber) still depend on it
- Tests still depend on Eventlet
- Has been usable with both Eventlet and native threads for several years
Oslo Libraries
- Have deprecated all their Eventlet compatibility features
- Working on a new Threading backend for oslo.service that doesn't depend on Eventlet
- Added asyncio support in oslo.db, enabling asynchronous database operations
- Significant progress in making core libraries Eventlet-free
Projects with Plans in Progress
Nova
- Has decided to support both modes during the transition
- Planning to use an environment variable mechanism (OS_NOVA_DISABLE_EVENTLET_PATCHING=1)
- Plans to go service-by-service with some services running in native thread mode while others still use Eventlet
Swift
- Has started discussions about alternative WSGI servers
- Considering options including Gunicorn, Uvicorn, FastWsgi, or Bjoern
- Planning a Proof of Concept (POC) using the proxy server as a starter
- Considering a "canary node" approach to migration, starting with proxies
Manila
- Removed monkey patching in the client during the Epoxy cycle
- Planning to remove WSGI uses and adopt oslo.service's new Threading-based backend
- Working on this in the Flamingo cycle, aiming for completion in the G cycle (Guppy)
- Looking at Neutron's progress with periodic tasks for inspiration
Cinder
- Has started working on removing Eventlet dependencies during the Flamingo cycle
- Taking a gradual approach with multiple team members involved
- Planning to start with the Volume Manager, which is a key component
- Will apply lessons learned from Volume Manager work to the Backup Manager later
Heat
- Has identified Eventlet removal as one of their upcoming changes
- Planning to use the new oslo.service implementation without Eventlet once available
- Will not provide a WSGI server implementation with thread model
- Preferring external server mechanisms such as uwsgi or httpd+mod_wsgi
Designate
- Has identified Eventlet removal as their top priority for the Flamingo cycle
- Started tracking relevant changes in their Gerrit repository
Blazar
- Acknowledges the need to move away from Eventlet's WSGI implementation
- Planning to reuse examples from other projects as they migrate away from Eventlet
- Considering alternatives including mod_wsgi and uwsgi
Watcher
- Needs to examine how Eventlet is used in the project
- Plans to start proof-of-concept work for removal
- Acknowledges timing pressure due to Ubuntu 25.04 shipping with Python 3.13 by default
Ironic
- Has identified Eventlet removal as one of their priorities for the Flamingo cycle
- Facing several challenges with different components such as IPA
- Recognizes that one person can't own the entire migration
- Team members are volunteering for specific components
Nova's Specific Migration Plan
Nova has dedicated significant discussion to their Eventlet removal approach during the Flamingo PTG:
Flamingo Cycle Tasks (2025.2)
API Modernization
- Deprecate the oslo.service.wsgi based standalone eventlet server mode for nova-api
- Remove direct eventlet imports where possible
- Replace eventlet primitives with stdlib primitives where possible
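As a rough illustration of swapping Eventlet primitives for standard-library ones, the sketch below runs a small worker pool on native threads. `run_workers` is a hypothetical helper, not Nova code, and the mapping in the comments is approximate: green threads switch cooperatively while native threads are preempted, so shared state that was implicitly safe under Eventlet may need explicit locking.

```python
import queue
import threading

# Rough stdlib counterparts (semantics differ, see the note above):
#   eventlet.spawn(fn)            -> threading.Thread(target=fn).start()
#   eventlet.queue.Queue          -> queue.Queue
#   eventlet.semaphore.Semaphore  -> threading.Semaphore
#   eventlet.sleep(n)             -> time.sleep(n)

_SENTINEL = object()


def run_workers(items, handler, num_workers=4):
    """Process `items` with `handler` on native threads; return results."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()  # threads preempt, so guard shared state

    def worker():
        while True:
            item = tasks.get()
            if item is _SENTINEL:
                return
            value = handler(item)
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(_SENTINEL)  # one stop marker per worker
    for t in threads:
        t.join()
    return results
```

Results arrive in completion order rather than submission order, and the sketch does not handle exceptions raised by `handler`; both points would need attention in real code.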
Architecture Changes
- Create an entry point wrapper to configure concurrency mode via ENV variable
- Move Nova commands that don't use Eventlet to a separate module without monkey_patch
- Check novncproxy Eventlet usage and non-monkey-patched performance
Performance Improvements
- Remove Eventlet from nova-api scatter/gather logic
- Add metric gathering about threadpool state when load is high
- Add SQL statement timeout for nova-api
- Enable connection pooling (qpool) for timeout implementation
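A scatter/gather step without Eventlet can be approximated with `concurrent.futures`. This is a minimal sketch under stated assumptions: `scatter_gather`, its parameters, and the idea of a per-target `query` callable are illustrative and do not describe Nova's actual code. It requires Python 3.9 or later for `cancel_futures`.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def scatter_gather(targets, query, timeout, max_workers=8):
    """Run `query(target)` for each target in parallel.

    Returns (results, failed): results maps target -> value; failed
    lists targets that raised or did not finish within `timeout` seconds.
    """
    executor = ThreadPoolExecutor(max_workers=max_workers)
    futures = {executor.submit(query, t): t for t in targets}
    done, not_done = wait(futures, timeout=timeout)
    # Do not block on stragglers; queued-but-unstarted work is cancelled.
    executor.shutdown(wait=False, cancel_futures=True)

    results, failed = {}, [futures[f] for f in not_done]
    for future in done:
        target = futures[future]
        if future.exception() is not None:
            failed.append(target)
        else:
            results[target] = future.result()
    return results, failed
```

Unlike a green thread, a native thread cannot be killed from outside, so a query that exceeds the timeout keeps running in the background until it returns. That may be one reason a statement timeout on the database side is a useful companion to this pattern.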
Testing Strategy
- Not maintaining functional tests to work in both threading modes
- Moving tests to native threading mode to uncover issues early
- Using Tempest jobs to test both threading modes
Guppy Cycle Tasks (2026.1)
- Converting the core event loop to native threads
- Using the new thread-based backend from oslo.service when available
Tracking
- Blueprint: blueprints.launchpad.net/nova/+spec/eventlet-removal-part-1
- Progress etherpad: etherpad.opendev.org/p/nova-eventlet-removal
- Developer blog: Balazs Gibizer's blog on Eventlet removal progress
Resources
Documentation & Guidelines
Development Resources
Conclusion
The migration away from Eventlet represents a major technical challenge for OpenStack but is becoming increasingly urgent given Python's evolution and growing issues with Eventlet. Discussions during the April 2025 PTG show consensus on the necessity of this migration, with significant progress in certain projects and an increasingly clear strategy.
The success of this migration will require continuous coordination between teams and a pragmatic approach to managing the transition, taking into account the specifics of each project. Projects like Nova are adopting a dual-mode strategy to ensure stability during the transition period, while leveraging lessons learned from projects that have already made significant progress such as Octavia, Neutron, and Mistral. Octavia, which has run without Eventlet since 2017, serves as a particularly valuable case study, demonstrating that a full transition is achievable and sustainable over the long term.