Content here is by Michael Still mikal@stillhq.com. All opinions are my own.


Sun, 14 Dec 2014



Ghost

posted at: 15:48 | path: /book/John_Ringo | permanent link to this entry


How are we going with Nova Kilo specs after our review day?

posted at: 15:15 | path: /openstack/kilo | permanent link to this entry


Soft deleting instances and the reclaim_instance_interval in Nova

posted at: 13:51 | path: /openstack | permanent link to this entry


Mon, 01 Dec 2014



Specs for Kilo, an update

posted at: 20:13 | path: /openstack/kilo | permanent link to this entry


Tue, 25 Nov 2014



The Human Division

posted at: 17:14 | path: /book/John_Scalzi | permanent link to this entry


Sun, 16 Nov 2014



Fast Food Nation

posted at: 01:43 | path: /book/Eric_Schlosser | permanent link to this entry


Thu, 23 Oct 2014



Specs for Kilo

posted at: 19:27 | path: /openstack/kilo | permanent link to this entry


Mon, 13 Oct 2014



One week of Nova Kilo specifications

posted at: 03:27 | path: /openstack/kilo | permanent link to this entry


Sun, 12 Oct 2014



Compute Kilo specs are open

    From my email last week on the topic:
    I am pleased to announce that the specs process for nova in kilo is
    now open. There are some tweaks to the previous process, so please
    read this entire email before uploading your spec!
    
    Blueprints approved in Juno
    ===========================
    
    For specs approved in Juno, there is a fast track approval process for
    Kilo. The steps to get your spec re-approved are:
    
     - Copy your spec from the specs/juno/approved directory to the
    specs/kilo/approved directory. Note that if we declared your spec to
    be a "partial" implementation in Juno, it might be in the implemented
    directory. This was rare however.
     - Update the spec to match the new template
     - Commit, with the "Previously-approved: juno" commit message tag
     - Upload using git review as normal
    
    Reviewers will still do a full review of the spec, we are not offering
    a rubber stamp of previously approved specs. However, we are requiring
    only one +2 to merge these previously approved specs, so the process
    should be a lot faster.
    
    A note for core reviewers here -- please include a short note on why
    you're doing a single +2 approval on the spec so future generations
    remember why.
    
    Trivial blueprints
    ==================
    
    We are not requiring specs for trivial blueprints in Kilo. Instead,
    create a blueprint in Launchpad
    at https://blueprints.launchpad.net/nova/+addspec and target the
    specification to Kilo. New, targeted, unapproved specs will be
    reviewed in weekly nova meetings. If it is agreed they are indeed
    trivial in the meeting, they will be approved.
    
    Other proposals
    ===============
    
    For other proposals, the process is the same as Juno... Propose a spec
    review against the specs/kilo/approved directory and we'll review it
    from there.
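
    As a concrete illustration, the fast-track steps above boil down to a copy, an edit, and a commit carrying the special tag. The sketch below uses an invented spec name, and a throwaway local repository stands in for a real clone of openstack/nova-specs so the commands can run anywhere; in practice you would work in a real checkout and finish with git review.

```shell
# Hypothetical walk-through of the fast-track steps, with an invented
# spec name. A throwaway local repository stands in for a real clone of
# openstack/nova-specs so the commands can run anywhere.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p specs/juno/approved specs/kilo/approved
echo "My approved Juno spec" > specs/juno/approved/my-spec.rst
git add -A
git -c user.name=demo -c user.email=demo@example.com commit -qm "seed repo"

# Step 1: copy the Juno spec into the Kilo approved directory.
cp specs/juno/approved/my-spec.rst specs/kilo/approved/my-spec.rst
# Step 2: edit the copied spec here to match the new Kilo template.
# Step 3: commit with the fast-track tag in the commit message.
git add specs/kilo/approved/my-spec.rst
git -c user.name=demo -c user.email=demo@example.com commit -qm "Re-propose my-spec for Kilo

Previously-approved: juno"
# Step 4: in a real checkout, upload for review as usual:
#   git review
```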
    


    After a week I'm seeing something interesting. In Juno the specs process was new, and we saw a pause in the development cycle while people actually wrote down their designs before sending the code. This time around people know what to expect, and there are left-over specs from Juno lying around. We're therefore seeing specs approved much faster in Kilo than we did in Juno. This should reduce the effect of the "pipeline flush" that we saw in Juno.

    So far we have five approved specs after only a week.

    Tags for this post: openstack kilo blueprints spec nova
    Related posts: One week of Nova Kilo specifications; How are we going with Nova Kilo specs after our review day?; Specs for Kilo; Specs for Kilo; Juno nova mid-cycle meetup summary: nova-network to Neutron migration; Thoughts from the PTL

posted at: 16:39 | path: /openstack/kilo | permanent link to this entry


Wed, 08 Oct 2014



Lock In

posted at: 02:43 | path: /book/John_Scalzi | permanent link to this entry


Tue, 30 Sep 2014



On layers

    There's been a lot of talk recently about what we should include in OpenStack and what is out of scope. This is interesting, in that many of us used to believe that we should do "everything". I think what's changed is that we're learning that solving all the problems in the world is hard, and that we need to re-focus on our core products. In this post I want to talk through the various "layers" proposals that have been made in the last month or so. Layers don't directly address what we should include in OpenStack or not, but they are a useful mechanism for trying to break up OpenStack into simpler-to-examine chunks, and I think that makes them useful in their own right.

    I would also like to address what I believe the scope of the OpenStack project should be, but I fear that would make this post so long that no one would ever actually read it. Instead, I'll cover that in a later post in this series. For now, let's explore what people are proposing as a layering model for OpenStack.

    What are layers?

    Dean Troyer did a good job of describing a layers model for the OpenStack project on his blog quite a while ago. He proposed the following layers (this is a summary, you should really read his post):

    • layer 0: operating system and Oslo
    • layer 1: basic services -- Keystone, Glance, Nova
    • layer 2: extended basics -- Neutron, Cinder, Swift, Ironic
    • layer 3: optional services -- Horizon and Ceilometer
    • layer 4: turtles all the way up -- Heat, Trove, Moniker / Designate, Marconi / Zaqar


    Dean notes that Neutron would move to layer 1 when nova-network goes away and Neutron becomes required for all compute deployments. Dean's post was also written over a year ago, so it misses services like Barbican that have appeared since then. Services are only allowed to require services from lower-numbered layers, but can use services from higher-numbered layers as optional add-ins. So Nova, for example, can use Neutron, but cannot require it until it moves into layer 1. Similarly, there have been proposals to add Ceilometer as a dependency for scheduling instances in Nova, and if we were to do that then we would need to move Ceilometer down to layer 1 as well. (I think doing that would be a mistake, by the way, and have argued against it during at least two summits.)
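
    This dependency rule is simple enough to write down. The toy helper below is my own invention rather than real OpenStack code, and encodes one reading of the rule: a hard dependency must not sit in a higher layer than the service declaring it, while higher-layer services remain optional add-ins.

```python
# Toy encoding of the layering rule from Dean's model. The layer
# numbers follow the summary above; the helper itself is invented
# purely for illustration.

LAYERS = {
    "oslo": 0,
    "keystone": 1, "glance": 1, "nova": 1,
    "neutron": 2, "cinder": 2, "swift": 2, "ironic": 2,
    "horizon": 3, "ceilometer": 3,
    "heat": 4, "trove": 4, "designate": 4, "zaqar": 4,
}

def may_require(service, dependency):
    """May 'service' declare a hard (non-optional) dependency on 'dependency'?

    A hard dependency must not sit in a higher layer than the service
    itself; anything in a higher layer can only be an optional add-in.
    """
    return LAYERS[dependency] <= LAYERS[service]
```

    So, for example, Heat (layer 4) may require Nova (layer 1), but Nova may not require Neutron until Neutron moves down to layer 1.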

    Sean Dague re-ignited this discussion with his own blog post relatively recently. Sean proposes new names for most of the layers, but the intent remains the same -- a compute-centric view of the services that are required to build a working OpenStack deployment. Sean and Dean's layer definitions are otherwise strongly aligned, and Sean notes that the probability of seeing something deployed at a given installation reduces as the layer count increases -- so for example Trove is deployed far less commonly than Nova, because the set of people who want a managed database as a service is smaller than the set of people who just want to be able to boot instances.

    Now, I'm not sure I agree with the compute-centric nature of the two layers proposals mentioned so far. I see people installing just Swift to solve a storage problem, and I think that's a completely valid use of OpenStack that should be supported as a first-class citizen. On the other hand, resolving my concern within the layers model is trivial -- we just move Swift to layer 1.

    What do layers give us?

    Sean makes a good point about the complexity of OpenStack installs and how we scare away new users. I agree completely -- we show people our architecture diagrams which are deliberately confusing, and then we wonder why they're not impressed. I think we do it because we're proud of the scope of the thing we've built, but I think our audiences walk away thinking that we don't really know what problem we're trying to solve. Do I really need to deploy Horizon to have working compute? No, of course not, but our architecture diagrams don't make that obvious. I gave a talk along these lines at pyconau, and I think as a community we need to be better at explaining to people what we're trying to do, while remembering that not everyone is as excited about writing a whole heap of cloud infrastructure code as we are. This is also why the OpenStack miniconf at linux.conf.au 2015 has pivoted from being a generic OpenStack chatfest to being something more solidly focussed on issues of interest to deployers -- we're just not great at talking to our users, and we need to reboot the conversation at community conferences until it's something which meets their needs.


    We intend this diagram to amaze and confuse our victims


    Agreeing on a set of layers gives us a framework within which to describe OpenStack to our users. It lets us communicate the services we think are basic and always required, versus those which are icing on the cake. It also lets us explain the dependencies between projects better, and that helps deployers work out what order to deploy things in.

    Do layers help us work out what OpenStack should focus on?

    Sean's blog post then pivots and starts talking about the size of the OpenStack ecosystem -- or the "size of our tent" as he phrases it. While I agree that we need to shrink the number of projects we're working on at the moment, I feel that the blog post is missing a logical link between the previous layers discussion and the tent size conundrum. It feels to me that Sean wanted to propose that OpenStack focus on a specific set of layers, but didn't quite get there for whatever reason.

    Next Monty Taylor had a go at furthering this conversation with his own blog post on the topic. Monty starts by making a very important point -- he (like all involved) both want the OpenStack community to be as inclusive as possible. I want lots of interesting people at the design summits, even if they don't work directly on projects that OpenStack ships. You can be a part of the OpenStack community without having our logo on your product.

    A concrete example of including non-OpenStack projects in our wider community was visible at the Atlanta summit -- I know for a fact that there were software engineers at the summit who work on Google Compute Engine. I know this because I used to work with them at Google when I was a SRE there. I have no problem with people working on competing products being at our summits, as long as they are there to contribute meaningfully in the sessions, and not just take from us. It needs to be a two way street. Another concrete example is Ceph. I think Ceph is cool, and I'm completely fine with people using it as part of their OpenStack deploy. What upsets me is when people conflate Ceph with OpenStack. They are different. They're separate. And that is fine. Let's just not confuse people by saying Ceph is part of the OpenStack project -- it simply isn't because it doesn't fall under our governance model. Ceph is still a valued member of our community and more than welcome at our summits.

    Do layers help us work our what to focus OpenStack on for now? I think they do. Should we simply say that we're only going to work on a single layer? Absolutely not. What we've tried to do up until now is have OpenStack be a single big thing, what we call "the integrated release". I think layers gives us a tool to find logical ways to break that thing up. Perhaps we need a smaller integrated release, but then continue with the other projects but on their own release cycles? Or perhaps they release at the same time, but we don't block the release of a layer 1 service on the basis of release critical bugs in a layer 4 service?

    Is there consensus on what sits in each layer?

    Looking at the posts I can find on this topic so far, I'd have to say the answer is no. We're close, but we're not aligned yet. For example, one proposal has a tweak to the previously proposed layer model that moves Cinder, Designate and Neutron down into layer 1 (basic services). The author argues that this is because stateless cloud isn't particularly useful to users of OpenStack. To be honest, however, I think this is wrong. I can see that stateless cloud isn't super useful by itself, but that argument assumes OpenStack is the only piece of infrastructure that a given organization has. Perhaps that's true for the public cloud case, but the vast majority of OpenStack deployments at this point are private clouds. So, you're an existing IT organization and you're deploying OpenStack to increase the level of flexibility in compute resources. You don't need to deploy Cinder or Designate to do that. Let's take the storage case for a second -- our hypothetical IT organization probably already has some form of storage -- a SAN, or NFS appliances, or something like that. So stateful cloud is easy for them -- they just have their instances mount resources from those existing storage pools like they would on any other machine. Eventually they'll decide that hand-managing that is horrible and move to Cinder, but that's probably later, once they've gotten through the initial baby step of deploying Nova, Glance and Keystone.

    The first step to using layers to decide what we should focus on is to decide what is in each layer. I think the conversation needs to revolve around that for now, because if we drift off into debating whether sitting in a given layer means you're voted off the OpenStack island, then we'll never even come up with a set of agreed layers.

    Let's ignore tents for now

    The size of the OpenStack "tent" is the metaphor being used at the moment for working out what to include in OpenStack. As I say above, I think we need to reach agreement on what is in each layer before we can move on to that very important conversation.

    Conclusion

    Given the focus of this post is the layers model, I want to stop introducing new concepts here for now. Instead let me summarize where I stand so far -- I think the layers model is useful. I also think the layers should be an inverted pyramid -- layer 1 should be as small as possible for example. This is because of the dependency model that the layers model proposes -- it is important to keep the list of things that a layer 2 service must use as small and coherent as possible. Another reason to keep the lower layers as small as possible is because each layer represents the smallest possible increment of an OpenStack deployment that we think is reasonable. We believe it is currently reasonable to deploy Nova without Cinder or Neutron for example.

    Most importantly of all, having those incremental stages of OpenStack deployment gives us a framework we have been missing in talking to our deployers and users. It makes OpenStack less confusing to outsiders, as it gives them bite sized morsels to consume one at a time.

    So here are the layers as I see them for now:

    • layer 0: operating system, and Oslo
    • layer 1: basic services -- Keystone, Glance, Nova, and Swift
    • layer 2: extended basics -- Neutron, Cinder, and Ironic
    • layer 3: optional services -- Horizon, and Ceilometer
    • layer 4: application services -- Heat, Trove, Designate, and Zaqar


    I am not saying that everything inside a single layer is required to be deployed simultaneously, but I do think it's reasonable for Ceilometer to assume that Swift is installed and functioning. The big difference here between my view of layers and that of Dean, Sean and Monty is that I think that Swift is a layer 1 service -- it provides basic functionality that may be assumed to exist by services above it in the model.

    I believe that when projects come to the Technical Committee requesting incubation or integration, they should specify what layer they see their project sitting at, and the justification for a lower layer number should be harder than that for a higher layer. So for example, we should be reasonably willing to accept proposals at layer 4, whilst we should be super concerned about the implications of adding another project at layer 1.

    In the next post in this series I'll try to address the size of the OpenStack "tent", and what projects we should be focussing on.

    Tags for this post: openstack kilo technical committee tc layers
    Related posts: One week of Nova Kilo specifications; How are we going with Nova Kilo specs after our review day?; Specs for Kilo; Juno TC Candidacy; Compute Kilo specs are open; Specs for Kilo

posted at: 18:57 | path: /openstack/kilo | permanent link to this entry


Blueprints implemented in Nova during Juno

posted at: 13:56 | path: /openstack/juno | permanent link to this entry


Mon, 29 Sep 2014



Chronological list of Juno Nova mid-cycle meetup posts

posted at: 23:10 | path: /openstack/juno | permanent link to this entry


My candidacy for Kilo Compute PTL

    This is mostly historical at this point, but I forgot to post it here when I emailed it a week or so ago. So, for future reference:

    I'd like another term as Compute PTL, if you'll have me.
    
    We live in interesting times. OpenStack has clearly gained a large
    amount of mind share in the open cloud marketplace, with Nova being a
    very commonly deployed component. Yet, we don't have a fantastic
    container solution, which is our biggest feature gap at this point.
    Worse -- we have a code base with a huge number of bugs filed against
    it, an unreliable gate because of subtle bugs in our code and
    interactions with other OpenStack code, and have a continued need to
    add features to stay relevant. These are hard problems to solve.
    
    Interestingly, I think the solution to these problems calls for a
    social approach, much like I argued for in my Juno PTL candidacy
    email. The problems we face aren't purely technical -- we need to work
    out how to pay down our technical debt without blocking all new
    features. We also need to ask for understanding and patience from
    those feature authors as we try and improve the foundation they are
    building on.
    
    The specifications process we used in Juno helped with these problems,
    but one of the things we've learned from the experiment is that we
    don't require specifications for all changes. Let's take an approach
    where trivial changes (no API changes, only one review to implement)
    don't require a specification. There will of course sometimes be
    variations on that rule if we discover something, but it means that
    many micro-features will be unblocked.
    
    In terms of technical debt, I don't personally believe that pulling
    all hypervisor drivers out of Nova fixes the problems we face, it just
    moves the technical debt to a different repository. However, we
    clearly need to discuss the way forward at the summit, and come up
    with some sort of plan. If we do something like this, then I am not
    sure that the hypervisor driver interface is the right place to do
    that work -- I'd rather see something closer to the hypervisor itself
    so that the Nova business logic stays with Nova.
    
    Kilo is also the release where we need to get the v2.1 API work done
    now that we finally have a shared vision for how to progress. It took
    us a long time to get to a good shared vision there, so we need to
    ensure that we see that work through to the end.
    
    We live in interesting times, but they're exciting as well.
    


    I have since been elected unopposed, so thanks for that!

    Tags for this post: openstack kilo compute ptl
    Related posts: One week of Nova Kilo specifications; How are we going with Nova Kilo specs after our review day?; Specs for Kilo; Thoughts from the PTL; On layers; Expectations of core reviewers

posted at: 18:34 | path: /openstack/kilo | permanent link to this entry


Fri, 26 Sep 2014



The Decline and Fall of IBM: End of an American Icon?

posted at: 00:39 | path: /book/Robert_Cringely | permanent link to this entry


Thu, 21 Aug 2014



Juno nova mid-cycle meetup summary: conclusion

posted at: 23:47 | path: /openstack/juno | permanent link to this entry


Juno nova mid-cycle meetup summary: the next generation Nova API

    This is the final post in my series covering the highlights from the Juno Nova mid-cycle meetup. In this post I will cover our next generation API, which used to be called the v3 API but is largely now referred to as the v2.1 API. Getting to this point has been one of the more painful processes I think I've ever seen in Nova's development history, and I think we've learnt some important things about how large distributed projects operate along the way. My hope is that we remember these lessons next time we hit something as contentious as our API re-write has been.

    Now on to the API itself. It started out as an attempt to improve our current API to be more maintainable and less confusing to our users. We deliberately decided that we would not focus on adding features, but instead attempt to reduce as much technical debt as possible. This development effort went on for about a year before we realized we'd made a mistake. The mistake we made is that we assumed that our users would agree it was trivial to move to a new API, and that they'd do that even if there weren't compelling new features, which it turned out was entirely incorrect.

    I want to make it clear that this wasn't a mistake on the part of the v3 API team. They implemented what the technical leadership of Nova at the time asked for, and were very surprised when we discovered our mistake. We've now spent over a release cycle trying to recover from that mistake as gracefully as possible, but the upside is that the API we will be delivering is significantly more future proof than what we have in the current v2 API.

    At the Atlanta Juno summit, it was agreed that the v3 API would never ship in its current form, and that we would instead provide a v2.1 API. This API would be 99% compatible with the current v2 API, with the incompatibilities being things like: if you pass a malformed parameter to the API, we will now tell you instead of silently ignoring it. We call this 'input validation'. The other thing we are going to add in the v2.1 API is a system of 'micro-versions', which allows a client to specify what version of the API it understands, and the server to gracefully degrade to older versions if required.
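
    To make the 'input validation' distinction concrete, here is a toy sketch of the behavioural change -- the parameter names are invented and this is not actual Nova code:

```python
# Toy contrast between the old v2 behaviour (silently drop unknown
# parameters) and the v2.1 'input validation' behaviour (reject them).
# The parameter names are invented for illustration.

KNOWN_PARAMS = {"name", "flavor", "image"}

def v2_style(params):
    # Old behaviour: unknown keys are silently ignored, so a typo in a
    # parameter name simply vanishes and the caller never finds out.
    return {k: v for k, v in params.items() if k in KNOWN_PARAMS}

def v21_style(params):
    # New behaviour: malformed input is an error the caller hears about.
    unknown = set(params) - KNOWN_PARAMS
    if unknown:
        raise ValueError("400 Bad Request: unknown parameters: %s"
                         % ", ".join(sorted(unknown)))
    return dict(params)
```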

    This micro-version system is important, because the next step is to start adding the v3 cleanups and fixes into the v2.1 API as a series of micro-versions. That way we can drag the majority of our users with us into a better future, without abandoning users of older API versions. I should note at this point that the mechanics for deciding the minimum micro-version that a given release of Nova will support are largely undefined at the moment. My instinct is that we will tie this to stable release versions in some way; if your client dates back to a release of Nova that we no longer support, then we might expect you to upgrade. However, that hasn't been debated yet, so don't take my thoughts on it as rigid truth.
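
    Since the mechanics were undefined at the time, treat the following as pure speculation: a minimal sketch of what server-side micro-version negotiation might look like, with all names and version numbers invented for illustration.

```python
# Speculative sketch of server-side micro-version negotiation; the real
# mechanism was undefined when this was written, and every name and
# version number here is invented.

SUPPORTED_MIN = (2, 1)   # oldest micro-version this server still speaks
SUPPORTED_MAX = (2, 7)   # newest micro-version implemented

def parse_version(header_value):
    """Parse a 'major.minor' string like '2.4' into a comparable tuple."""
    major, minor = header_value.split(".")
    return (int(major), int(minor))

def negotiate(requested_header):
    """Pick the micro-version to serve, or raise if we can't satisfy it."""
    if requested_header is None:
        # Legacy client sent no version header: degrade gracefully to
        # the oldest supported behaviour.
        return SUPPORTED_MIN
    requested = parse_version(requested_header)
    if requested < SUPPORTED_MIN or requested > SUPPORTED_MAX:
        raise ValueError("406 Not Acceptable: unsupported micro-version")
    # Serve exactly what the client asked for; behaviour added in newer
    # micro-versions stays hidden from older clients.
    return requested
```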

    Frustratingly, the intent of the v2.1 API has been agreed and unchanged since the Atlanta summit, yet we're late in the Juno release and most of the work isn't done yet. This is because we got bogged down in the mechanics of how micro-versions will work, and how the translation for older API versions will work inside the Nova code later on. We finally unblocked this at the mid-cycle meetup, which means this work can finally progress again.

    The main concern that we needed to resolve at the mid-cycle was the belief that if the v2.1 API was implemented as a series of translations on top of the v3 code, then the translation layer would be quite thick and complicated. This raises issues of maintainability, as well as the amount of code we need to understand. The API team has now agreed to produce an API implementation that is just the v2.1 functionality, and will then layer things on top of that. This is actually invisible to users of the API, but it leaves us with an implementation where changes after v2.1 are additive, which should be easier to maintain.

    One of the other changes in the original v3 code is that we stopped proxying functionality for Neutron, Cinder and Glance. With the decision to implement a v2.1 API instead, we will need to rebuild that proxying implementation. To unblock v2.1, and based on advice from the HP and Rackspace public cloud teams, we have decided to delay implementing these proxies. So, the first version of the v2.1 API we ship will not have proxies, but later versions will add them in. The current v2 API implementation will not be removed until all the proxies have been added to v2.1. This is prompted by the belief that many advanced API users don't use the Nova API proxies, and therefore could move to v2.1 without them being implemented.

    Finally, I want to thank the Nova API team, especially Chris Yeoh and Kenichi Oomichi for their patience with us while we have worked through these complicated issues. It's much appreciated, and I find them a consistent pleasure to work with.

    That brings us to the end of my summary of the Nova Juno mid-cycle meetup. I'll write up a quick summary post that ties all of the posts together, but apart from that this series is now finished. Thanks for following along.

    Tags for this post: openstack juno nova mid-cycle summary api v3 v2.1
    Related posts: Juno nova mid-cycle meetup summary: nova-network to Neutron migration; Chronological list of Juno Nova mid-cycle meetup posts; Juno nova mid-cycle meetup summary: conclusion; Juno nova mid-cycle meetup summary: social issues; Juno nova mid-cycle meetup summary: scheduler; Juno nova mid-cycle meetup summary: containers

posted at: 16:52 | path: /openstack/juno | permanent link to this entry


Don't Tell Mum I Work On The Rigs

posted at: 13:45 | path: /book/Paul_Carter | permanent link to this entry


Tue, 19 Aug 2014



Juno nova mid-cycle meetup summary: nova-network to Neutron migration

    This will be my second last post about the Juno Nova mid-cycle meetup, which covers the state of play for work on the nova-network to Neutron upgrade.

    First off, some background information. Neutron (formerly Quantum) was developed over a long period of time to replace nova-network, and added to the OpenStack Folsom release. The development of new features for nova-network was frozen in the Nova code base, so that users would transition to Neutron. Unfortunately the transition period took longer than expected. We ended up having to unfreeze development of nova-network, in order to fix reliability problems that were affecting our CI gating and the reliability of deployments for existing nova-network users. Also, at least two OpenStack companies were carrying significant feature patches for nova-network, which we wanted to merge into the main code base.

    You can see the announcement at http://lists.openstack.org/pipermail/openstack-dev/2014-January/025824.html. The main enhancements post-freeze were a conversion to use our new objects infrastructure (and therefore conductor), as well as features that were being developed by Nebula. I can't find any contributions from the other OpenStack company in the code base at this time, so I assume they haven't been proposed.

    The nova-network to Neutron migration path has come to the attention of the OpenStack Technical Committee, who have asked for a more formal plan to address Neutron feature gaps and deprecate nova-network. That plan is tracked at https://wiki.openstack.org/wiki/Governance/TechnicalCommittee/Neutron_Gap_Coverage. As you can see, there are still some things to be merged which are targeted for juno-3. At the time of writing this includes grenade testing; Neutron being the devstack default; a replacement for nova-network multi-host; a migration plan; and some documentation. They are all making good progress, but until these action items are completed, Nova can't start the process of deprecating nova-network.

    The discussion at the Nova mid-cycle meetup was around the migration planning item in the plan. There is a Nova specification that outlines one possible plan for live upgrading instances (i.e., no instance downtime) at https://review.openstack.org/#/c/101921/, but this will probably now be replaced with a simpler migration path involving cold migrations. This is prompted by not being able to find a user that absolutely has to have live upgrade. There was some confusion, because of a belief that the TC was requiring a live upgrade plan. But as Russell Bryant says in the meetup etherpad:

    "Note that the TC has made no such statement on migration expectations other than a migration path must exist, both projects must agree on the plan, and that plan must be submitted to the TC as a part of the project's graduation review (or project gap review in this case). I wouldn't expect the TC to make much of a fuss about the plan if both Nova and Neutron teams are in agreement."


    The current plan is to go forward with a cold upgrade path, unless a user comes forward with an absolute hard requirement for a live upgrade, and a plan to fund developers to work on it.

    At this point, it looks like we are on track to get all of the functionality we need from Neutron in the Juno release. If that happens, we will start the nova-network deprecation timer in Kilo, with my expectation being that nova-network would be removed in the "M" release. There is also an option to change the default networking implementation to Neutron before the deprecation of nova-network is complete, which will mean that new deployments are defaulting to the long term supported option.

    In the next (and probably final) post in this series, I'll talk about the API formerly known as Nova API v3.

    Tags for this post: openstack juno nova mid-cycle summary nova-network neutron migration
    Related posts: Chronological list of Juno Nova mid-cycle meetup posts; Juno nova mid-cycle meetup summary: conclusion; Juno nova mid-cycle meetup summary: social issues; Juno nova mid-cycle meetup summary: scheduler; Juno nova mid-cycle meetup summary: containers; Juno nova mid-cycle meetup summary: ironic

posted at: 20:37 | path: /openstack/juno | permanent link to this entry


Juno nova mid-cycle meetup summary: slots

    If I had to guess what would be a controversial topic from the mid-cycle meetup, it would have to be this slots proposal. I was actually in a Technical Committee meeting when this proposal was first made, but I'm told there were plenty of people in the room keen to give this idea a try. Since the mid-cycle, Joe Gordon has written up a more formal proposal, which can be found at https://review.openstack.org/#/c/112733.

    If you look at the last few Nova releases, core reviewers have been drowning under code reviews, so we need to control the review workload. What is currently happening is that everyone throws up their thing into Gerrit, and then each core tries to identify the important things and review them. There is a list of prioritized blueprints in Launchpad, but it is not used much as a way of determining what to review. The result of this is that there are hundreds of reviews outstanding for Nova (500 when I wrote this post). Many of these will get a review, but it is hard for authors to get two cores to pay attention to a review long enough for it to be approved and merged.

    If we could rate limit the number of proposed reviews in Gerrit, then cores would be able to focus their attention on the smaller number of outstanding reviews, and land more code. Because each review would merge faster, we believe this rate limiting would help us land more code rather than less, as our workload would be better managed. You could argue that this will mean we just say 'no' more often, but that's not the intent, it's more about bringing focus to what we're reviewing, so that we can get patches through the process completely. There's nothing more frustrating to a code author than having one +2 on their code and then hitting some merge freeze deadline.

    The proposal is therefore to designate a number of blueprints that can be under review at any one time. The initial proposal was for ten, and the term 'slot' was coined to describe the available review capacity. If your blueprint was not allocated a slot, then it would either not be proposed in Gerrit yet, or if it was it would have a procedural -2 on it (much like code reviews associated with unapproved specifications do now).

    The number of slots is arbitrary at this point. Ten is our best guess of how far we can dilute cores' focus without losing efficiency. We would tweak the number as we gained experience if we went ahead with this proposal. Remember, too, that a slot isn't always a single code review. If the VMware refactor was in a slot, for example, we might find that there were also ten code reviews associated with that single slot.

    How do you determine what occupies a review slot? The proposal is to groom the list of approved specifications more carefully. We would collaboratively produce a ranked list of blueprints in the order of their importance to Nova and OpenStack overall. As slots become available, the next highest ranked blueprint with code ready for review would be moved into one of the review slots. A blueprint would be considered 'ready for review' once the specification is merged, and the code is complete and ready for intensive code review.

    What happens if code is in a slot and something goes wrong? Imagine if a proposer goes on vacation and stops responding to review comments. If that happened, we would bump the code out of the slot, but would put it back on the backlog in the location dictated by its priority. In other words, there is no penalty for being bumped; you just need to wait for a slot to reappear when you're available again.
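    The slot mechanics described above can be sketched as a toy priority queue. To be clear, everything here (the class, the method names, the blueprint names) is invented purely for illustration; this is not real Nova or Gerrit tooling, just a model of the proposed flow:

    ```python
    import heapq

    class SlotBoard:
        """Toy model of the proposed review-slot scheme (all names invented)."""

        def __init__(self, num_slots=10):
            self.num_slots = num_slots  # ten was the initial proposal
            self.backlog = []   # min-heap of (priority, name); lower = more important
            self.slots = set()  # blueprints currently occupying a review slot
            self.parked = {}    # bumped blueprints waiting on their author

        def submit(self, priority, name):
            # A blueprint is 'ready for review' once its spec has merged and
            # the code is complete; it then queues at its ranked position.
            heapq.heappush(self.backlog, (priority, name))
            self._fill()

        def _fill(self):
            # As slots free up, pull the highest-ranked ready blueprint in.
            while self.backlog and len(self.slots) < self.num_slots:
                _, name = heapq.heappop(self.backlog)
                self.slots.add(name)

        def bump(self, name, priority):
            # The proposer has gone quiet: free the slot, remember the rank.
            self.slots.discard(name)
            self.parked[name] = priority
            self._fill()

        def resume(self, name):
            # No penalty for being bumped: rejoin the backlog at the old rank
            # and wait for a slot to reappear.
            self.submit(self.parked.pop(name), name)
    ```

    The point of the model is that bumping is cheap: a bumped blueprint keeps its priority, and only the limited review capacity (the slots) is contended.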

    We also talked about whether we were requiring specifications for changes which are too simple. If something is relatively uncontroversial and simple (a better tag for internationalization for example), but not a bug, it falls through the cracks of our process at the moment and ends up needing to have a specification written. There was talk of finding another way to track this work. I'm not sure I agree with this part, because a trivial specification is a relatively cheap thing to do. However, it's something I'm happy to talk about.

    We also know that Nova needs to spend more time paying down its accrued technical debt, which you can see in the huge number of bugs we have outstanding at the moment. There is no shortage of people willing to write code for Nova, but there is a shortage of people fixing bugs and working on strategic things instead of new features. If we could reserve slots for technical debt, it would help us get people working on those aspects, because they wouldn't risk spending time on a less glamorous problem only to discover they can't even get their code reviewed. We even talked about having an alternating focus for Nova releases: we could have a release focused on paying down technical debt and stability, and then the next release focused on new features. The Linux kernel does something quite similar to this and it seems to work well for them.

    Using slots would allow us to land more valuable code faster. Of course, it also means that some patches will get dropped on the floor, but if the system is working properly, those features will be ones that aren't important to OpenStack. Considering that right now we're not landing many features at all, this would be an improvement.

    This proposal is obviously complicated, and everyone will have an opinion. We haven't fully thought through all the mechanics yet, and it's certainly not a done deal at this point. The ranking process seems to be the most contentious point. We could encourage the community to help us rank things by priority, but it's not clear how that process would work. Regardless, I feel like we need to be more systematic about what code we're trying to land. It's embarrassing how little has landed in Juno for Nova, and we need to be working on that. I would like to continue discussing this as a community to make sure that we end up with something that works well and that everyone is happy with.

    This series is nearly done, but in the next post I'll cover the current status of the nova-network to neutron upgrade path.

    Tags for this post: openstack juno nova mid-cycle summary review slots blueprint priority project management
    Related posts: Juno nova mid-cycle meetup summary: social issues; Juno nova mid-cycle meetup summary: nova-network to Neutron migration; Juno nova mid-cycle meetup summary: scheduler; Juno nova mid-cycle meetup summary: ironic; Chronological list of Juno Nova mid-cycle meetup posts; Juno nova mid-cycle meetup summary: conclusion

posted at: 00:34 | path: /openstack/juno | permanent link to this entry