This quarter I'm going to be joining Alice in trying to improve the system for adding new suites to Talos. The current system involves a lot of hackery on our side, which keeps us from getting Talos suites up and running as quickly as we'd like.
So, with John's help in creating a prioritized list of suite requests, we will be doing a lot of communicating with developers in the coming months to get these suites up and to improve the process and documentation at the same time. There are currently 10 known suite requests waiting, and there may be others.
Part of the issue with adding new suites is that there is a lack of documentation and tools for developers. Our new system will look more like this:
* A request is made for a new suite, and a developer is attached to the request as the lead person for working with us to get the suite into production
* The dev will be able to use tools we provide (standalone talos, corral of staging-talos slaves) to do a proof of concept on the suite, so that it works and is ready to go up in staging when it's handed over to RelEng
* RelEng will enable the test suite in staging and verify that the changes in staging work fine with the other existing jobs being run on the same machines. Once all is well, the suite rolls out to production
As we progress through the suite requests, this process should get easier for all parties and more streamlined. We hope that by the time we reach suite #10 it will be much easier and faster for developers and RelEng to get the proposed new Talos suites into production.
I mentioned that developers will have tools provided by us. We need to do a bit of work to make these tools usable by developers, and the first place to start is with our documentation of what Talos is and how it works. Beyond that, we have discussed providing boilerplate code for creating each of the two styles of test: startup and pageload. It might also be beneficial to have a corral of Talos machines that can be loaned out to a dev for a limited time in order to test a suite during creation and debugging. The machines in this corral could then be re-imaged and passed along to the next suite developer.
Here is the current documentation page. Doesn't give you much to go on, right?
Well, this is about to change. Given my complete lack of Talos knowledge, I will be writing up what I learn about Talos as I learn it. Hopefully that leaves a more complete set of docs for the Talos neophyte, and folks who want to work with us to add new suites will benefit from it as well.
Here's the current list of the docs to be created based on what we think you might want to know:
* How Talos works and an overview of the development from past to present
* What preferences Talos runs with
* A description of each test suite and what it runs
* What the numbers mean
These are the things I don't know - is there anything you don't see listed here that you want to know more about? Feel free to make suggestions in the comments.